I finally got DasBlog up and running on Blobservations and decided that I wanted to import all of my old content from Blogger.com. I figured that this challenge had to have been overcome in the past, so I hit MSN Search and Google with keywords like: Import Export DasBlog Blogger. I located a few half-baked tools (as in, not quite ready for prime time). Some didn’t support titles; others wouldn’t grab the links. Most of them could easily blame their failings on the fact that they were attempting to use the underpowered Blogger API.
I hacked around a little, trying to implement a simple Atom API client in C# and build a fully automated conversion tool, but eventually I decided to drop back and punt. I only needed to do this one time, so a little manual intervention would be acceptable.
I ended up following these steps:
1. In your Blogger.com settings, under “Formatting”, tell it to show 999 days on the front page. DO NOT PUBLISH.
2. Modify your blogger template to:
<?xml version="1.0" encoding="utf-8" ?>
<entries>
<Blogger>
<bi_url><![CDATA[<$BlogItemURL$>]]></bi_url>
<bi_title><![CDATA[<$BlogItemTitle$>]]></bi_title>
<bi_body><![CDATA[<$BlogItemBody$>]]></bi_body>
<bi_author><![CDATA[<$BlogItemAuthorNickname$>]]></bi_author>
<bi_date><![CDATA[<$BlogItemDateTime$>]]></bi_date>
</Blogger>
</entries>
3. DO NOT PUBLISH. Instead, hit the “Preview” button.
4. In the resulting screen, you’ll see a bunch of unformatted text. Select “View Source” and then copy everything from the opening <?xml ?> declaration down to the closing </entries> tag. Paste that text into Notepad and save it as archive.xml. On the edit screen, hit the “Discard Edits” button. Go back and restore the settings from step 1.
5. Warning, very rough C# code ahead. The following is snipped from the C# program I threw together to convert from the raw XML to DasBlog. It uses some classes from the DasBlog engine, so you’ll have to add a reference to the newtelligence.DasBlog.Runtime.dll file if you want to try this yourself. Before running, make sure that the directory c:\content exists.
// Needs: using System; using System.Data; using newtelligence.DasBlog.Runtime;
{
    // Load the saved preview. Each <Blogger> element becomes a row in the first table.
    DataSet ds = new DataSet("archive");
    ds.ReadXml(@"{insert directory to archive.xml file}\archive.xml");

    // DasBlog data service that writes the entry files into c:\content.
    IBlogDataService das_ds = BlogDataServiceFactory.GetService(@"c:\content", null);

    foreach (DataRow r in ds.Tables[0].Rows)
    {
        string title = (string)r["bi_title"];
        string body = (string)r["bi_body"];
        string link = (string)r["bi_url"];   // read but not stored on the entry
        string date = (string)r["bi_date"];
        string auth = (string)r["bi_author"];

        // Parsed with the current culture, so the Blogger timestamp format must match it.
        DateTime dt_post = DateTime.Parse(date);

        Entry post = new Entry();
        post.Author = auth;
        post.Content = body;
        post.Description = "";
        post.Title = title;
        post.CreatedLocalTime = dt_post;
        post.CreatedUtc = dt_post.ToUniversalTime();
        post.ModifiedLocalTime = dt_post;
        post.ModifiedUtc = dt_post.ToUniversalTime();
        post.EntryId = Guid.NewGuid().ToString();

        das_ds.SaveEntry(post);
    }
}
6. After running, you should have a bunch of XML files in the directory c:\content. Just upload or copy these to DasBlog’s “content” directory and the posts should show up under DasBlog. I think I may have had to create and delete a new post to make them show up.
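Because a malformed declaration or a missing closing tag in archive.xml will make the conversion in step 5 fail, it may be worth checking the file before running it. The following is only a sketch: the ArchiveCheck class and CountPosts method are names made up for this example, and it assumes the saved file has one <Blogger> element per post, as in the template from step 2.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ArchiveCheck
{
    // Hypothetical helper: parses the text and returns the number of <Blogger> elements.
    // Throws XmlException if the text is not well-formed XML.
    public static int CountPosts(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.GetElementsByTagName("Blogger").Count;
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: ArchiveCheck <path to archive.xml>");
            return;
        }

        try
        {
            int count = CountPosts(File.ReadAllText(args[0]));
            Console.WriteLine("Well-formed, posts found: " + count);
        }
        catch (XmlException ex)
        {
            // The message includes the line and position of the first problem.
            Console.WriteLine("Not well-formed: " + ex.Message);
        }
    }
}
```

If the count is lower than the number of posts you expect, the "999 days" setting from step 1 probably did not cover your whole archive.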
If you have any questions, feel free to email.
Update 6/16/2005: Ryan Jones has implemented a dotText to DasBlog and LiveJournal to DasBlog content conversion using similar code.
Update 7/13/2005: Ben Scheirman emailed me about some difficulties he was having, and we figured out that you have to have the Timestamp Format configured a certain way for the Blogger template to work. This image shows the relevant setting:
Ben is also putting together a GUI to assist with the conversion. I’ll post a link when he finishes it up.
Hi, thanks for the code. I was beating my head against the wall trying to figure out how I was going to do this. You saved me a bunch of time 🙂 In fact, last night it only took me 40 minutes from code to completion to get my old posts out of my database and into dasBlog.
Jeff,
I’m glad it was of some help! I found the same thing that you did when I first attacked the problem: Lots of partial solutions, nothing that did exactly what I needed. Maybe someone can find some time to automate this.
Rick H.
Just thought I’d let you know (and hopefully save someone a little effort) that you have your xml malformed above… there cannot be a space after the ? in the xml declaration
Anyway,
Thanks for the tip! I wrote a little GUI that will select the xml file and run your little function there, but maybe we should have it interact with the blogger web services… whaddya think?
Ben,
Thanks for the tips. I’ll edit the post to reflect the change. Also, if you want to share that GUI you wrote, let me know and I’ll put a link in the body of the post.
Rick H.
Thanks for the great tips! However, the code wouldn’t work for me as posted, so I stepped into it and changed a little of the code, and then it worked. I didn’t spend too much time on it.
Also, I had to modify the permissions of the XML files so that DasBlog could read them after I copied them over to the content directory.
There are a couple of things to note with the code and the date formatting. For users with non-US locale settings, the porting code will fail with a date parsing problem. The quick fix is to set the current thread’s culture to a new CultureInfo object for en-US.
Secondly, I had issues with reading the XML. I needed to add an extra element (I called it bi_Table) around the Blogger element, and pass XmlReadMode.InferSchema to the ReadXml() method call.
Other than that, it worked great. Thanks for the quick fix, it really saved me bucket loads of time.
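The two adjustments from this comment could look something like the sketch below. It passes an explicit en-US culture to DateTime.Parse instead of changing the thread’s culture (a variation on the fix described, affecting only that one call), and it reads the XML with XmlReadMode.InferSchema. The PortingFixes class, its method names, and the sample values are invented for illustration, and placing a single bi_Table wrapper around all of the Blogger elements is an assumption about what the commenter did.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;

public static class PortingFixes
{
    // Parse a Blogger timestamp as en-US regardless of the machine's locale.
    public static DateTime ParseBloggerDate(string text)
    {
        return DateTime.Parse(text, CultureInfo.GetCultureInfo("en-US"));
    }

    // Read the archive with schema inference and return the table of posts.
    // The table is looked up by name because the wrapper element is also
    // inferred as a table, so the posts are no longer guaranteed to be Tables[0].
    public static DataTable ReadPosts(string xml)
    {
        DataSet ds = new DataSet("archive");
        ds.ReadXml(new StringReader(xml), XmlReadMode.InferSchema);
        return ds.Tables["Blogger"];
    }

    public static void Main()
    {
        // Made-up sample data in the assumed shape: entries > bi_Table > Blogger.
        string xml =
            "<entries><bi_Table>" +
            "<Blogger><bi_title><![CDATA[First]]></bi_title>" +
            "<bi_date><![CDATA[7/13/2005 10:15:00 AM]]></bi_date></Blogger>" +
            "<Blogger><bi_title><![CDATA[Second]]></bi_title>" +
            "<bi_date><![CDATA[7/14/2005 9:00:00 PM]]></bi_date></Blogger>" +
            "</bi_Table></entries>";

        foreach (DataRow r in ReadPosts(xml).Rows)
        {
            DateTime posted = ParseBloggerDate((string)r["bi_date"]);
            Console.WriteLine((string)r["bi_title"] + " posted " + posted.ToString("s"));
        }
    }
}
```

Passing the culture to the parse call keeps the rest of the program on the user’s own locale; setting Thread.CurrentThread.CurrentCulture, as the comment suggests, is simpler when the whole run is a one-off conversion.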
The post was really helpful in porting the content from Blogger. I ran into a problem, but was able to quickly fix it by modifying the Blogger template. Please see the post if you are interested in the details.
http://www.vineethraja.com/…/PortingContentF