Although the UK is a crowded island sitting between the Atlantic Ocean and the North Sea, comfortably far from mainland Europe, its population is relatively small; compared to those of China or India it is, in fact, tiny (estimated to be 60.7 million as of July 2007). Perhaps that is why I am always surprised when the UK does well at global sporting events such as the Olympics, and there is no doubt that Team GB (more correctly, Great Britain and Northern Ireland) has excelled at these games, putting in its best performance, in terms of medal haul, since the 1908 games.
So what has all this to do with XPO? Well, I like to keep up with what is going on at the Olympics, but that “work thing” just keeps getting in the way. On top of that, I’d really like to record all the news so that I can go back and analyse it after the event. So what I really need is a little application that meets the following requirements:
- Read an XML file and fetch one or more RSS feed URLs
- For each feed
  - Check if that feed has been serialised previously
    - If it hasn’t, create it
    - If it has, fetch the matching feed object from the database
  - Send a HEAD request to the web server for the RSS feed file header information
  - Examine the last-modified header
    - If there is no last-modified header then skip this file
    - If the file has not been updated since last read then skip the file
  - Fetch all the news items from the RSS feed
  - For each news item
    - Create a news item object
    - Check if the fetched / created feed object has this news item already
      - If it does then skip this news item
      - Otherwise add the news item to the feed
  - Set the lastUpdate date to now
- Serialise the newly created object graph to the database
- This process must be able to be automated using the Windows AT command
Looking at the requirements, it’s clear that, apart from a few sensible checks, all that is required is for an RSS feed to be serialised to a database. That sounds to me like a perfect job for XPO. Now, a blog post is not a suitable place to examine such an application line by line (I leave that as an exercise for the reader), but let’s look at some of its more pertinent points.
Firstly, I could have used an OPML file to hold my list of RSS feed URLs; that is, after all, what it was designed for. However, I felt that was overkill for the purposes of this application, so I went with a simpler “roll your own” file structure, as shown below.
<?xml version="1.0" encoding="utf-8" ?>
<Feeds>
  <Feed>
    <Name>Feed Name</Name>
    <Url>Feed URL</Url>
  </Feed>
</Feeds>
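Reading that file takes only a few lines of LINQ to XML. The sketch below is one way to do it and is not taken from the downloadable application: the FeedEntry and FeedListReader names are made up for illustration, and FeedEntry is a plain holder class rather than the XPO Feed domain object.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical holder for one <Feed> entry; not the XPO Feed domain object.
public sealed class FeedEntry
{
    public string Name { get; set; }
    public string Url { get; set; }
}

public static class FeedListReader
{
    // Parses the <Feeds> document and returns one entry per <Feed> element.
    public static List<FeedEntry> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
            .Elements("Feed")
            .Select(f => new FeedEntry
            {
                // The explicit string cast yields null when the element is missing.
                Name = (string)f.Element("Name"),
                Url = (string)f.Element("Url")
            })
            // Ignore entries that have no usable URL.
            .Where(e => !string.IsNullOrEmpty(e.Url))
            .ToList();
    }
}
```

Passing the contents of the feeds file (for example via File.ReadAllText) to FeedListReader.Parse would give you the list of feeds to loop over.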
Having specified which feeds we wish to read (my application reads the 32 BBC feeds dedicated to the Olympics) we should now go ahead and specify our Feed and NewsItem domain objects. I’m not going to cover that here as it is clearly documented on our web site.
The first thing I’d like to draw your attention to is that, by default, XPO will serialise objects to an Access database. If you want to use another database (and we support over 15 RDBMSs), you have to tell XPO, like so:
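private void SetUpXPODataLayer() {
    // Build a connection string for the "XPOlympics" database
    // on the local SQL Server instance (".").
    string conn =
        MSSqlConnectionProvider.GetConnectionString(".",
            "XPOlympics");
    // Make this the default data layer; the database and schema
    // are created automatically if they do not already exist.
    XpoDefault.DataLayer =
        XpoDefault.GetDataLayer(conn,
            AutoCreateOption.DatabaseAndSchema);
}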
The next thing I want to draw your attention to is not specifically XPO related but is, nevertheless, interesting. It is the technique of fetching only the header information from the web server to ascertain whether or not a file has been updated. This saves you from downloading the entire file unnecessarily, and is done like this:
//GS - Make a request for the feed header info only and
//get the last-modified date
HttpWebRequest webRequest =
    HttpWebRequest.Create(feed.Url) as HttpWebRequest;
webRequest.Method = "HEAD";
webRequest.KeepAlive = false;
WebResponse webResponse = null;
string lastModifiedDateString = String.Empty;
try {
    webResponse = webRequest.GetResponse();
    // Null when the server does not send a Last-Modified header
    lastModifiedDateString =
        webResponse.Headers.Get("Last-Modified");
}
catch (WebException wex) {
    // Log the failure and move on to the next feed in the loop
    Trace.WriteLine(String.Format("{0} - {1} {2}",
        DateTime.Now, wex.Message, feed.Name));
    Trace.Flush();
    Trace.Close();
    continue;
}
finally {
    // Always release the response, even if reading the header failed
    if (webResponse != null) {
        webResponse.Close();
    }
}
Note the “KeepAlive = false;” line above: it prevents the HttpWebRequest object from keeping the connection open and causing “connection leaks”.
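Once the header string is in hand, the two “skip this file” rules from the requirements come down to a date comparison. Here is a minimal sketch of that check, assuming the last read time is stored as a UTC DateTime. The FeedFreshness class and ShouldSkip method are names invented for this example and may not match what the downloadable application does; treating an unparseable header as a reason to skip is also my assumption.

```csharp
using System;
using System.Globalization;

public static class FeedFreshness
{
    // Returns true when the feed should be skipped: the header is missing
    // or unparseable, or the file has not changed since the last read.
    public static bool ShouldSkip(string lastModifiedHeader, DateTime lastUpdateUtc)
    {
        if (string.IsNullOrEmpty(lastModifiedHeader))
            return true;

        DateTime lastModifiedUtc;
        // "r" is the RFC 1123 pattern, e.g. "Sun, 24 Aug 2008 12:00:00 GMT".
        bool parsed = DateTime.TryParseExact(
            lastModifiedHeader, "r", CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
            out lastModifiedUtc);

        if (!parsed)
            return true;

        // Not newer than our last read, so there is nothing to fetch.
        return lastModifiedUtc <= lastUpdateUtc;
    }
}
```

Inside the feed loop, a true result from ShouldSkip(lastModifiedDateString, lastUpdate) would be followed by a continue, just like the catch block above.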
Whilst I’m on the subject of things which are not specifically XPO related but are interesting anyway, note the use of LINQ to XML to extract the news items from the feed. Isn’t that syntax so much nicer than the XPath way of doing things?
var newsItems =
    from item in feedDoc.Descendants("item")
    select new {
        Title = item.Element("title").Value,
        Description = item.Element("description").Value,
        PubDate = item.Element("pubDate").Value,
        Category = item.Element("category").Value,
        Url = item.Element("link").Value
    };
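One caveat with the query above: Element() returns null when a child element is absent, so calling .Value on it throws a NullReferenceException, and RSS does not require every item to carry a category. If a feed you read omits some elements, a more forgiving variant is to use the explicit string conversion that XElement provides. The sketch below shows the idea for just the title and category; the RssItemReader class is a hypothetical name used only here.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssItemReader
{
    // Returns the title of every <item>, paired with its category, using
    // an empty string for any element the item does not have.
    public static List<KeyValuePair<string, string>> TitlesAndCategories(XDocument feedDoc)
    {
        var pairs =
            from item in feedDoc.Descendants("item")
            select new KeyValuePair<string, string>(
                // (string)element is null for a missing element, so ?? supplies a default.
                (string)item.Element("title") ?? string.Empty,
                (string)item.Element("category") ?? string.Empty);
        return pairs.ToList();
    }
}
```

The same cast-and-default pattern could be applied to each property in the anonymous type if you would rather keep the original query shape.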
Anyway, this is an XPO blog, so let’s get back on topic, shall we? Having fetched or created a feed and extracted the news items from the RSS document, we want to add each news item to the feed, but only if it hasn’t been added before; we don’t want duplicate news items in our feed. We can use XPO to check that it hasn’t been added already, like this:
if (!aFeed.NewsItems.Contains<NewsItem>(aNewsItem)) {
    aFeed.NewsItems.Add(aNewsItem);
}
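It is worth keeping in mind that a Contains check like this is only as good as the equality it relies on: a freshly created object will not be “found” unless the type defines what makes two items the same. The sketch below illustrates the idea with plain classes rather than XPO objects. SketchNewsItem and FeedSketch are hypothetical names, and matching on the URL alone is my assumption, not necessarily how the NewsItem class in the download compares items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain class standing in for the NewsItem domain object.
public sealed class SketchNewsItem : IEquatable<SketchNewsItem>
{
    public string Title { get; set; }
    public string Url { get; set; }

    // Two items are treated as the same story when their URLs match exactly.
    public bool Equals(SketchNewsItem other)
    {
        return other != null &&
               string.Equals(Url, other.Url, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as SketchNewsItem);
    }

    // Must agree with Equals: equal URLs give equal hash codes.
    public override int GetHashCode()
    {
        return (Url ?? string.Empty).GetHashCode();
    }
}

public static class FeedSketch
{
    // Adds the item only when no equal item is present; returns true if added.
    public static bool AddIfNew(List<SketchNewsItem> items, SketchNewsItem candidate)
    {
        if (items.Contains<SketchNewsItem>(candidate))
            return false;
        items.Add(candidate);
        return true;
    }
}
```

With this in place, adding a second item that has the same URL but a different title leaves the list unchanged, which is the behaviour the requirements ask for.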
Now, all that remains for us to do is to serialise our newly created object graph to the database by calling CommitChanges() on the UnitOfWork object.
If you would like to download the full application and have a look at it, you can get it here.