New blogfeed - mike's web log/comments

July 10, 2006 | New blogfeed

As I noted a little while ago, I thought I should convert my blog feed from a Web service (BlogFeed.asmx/GetFeed?) to an HttpHandler ... and so I have. You can now get the feed using BlogFeed.rss, including the feeds for individual categories if you want (e.g. BlogFeed.rss?category=aspnet). You can also get the full feed instead of the truncated feed using the flag I created for the Web service -- BlogFeed.rss?full=true, BlogFeed.rss?category=aspnet&full=true. As with the comment feed, the .rss extension is mapped to a handler class in App_Code.

Converting from the Web service to a handler was not hard. In fact, it took me probably 15 minutes to do the basic conversion. Other than the small differences between a Web service and a handler class, which Visual Studio basically did for me, the logic was the same. The only real functionality to address was that I had to change the logic so that instead of returning an XmlElement object, the handler called context.Response.Write and passed a string with the feed contents.

What I spent rather more time on was two additional features, one that I thought was important and one that was merely interesting. The important one was caching. In the Web service, I could get caching merely by including an attribute on the method:

<WebMethod(CacheDuration:=3600)> Public Function GetRSS() As XmlNode

I don't know of a way to get caching for free like this in a handler, so I ended up writing my own. Caching can be kinda tricky, eh? There are docs for what I wanted, namely to expire the cached data absolutely after 4 hours. (The duration, in minutes, is configurable in Web.config.) What I needed to learn first-hand, though, was that the cache key had to accommodate all possible feeds that might be requested. After the first set of mistakes, I learned to add the category to the cache key so that "normal" and category-specific feeds did not stomp on each other in the cache. Oops. Then I learned that I needed to add the "full" flag to the cache key, too, since that was another vary-by parameter, so to speak. In the end, the logic came out like this:

Dim blogFeedCacheEntryName As String
If truncateFeedOverride = True Then
   blogFeedCacheEntryName = "blogFeedString" & "_" & category & "_" & "full"
Else
   blogFeedCacheEntryName = "blogFeedString" & "_" & category
End If

' Read the cache once, so the entry cannot expire between the check and the read.
responseString = TryCast(HttpContext.Current.Cache(blogFeedCacheEntryName), String)
If responseString Is Nothing Then
   ' Cache miss: build the feed and cache it with an absolute expiration.
   responseString = BuildRSSFeed(GetBlogData(category), context)
   blogFeedCacheDuration = _
      CInt( ConfigurationManager.AppSettings("blogCacheDuration") )
   HttpContext.Current.Cache.Insert(blogFeedCacheEntryName, _
      responseString, _
      Nothing, _
      DateTime.Now.AddMinutes(blogFeedCacheDuration), _
      System.Web.Caching.Cache.NoSlidingExpiration)
End If
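
To make the vary-by idea concrete, here is a minimal sketch of the same pattern in JavaScript rather than the handler's VB. It is not the blog's actual code: buildCacheKey and FeedCache are made-up names for illustration, and the tiny in-memory cache only stands in for the ASP.NET cache. The point is that every parameter that changes the output has to end up in the key, and that the expiration is absolute, not sliding.

```javascript
// Illustration only: buildCacheKey and FeedCache are hypothetical names,
// not part of the blog's handler or of ASP.NET.

// Every request parameter that changes the output goes into the key.
function buildCacheKey(category, full) {
   var key = "blogFeedString_" + (category || "");
   return full ? key + "_full" : key;
}

// A tiny in-memory cache with absolute (non-sliding) expiration.
// "now" is injectable so the expiration logic can be exercised in a test.
function FeedCache(now) {
   this.entries = new Map();
   this.now = now || function () { return Date.now(); };
}

// Returns the cached value for key, or calls build() and caches the
// result for the given number of minutes.
FeedCache.prototype.getOrBuild = function (key, minutes, build) {
   var hit = this.entries.get(key);
   if (hit !== undefined && hit.expires > this.now()) {
      return hit.value;
   }
   var value = build();
   this.entries.set(key, { value: value, expires: this.now() + minutes * 60000 });
   return value;
};
```

With this sketch, a call such as cache.getOrBuild(buildCacheKey("aspnet", true), 240, buildFeed) would build the full aspnet feed at most once every four hours, and the truncated aspnet feed would get its own entry under a different key.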

The optional-but-interesting aspect of the task was in one sense the motivation for converting from a Web service to a handler -- I wanted to add a style sheet to the feed. (As I noted before, it's not straightforward to add a style sheet to XML output from a Web service.) I added a style sheet link to the XML skeleton on which I hang the feed:

<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/css" href="./rss/rssfeed.css"?>

<rss version="2.0"  ... many namespaces here ...>  
</rss>

Quickly enough, however, I discovered that although I could make various strings look pretty, I didn't have all that much control over the layout (at least, not with my current knowledge of CSS). Solution: an XSLT transform, yes?

Yes. Back to W3Schools.com for me. Once you've got the path and the loop stuff sorted out, twiddling your RSS feed with an .xsl is great -- you can turn those RSS elements into whatever you want. In my case, that was a lot of <p> tags. For the blog entry content, I used a <div> tag; using a <p> tag for content that can contain other containers (like <blockquote>) makes IE very, very angry, as I learned after way too much time spent screwing with the transform. (I'll note here that debugging XSLT is, like, impossible, right? If you're lucky, the browser points to a specific location in the XML file and whines. If you're not lucky, it just gives up and says something about as helpful as "Error parsing file.")

Still one problem, though. When I store my blog entries, I HTML-encode them, so they contain all sorts of encoded <br /> and <blockquote> and <ul> tags, stuff like that. The transform did not unencode these. I actually found a page that showed how to use <xsl:template> elements to do some complicated-looking parsing and string substitution. I gave it a shot, but never got a single example of using a template to work. (See earlier note about XSLT debugging.)

But I do know how to do this sort of thing in JavaScript. I added some JavaScript to the page. No dice -- the XSLT parser didn't like the actual code. (There might be some way to embed it as CDATA, but I couldn't get that to work. Commenting it out didn't work either.) But -- haha! -- I could reference it as a .js file in a <script> tag, and the XSLT parser liked that ok. The JavaScript runs after the transformation, doing its magic on the result of the transform. A somewhat edited version of the .xsl file looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" 
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"  
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/" 
   xmlns:wfw="http://wellformedweb.org/CommentAPI/">

<xsl:template match="/">
<html>
<head>
<style>
body { }
<!-- tons of style info here -->
</style>
<script language="javascript" src="./rss/rssfeed.js"></script>
</head>
<body onload="UnencodeHTML()">
<h1>mike's web blog/rss feed</h1>
<xsl:for-each select="rss/channel/item">
   <div class="blogentry">
   <p class="title"><xsl:value-of select="title"/></p>
   <p class="pubDate"><xsl:value-of select="pubDate"/></p>
   <p class="categories">
      Categories: 
        <span class="categories"><xsl:value-of select="category"/></span>
   </p>
   <div class="description"><xsl:value-of select="description"/></div>
   <p>
      Link: 
      <a href="http://###"><xsl:value-of select="link"/></a>
   </p>
   <p class="comments">
      Comments: 
      <xsl:value-of select="slash:comments"/>
         (<a href="http://###"><xsl:value-of select="wfw:comment" /></a>)
   </p>
   </div>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

The final thing was to make the blog and comment-feed links hot. Notice the attributes href="http://###" in the transform. I used some JavaScript to walk through all the <a> tags in the doc, and if the href attribute was this special string, I made the link text into the href. The JavaScript, which also does the unencoding I described earlier, looks like this:

function UnencodeHTML()
{
   var i, textToReplace;
   var allDivTags = document.getElementsByTagName("div");

   // For every <div> tag whose class is "description", this turns
   // encoded HTML into "real" HTML by replacing &lt; with <, etc.
   for(i=0 ; i < allDivTags.length ; i++)
   {
      if(allDivTags[i].className == "description")
      {
         textToReplace = allDivTags[i].innerHTML;
         try
         {
           while(textToReplace.indexOf("&lt;") > -1)
             { textToReplace = textToReplace.replace("&lt;", "<"); }
           while(textToReplace.indexOf("&gt;") > -1)
             { textToReplace = textToReplace.replace("&gt;", ">"); }
           // &amp; goes last so that the new & characters are not
           // mistaken for the start of &lt; or &gt;.
           while(textToReplace.indexOf("&amp;") > -1)
             { textToReplace = textToReplace.replace("&amp;", "&"); }
           allDivTags[i].innerHTML = textToReplace;
         }
         // If the browser rejects the markup, leave the entry encoded.
         catch(err)   { ; }
      }   
   }

   // Walks all the <a> tags and finds those with the special
   // placeholder "http://###" (note that the browser adds the
   // extra slash). Substitutes the link text into the href.
   var allATags = document.getElementsByTagName("a");
   for(i=0 ;  i < allATags.length ; i++)
   {
      if(allATags[i].href == "http:///###")
      {
         allATags[i].href = allATags[i].innerHTML;
      }   
   }
}
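
For comparison, here is a hedged sketch of the same two steps written as small standalone functions. It is not the code from rssfeed.js: decodeEntities and fillPlaceholderLinks are made-up names. Two things differ from the function above. The decoding uses global regular expressions, so each entity is decoded exactly once (the while loops keep going until no &amp; is left, so a doubly encoded &amp;amp; collapses all the way to &). And the link check reads the href attribute as written via getAttribute, on the assumption that a current browser returns it without adding the extra slash; older versions of IE may behave differently.

```javascript
// Illustration only: decodeEntities and fillPlaceholderLinks are
// hypothetical names, not functions from the blog's rssfeed.js.

// Decodes &lt; &gt; &amp; in a single pass each. &amp; goes last so that
// text like "&amp;lt;" becomes "&lt;" rather than "<".
function decodeEntities(text) {
   return text
      .replace(/&lt;/g, "<")
      .replace(/&gt;/g, ">")
      .replace(/&amp;/g, "&");
}

var PLACEHOLDER = "http://###";

// For each link whose href attribute is still the placeholder, uses the
// link text as the real href. Returns the number of links changed.
function fillPlaceholderLinks(links) {
   var changed = 0;
   for (var i = 0; i < links.length; i++) {
      if (links[i].getAttribute("href") === PLACEHOLDER) {
         links[i].setAttribute("href", links[i].textContent);
         changed++;
      }
   }
   return changed;
}
```

In a browser, fillPlaceholderLinks(document.getElementsByTagName("a")) would cover the second half of UnencodeHTML, and decodeEntities could be applied to the innerHTML of each "description" div.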

In the RSS skeleton, I changed the style sheet link from .css to .xsl, and that was it. The RSS feed is in a nice human-readable format. In fact, something I hadn't set out to do, but have managed anyway, is to recast the blog output as XML + XSLT, which a lot of people will tell you is the right way to do that thing. Of course, none of the other junk on my blog is displayed. Hmmm. I might have undermined my own blog display here. :-)