On code examples - mike's web log/comments

April 04, 2006 | On code examples

In his blog, Karl Seguin has a thoughtful rant (is that an oxymoron?) about code examples in documentation. He shows some examples from (whew) the Macromedia documentation for .NET.[1]

He finds some examples of particularly egregious bad practices, including:

Insecure code.
Poor error handling.
Non-meaningful variable names.
Inefficient and inelegant code.

In a follow-on comment, he notes one point in particular:

One concern I had with the Macromedia stuff is I think (and I could totally be wrong) that there'll be a high percentage of copy/pasting. This is material focused towards developers not primarily focused on .NET (or similar languages). They don't know that it's wrong/incomplete. They want to get their flash app working and taking the sample to modify the connection string and select statement might be all the work they'll put into it. Why should they do any more? This does come from Macromedia after all.

We've rassled with a lot of these issues. When we write code examples, there is a constant tension, let's call it, between functionality and readability. We very much bear in mind Karl's point about copy-n-paste, one of the behaviors that defines the so-called Mort persona.[2]

Consider database access. We went round and round about how to illustrate this. To accommodate our Morts, we would ideally offer nearly copy-n-paste examples. In that case, it would be easiest to show a) an explicit connection string, and b) explicit credentials. That would minimize the number of changes that the Mort developer would have to make, and it would reduce the cognitive burden required to get the example working.

But these are bad, bad practices. Bad. So we opted for good practice, and upped the work that Mort would have to do to get things running. These days, we show connection strings only when that's somehow essential to the example, and then we show only integrated security. What we mostly do is advise people to keep their connection strings in the config file, which we illustrate. We also somewhat monotonously keep pounding on the message that they should encrypt their connection strings. Doing these non-illustrated tasks is, as they say, left as an exercise for the reader. (We link, of course.) But the alternative -- showing connection strings and credentials explicitly -- is bad pedagogy.
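To make the config-file idea concrete, here is a minimal sketch. It's in Python rather than ASP.NET purely to keep it short, and everything in it is an assumption for illustration: the `[connectionStrings]` section name, the `Northwind` entry, and the `get_connection_string` helper are all hypothetical, not anything from the actual docs. The point is only that the code asks for a connection string by name and never contains a server name or credentials itself.

```python
import configparser

# Hypothetical settings text. In a real application this would live in a
# separate (ideally encrypted) config file, not in the source code.
SAMPLE_CONFIG = """
[connectionStrings]
Northwind = Server=localhost;Database=Northwind;Integrated Security=True
"""


def get_connection_string(config_text: str, name: str) -> str:
    """Look up a named connection string in INI-style config text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep entry names case-sensitive, as written
    parser.read_string(config_text)
    try:
        return parser["connectionStrings"][name]
    except KeyError:
        # Fail with a message that names the missing entry.
        raise KeyError(f"No connection string named {name!r} in the config") from None


if __name__ == "__main__":
    print(get_connection_string(SAMPLE_CONFIG, "Northwind"))
```

Note that the sample entry uses integrated security, so there is no user name or password to leak even if someone does copy and paste it.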

We also look for places where we illustrate user input, typically with a TextBox control. This one is more subtle. We could show Server.HtmlEncode every time an example handles user input. But there are a few arguments against doing this always. One, you don't always have to do it. If the example just echoes user input back to the user, there's no point in encoding it. (Yeah, I know -- hold that thought.) Two, if we use it when it's not needed, we clutter the example and perhaps raise questions in the reader's mind about why we're doing it. Three, by default, ASP.NET pages have ValidateRequest turned on, so (at least by default), Mort has a safety net. What we've done is show Server.HtmlEncode in places where you'd obviously use it always. In situations where it doesn't seem mandated, we add a security note (example) that is a heads-up to the reader -- "Hey, this is user input, which you shouldn't trust. Go read about it here."
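For readers who haven't seen what encoding user input looks like, here is a rough sketch of the general idea. It is not ASP.NET code: it uses Python's standard `html.escape` as a stand-in for Server.HtmlEncode, and the `render_greeting` function is a made-up name for illustration.

```python
import html


def render_greeting(user_input: str) -> str:
    """Build an HTML fragment that includes untrusted text safely."""
    # html.escape turns &, <, > and (by default) quote characters into
    # entities, so markup typed by the user is displayed as text, not run.
    safe_text = html.escape(user_input)
    return f"<p>Hello, {safe_text}!</p>"


if __name__ == "__main__":
    print(render_greeting("<script>alert(1)</script>"))
```

One extra line does the job, which is why the real question for us is where to show it rather than how.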

We've had similar discussions about the use of try-catch and about illustrating exception handling. For example, just how granular is the exception handling in an example? (Catch every exception? Do we unpack possible inner exceptions?) What does the example code in a given catch block do? Although we want to illustrate some sort of non-dumb exception handling, we have to assume that at some point the reader takes over and (we hope) adds some sort of application-appropriate code.
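As one possible answer to the granularity question, here is a small sketch, again in Python rather than .NET and with a hypothetical `parse_page_size` helper. It catches only the one exception the example knows how to recover from and deliberately leaves everything else for the reader's own application-level handling.

```python
def parse_page_size(raw_value: str, default: int = 10) -> int:
    """Convert user-supplied text to a page size, falling back on bad input."""
    try:
        size = int(raw_value)
    except ValueError:
        # Non-numeric text is an expected problem with an obvious recovery,
        # so the example handles it here.
        return default
    # Zero and negative sizes make no sense for paging; use the default.
    if size <= 0:
        return default
    return size


if __name__ == "__main__":
    for text in ("25", "abc", "-3"):
        print(repr(text), "->", parse_page_size(text))
```

Anything other than a ValueError (for instance, the TypeError you get from passing None) is not caught and propagates to the caller. That is the point at which the reader's code is expected to take over.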

The actual mechanics of creating and vetting code examples have gotten better since the early days of .NET. These days, almost all the code examples are in separate files that are run through FxCop and are compiled (Hey, we can break builds, too!), and are injected into the topics only when we build the docs. Aside from the obvious advantages, this gives us some nice reuse.

I would hardly claim that the code examples are uniformly great. I'm sure people would be delighted to send me all sorts of questionable stuff. (Although pace Kent's comment in Karl's post, it should be pretty darn hard at this point to come up with sightings of username="sa" password="".) But making code examples better, and adding more of them, was a major goal for Whidbey.

As for elegance, that's a tough one. The ASP.NET writers are all coders, some of them quite experienced. Certainly in newer code samples, the lameness factor should be pretty low. In some cases, there are probably examples where we compromised something for readability, perhaps just to keep the example to a reasonable size. There are some very good examples -- our man Doug wrote up a series of example topics for creating custom providers (membership, roles, etc.) which are not just highly functional, but IMO really excellent example code as well. There are many others like them.

We're also doing our darndest these days to come up with examples that are non-trivial. A pattern that we try to follow for the reference (API) docs is to have one or more in-depth examples in the class overviews, and then extract relevant excerpts for the individual member topics. Even so, the task can be daunting. Imagine trying to come up with code examples to illustrate the many capabilities of, say, the GridView control.

One final comment about variable names. This was another discussion. For the most part, we try to follow .NET coding conventions. When we show ASP.NET controls, we actually use the convention followed by Visual Studio when it creates controls -- Button1, Label2, etc. Theory is this will be consistent with what users will see when working in the product. As I say, that's the theory. One thing I'm reasonably confident about is that you won't find many (any?) variables named foo and bar. :-)

[1] Which he describes using the linguistically interesting word craptastically. If you're a language person, see Benjamin Zimmer's piece on the Language Log about "cran-morphing."

[2] Personas should not be confused with persons.

Jeff Atwood 05 Apr 06 - 12:12 AM

It's a tough problem. The smart solution is to let everyone else do the work by making your docs a Wiki.

You don't like our sample code? Well, pally, write some better sample code yourself!

Of course that takes active moderation, but it'd be far more efficient in the long run... and many hands make light work, right?

mike 05 Apr 06 - 12:33 AM

Many hands makes work light -- hey, why don't we have everyone contribute to the .NET Framework? Don't like our implementation of the Mail class? Write your own!

Jeff, does Vertigo release their docs open source? Who does?

Karl 05 Apr 06 - 4:59 AM

It's funny, while I've bitched about the problem, I knew I didn't offer any solutions. Personally, I think diligence is the big factor, but Jeff's suggestion should obviously be taken into consideration. I also use FxCop extensively and make sure my code builds properly (although I think my C# Ajax.net piece that's up on MSDN doesn't compile :( ), which is the barest minimum I apply to my actual code - why should my documentation be any different?

In an upcoming post, I'll be talking about exception handling, and this is bound to come up: http://msdn2.microsoft.com/en-us/library/24395wz3.aspx. While the piece is pretty short, this comment really strikes me as wrong: "It is preferable to use Try/Catch blocks around any code that is subject to errors rather than rely on a global error handler." First, you shouldn't catch an exception unless you can actually handle it. Secondly, you're better off relying on global error handling when it comes to outputting friendly user messages:

You don't want a program where in 100 different places you handle exceptions and pop up error dialogs. What if you want to change the way you put up that dialog box? That's just terrible. The exception handling should be centralized, and you should just protect yourself as the exceptions propagate out to the handler.

- from Hejlsberg

Joshua Flanagan 05 Apr 06 - 6:38 AM

ie7 says your site is suspected as "suspicious" - a possible phishing site...

mike 05 Apr 06 - 8:25 AM

Joshua -- thanks for the heads up. My site uses dynamic DNS; my actual IP belongs to Comcast. I strongly suspect that large blocks of Comcast IPs have been flagged for phishing. I hope that it's clear that my blog is not an actual phishing site. :-)

Anyway, I'll see about contacting MS and explaining myself as best I can.