/* You Are Expected to Understand This */

The Art of Commenting

This article originally appeared as a freshmeat editorial.

Rob Pike, in his essay "Notes on Programming in C", says that he tends to err on the side of commenting less, rather than more. There appears to be a school of thought that has taken this one step further, and believes that comments are at best a necessary evil, and that good code should be self-evident enough to obviate the need for comments.

Bull.

Yes, you should be writing code that's clean enough that you don't need to explain what it does. But the code only tells you what the code does; it doesn't tell you what the code was intended to do, what it ought to do, what it doesn't do, or why it looks the way it does.

Bad Comments

The canonical example of a useless comment is:

i++;		/* Increment i */

What makes this a bad comment? Quite simply the fact that it doesn't add to the reader's understanding of what's going on. Compare this with:

for (i = 0; i < num_elements; i++)
{
	frob(elements[i]);
	if (elements[i] == 0)
		i++;		/* Ignore the next element */
}

In the first case, the comment tells you exactly what the code does, but you knew that already from reading the code. In the second case, however, the comment tells you, in human terms, what the statement is intended to accomplish. This is a minor but crucial difference: the code tells you what the code does; the comment tells you what the code is supposed to do. If you wanted to change the code so that it used a linked list rather than an array, you would know how to translate that statement:

for (elem = elements; elem != NULL; elem = elem->next)
{
	frob(elem->data);
	if (elem->data == 0 && elem->next != NULL)
		elem = elem->next;	/* Ignore the next element */
}

Update, Nov. 5, 2005: Grumpy Old Steve puts it well:

The excuse for [the lack of comments in the Linux kernel] is that well written code is its own explanation... well, weenieboy, if I'm in there fixing the code, it's probably because it doesn't do what it's supposed to.

Comments as Section Headers

Take a look at a good reference book. If you wanted to use this book to answer a question, you might start by looking up a key word or two in the index, or by finding a promising chapter in the table of contents. Then you would leaf through the chapter, reading section titles and table captions, until you found a page that was likely to hold the answer to your question; then you would start reading the actual text. Without the section headers, it would take much longer to find the part that you are interested in.

In order for a program to serve as its own reference manual, it should contain chapter and section headings: comments that briefly say what the following code does. This allows the reader to skim the comments and skip to the part that he's interested in.

Error-Checking

Real code does a lot of work unrelated to its primary task, such as error-checking, assertion-checking, context-sensitive help, and so forth. This can add significantly to the length and apparent complexity of your code. For instance, in ColdSync, a single printf() statement grew to over 17 lines of code once error-checking had been added.

In situations like these, it is especially important to leave section-header comments, lest the reader lose sight of the forest for the trees. Clearly mark what is important and what is incidental.
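
A rough sketch of what this can look like follows. It is not the ColdSync code: print_count() and its arguments are invented for illustration, and the checks shown are only an assumption about what a careful version might test. The section-header comments mark which parts are the actual work and which are incidental:

```c
#include <assert.h>
#include <stdio.h>

/* print_count (hypothetical)
 * Print "<label>: <count>\n" to 'out'.
 * Returns 0 on success, -1 on error.
 */
int
print_count(FILE *out, const char *label, int count)
{
	char buf[64];
	int len;

	/* Sanity checks (incidental) */
	assert(out != NULL);
	if (label == NULL)
		return -1;

	/* Format the line. This is the actual work. */
	len = snprintf(buf, sizeof(buf), "%s: %d\n", label, count);

	/* Make sure the line was formatted and fit in the buffer
	 * (incidental)
	 */
	if (len < 0 || (size_t) len >= sizeof(buf))
		return -1;

	/* Write the line. This is also part of the actual work. */
	if (fputs(buf, out) == EOF)
		return -1;

	/* Make sure the data actually got out (incidental) */
	if (fflush(out) == EOF)
		return -1;

	return 0;
}
```

A reader who only wants to know what the function prints can skim the comments, read the snprintf() line, and ignore the rest.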

Write Comments First

Perhaps the easiest way to make sure that your code has useful section header comments is to write them first: before you write any actual code, write, in comments, an outline saying what the code will do:

int
authenticate()
{
	/* Find out which authentication method to use */
	/* If it's a network connection, authenticate host */
	/* Prompt user for password */
	/* Verify supplied password */
	/* If it doesn't match, raise the alarm */
}

Then, when you actually flesh this function out with code, your outline comments automatically become section header comments.
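
For illustration, here is one way the outline might look after fleshing out. Only the five outline comments come from the outline above. struct session, host_is_trusted(), the parameter to authenticate(), and the return values are all assumptions, and the plain-text password comparison is a stand-in for whatever a real program would do:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Everything here except the outline comments is hypothetical. */
enum auth_method { AUTH_LOCAL, AUTH_NETWORK };

struct session {
	int is_network;		/* Nonzero for a network connection */
	const char *host;	/* Remote host name, or NULL if local */
	const char *supplied;	/* What the user typed at the prompt */
	const char *expected;	/* Password on record (plain text, for brevity) */
};

static int
host_is_trusted(const char *host)
{
	/* XXX - Stub: a real version would consult an access list */
	return host != NULL && strcmp(host, "localhost") == 0;
}

/* authenticate
 * Returns 0 if the session is authenticated, -1 otherwise.
 */
int
authenticate(const struct session *sess)
{
	enum auth_method method;

	assert(sess != NULL);

	/* Find out which authentication method to use */
	method = sess->is_network ? AUTH_NETWORK : AUTH_LOCAL;

	/* If it's a network connection, authenticate host */
	if (method == AUTH_NETWORK && !host_is_trusted(sess->host))
		return -1;

	/* Prompt user for password */
	/* XXX - Stub: the caller has already filled in sess->supplied */
	if (sess->supplied == NULL || sess->expected == NULL)
		return -1;

	/* Verify supplied password */
	if (strcmp(sess->supplied, sess->expected) == 0)
		return 0;

	/* If it doesn't match, raise the alarm */
	fprintf(stderr, "authentication failed\n");
	return -1;
}
```

The outline comments survive unchanged, and the stubs are marked so that they are easy to find later.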

Writing such an outline carries an additional benefit: it allows you to catch, at an early stage, problems in the design itself. The human brain is a wonderful thing, but it is also a result of three billion years' worth of "good enough" implementation, and backwards-compatibility back to the earliest chordates. As a result, it is a giant hack with more than a few quirks.

One of these quirks is that the different parts of the brain don't always work together. You may have experienced "confessional debugging", in which you ask a coworker for help with a problem, but the act of articulating the problem into words or drawing a graph on a whiteboard suggests a solution. The part of your brain that sees the code is only a few neurons away from the part that can fix the problem, but the shortest path between them often leads through the speech centers, out of your mouth and back in through your ears.

Writing an outline is a variant on confessional debugging: by writing a compact, high-level summary of a piece of code, you help ensure that the design is good, and that you haven't left anything out.

XXX

Oftentimes, when the creative juices are flowing and you're churning out code as fast as you can type, you'll think of something that needs to be done in the production release of the code, or in the next version, or just a nifty feature that it would be nice to have.

In these cases, stop and add an XXX (or FIXME) comment:

fd = open("myfile", O_RDONLY);
        /* XXX - Error-checking */

For one thing, this tells anyone reading your code that it's still unstable, and also points out where the known problems are. For another, if you don't mark the problems now, they'll be a lot harder to find a week or a year from now, when you're ready to revisit your old code.
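
When you do come back to it, the cleaned-up version might look something like the following sketch. The wrapper open_input() is hypothetical and assumes a POSIX system. Note that it leaves a new XXX behind for a question that came up along the way:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* open_input (hypothetical)
 * Open 'path' for reading.
 * Returns a file descriptor, or -1 on error.
 */
int
open_input(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
	{
		/* This used to be "XXX - Error-checking" */
		fprintf(stderr, "Can't open \"%s\": %s\n",
			path, strerror(errno));
		return -1;
	}

	/* XXX - Should this refuse to read directories? */
	return fd;
}
```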

Open Source Projects

If you are writing an open source project, you expect people to look at your code, submit patches, and generally help you to improve the project. You should help these contributors. One of the most damning criticisms of the Mozilla project was that it was very hard to find one's way around it. Don't make the same mistake.

Why would anyone even consider contributing to your project? In most cases, people just want to make one or two small, specific changes: perhaps they want to fix an annoying core dump or add a useful command line option. They want to find the relevant section, fix the problem, and submit a patch. Get in quickly, get out in less than an hour.

You should encourage these people. How? By making it easy to find the problem spot quickly. How? Clean code is a must, of course, but good commenting practices can also make the task much easier.

Section-header comments (see above) tell the reader what a given passage does, and allow him to get a feel for the layout of the project before diving into the code itself. Cross-references, e.g.

int
parse_line(FILE *infile)
        /* This is used by (*parser->lang)() */

help to show how things are connected.
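
A slightly fuller sketch of the same idea follows, with cross-references pointing in both directions. Only parse_line() and parser->lang come from the fragment above. The layout of struct parser, the function parse_config(), and the behavior of both functions are assumptions made for the example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parser description */
struct parser {
	/* Language-specific entry point, e.g. parse_config() */
	int (*lang)(FILE *infile);
};

/* parse_line
 * Read one line from 'infile'.
 * Returns its length, or -1 at end of file.
 * This is used by (*parser->lang)(), e.g. parse_config().
 */
int
parse_line(FILE *infile)
{
	char buf[256];

	assert(infile != NULL);

	/* XXX - Lines longer than buf are counted more than once */
	if (fgets(buf, sizeof(buf), infile) == NULL)
		return -1;
	return (int) strlen(buf);
}

/* parse_config
 * Count the lines in 'infile'.
 * Meant to be installed as parser->lang. Calls parse_line().
 */
int
parse_config(FILE *infile)
{
	int lines = 0;

	while (parse_line(infile) >= 0)
		lines++;
	return lines;
}
```

Someone who lands in parse_line() can see who calls it, and someone reading struct parser can see where to look for an implementation.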

The bottom line is that if your code is too hard to read, people will find it easier to a) do nothing, b) submit a bug report and expect you to fix it, or c) switch to some other program that does the same thing as yours but is easier to hack.

Objections

The main objection to copious commenting, which Mr. Pike raises as well, is that if the comments and code repeat each other, you run the risk that comments and code will drift apart until they no longer bear any relation to each other. Hence, some conclude, it is better not to comment.

To me, this sounds like saying that a highway construction crew might forget to update the road signs when necessary, so therefore there shouldn't be any highway signs. Of course this is a risk, but in the vast majority of cases, it is nice to have signs that say where the road goes, and roads that go where the signs say.

If you use tables of data in your program, you still have to maintain them, even though the compiler can't tell you whether they're wrong. They're just as much a part of your program as the actual code. So it is with comments. Don't just maintain code. Maintain code and comments.

Commenting Out Code

Just for completeness, I'll also point out that commenting out code is a good way to temporarily delete code that you'll want to return to later. Note, though, that Perl- or C++-style comments allow you to have comments inside comments:

# This is a comment
# <old code> # old comment

whereas C-style comments do not:

/*
This is a comment
<old code> /* Old comment */
this is not commented
*/

Consider this the next time you're designing a language.
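
In C itself, a common workaround is to disable the code with the preprocessor rather than with a comment, since an #if 0 ... #endif block can enclose C-style comments. A minimal sketch, with a made-up function scale():

```c
#include <assert.h>

/* scale (hypothetical)
 * Return twice 'x'. Callers are expected to pass non-negative values.
 */
int
scale(int x)
{
	assert(x >= 0);

#if 0
	/* Old version, kept around while the new one is being tested */
	x = x * 2;		/* Double it */
	return x + 1;		/* XXX - Off by one? */
#endif	/* 0 */

	return x * 2;
}
```

The compiler never sees the old code, comments and all. The one thing to watch is that any #if/#endif pairs inside the disabled block must still balance.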

Conclusion

Good commenting practices can make code cleaner, easier to understand, easier to debug, and generally more fun to hack. And isn't that why we write code in the first place?


Author's bio: Andrew Arensburger has been writing Open Source code since the 1980s, before he knew what it was. In the daytime, he plays a mild-mannered system administrator to support his reading habit. He can be reached at arensb+freshmeat@ooblick.com.