Followers

Wednesday, July 30, 2008

No, your code is not so great that it doesn’t need comments

Posted in Software Development by Dan

Code-commenting is so basic and so universal that every programmer, regardless of the language that they practise, thinks that they know all there is to know and that their way is the only sensible approach (I am no different in this respect). I guess that’s why there are so many blog postings offering advice on commenting (you can add this one to the list).

Even the elite of programmer bloggers are having their say. Steve Yegge covered it and, more recently, so did Jeff Attwood. Jeff’s basic advice, that you wouldn’t need so many comments if you wrote the code to be more self-explanatory, is sound but the idea that we should be aiming for some kind of perfect code that has no need for any comments is dangerous.

It’s not a sensible goal for beginners and inexperienced developers. Tell them that they should write good code without any comments and they will deliver on the second part but struggle with the first. Even among experienced developers, assuming for a moment that it is possible to write perfect code that doesn’t require comments, there will be far fewer who are capable of this than there are who think that they are.

The other arguments against commenting are even weaker in my opinion. Yes, poor comments are …well… poor. So don’t write poor comments, write good ones. And yes, if comments become out-of-sync with the code then they are not helpful. So don’t let the comments become out-of-sync, they are part of your code and should be maintained/refactored along with the code itself.

I don’t believe that I’ve read a piece of code and thought “wow, this has far too many comments”. Unfortunately, I’ve had the opposite reaction all too often. I don’t for one moment believe that it is possible to write quality code without any comments. Take Jeff’s own example:

Here’s some code with no comments whatsoever:

r = n / 2;
while ( abs( r - (n/r) ) > t ) {
r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

Any idea what that bit of code does? It’s perfectly readable, but what the heck does it do?

Let’s add a comment.

// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

That must be what I was getting at, right? Some sort of pleasant, middle-of-the-road compromise between the two polar extremes of no comments whatsoever and carefully formatted epic poems every second line of code?

Not exactly. Rather than add a comment, I’d refactor to this:

private double SquareRootApproximation(n) {
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
r = 0.5 * ( r + (n/r) );
}
return r;
}
System.out.println( "r = " + SquareRootApproximation(r) );

I haven’t added a single comment, and yet this mysterious bit of code is now perfectly understandable.

Sorry Jeff, but that’s not “perfectly understandable”. I agree with extracting the square root code into a separate method with an appropriate name, but your second version (the one with the comment) was more informative since it mentioned which algorithm you were using (in your version, the maintainer is going to have to figure that out for themselves). Also, we’re still left with at least two poorly-named variables. We can forgive the use of n for the parameter since that’s kind of a convention but what the hell are r and t?

In my opinion, this is better:

/**
* Approximate the square root of n, to within the specified tolerance,
* using the Newton-Raphson method.

Original here
*/

private double approximateSquareRoot(double n, double tolerance)
{
double root = n / 2;
while (abs(root - (n / root)) > tolerance)
{
root = 0.5 * (root + (n / root));
}
return root;
}

Alternatively, if you don’t like the verbose comment at the top, you could either rename the method to something like newtonRaphsonSquareRoot (if you are happy for the method name to be tied to the implementation) or put an inline comment in the body explaining that this is the Newton-Raphson method. Any of the three variations will communicate useful extra information to the maintenance programmer, who can then Google “Newton-Raphson” if they want to find out more about it. Remember that code is written only once but read many times. It should be tailored for the reader rather than the writer.

This is all very well, but we’re still lacking some information. Why the hell is Jeff calculating square roots in this way? Why is he not using the library function? Is it because he doesn’t like the answers it gives him? Is it for performance? Who knows?

Well-written code will often answer the “what?” and “how?” questions with few or no comments, but you often also need to answer the “why?” question too. Avi Pilosof covers this in his response to Jeff’s post. Avi argues that rather than comment the code, you should comment the business justification for writing the code that way. This may mean inserting reference to particular requirements or issue reports.

So yes, favour code that is self-explanatory, but I don’t believe that you can always achieve the necessary clarity without a few well-placed comments to aid understanding. Code that is obvious to the author today is rarely obvious to the maintainer next year (or even to the author next month).

And if you still really believe that your code does not need any comments, then I hope we never have to work together.

No comments: