Stories about Software


Comments in Clean Code? Think Documentation

Editorial Note: I originally wrote this post for the SubMain blog.  You can check out the original here, at their site.  While you’re there, take a look at GhostDoc for your documentation needs.

Second Editorial Note: I recently appeared on the Ruby Rogues podcast and was interviewed by Paysa.  If you’re interested, check both of them out!

Notwithstanding some oddball calculator and hobby PC hacking, my first serious programming experience came in college.  A course called “Intro to C++” got us acquainted with arrays, loops, data structures and the like.  Given its introductory nature, this class did not pose a particularly serious challenge (that would come later).  So, with all of the maturity generally possessed by 18-year-olds, we had a bit of fun.

I recall contests to see how much application logic we could jam into the loop conditions, and contests to see how much code could be packed onto one line.  These sorts of scavenger hunt activities obviously produced dense, illegible code.  But then, that was kind of the point.

Beyond these silly hijinks, however, a culture of code illegibility permeated this campus and, I would learn later, others.  Professors nominally encouraged readable, commented code.  After all, comments facilitated partial credit in the event of a half-baked homework submission.  Even so, the mystique of the ingenious but inscrutable algorithm pervaded the culture for both students and faculty.  I had occasion to see code written by various professors, and I noticed no comments that I can recall.

Professionalism via Thoroughness

When I graduated from college, I carried this culture with me.  But not for long.  I took a job where I spent most of my days working on driver and kernel module programming.  There, I noticed that the grizzled veterans to whom I looked up meticulously documented their code.  Above each function sat a neat, orderly comment containing information about its purpose, parameters, return values, and modification history.

This, I realized, was how professionals conducted themselves.  I was hooked.  Fresh out of college, and looking to impress the world, I sought to distinguish myself from my undisciplined student ways.  This decision ushered in a period of many years in which I documented my code with near religious fervor.

My habit included, obviously, the method headers that I emulated.  But on top of that, I added class headers and regularly peppered my code with line comments that offered such wisdom as “increment the loop counter until the end of the array.”  (Okay, probably not that bad, but you get the idea.)  I also wrote lengthy readme documents for posterity and maintenance programmers alike.  My professionalism knew no bounds.

Clean Code as Plot Twist

Eventually, I moved on from that job, but carried my habits with me.  I wrote different code for different purposes in different domains, but stayed consistent in my commenting diligence.  This I wore as a badge of pride.

While I was growing in my career, I started to draw inspiration from the clean code movement.  I began to write unit tests, I practiced the SOLID principles, I watched Uncle Bob talks, made my methods small, and sought to convince others to do the same.  Through it all, I continued to write comments.

But then something disconcerting happened.  In the clean code circles I followed and aspired to, I started to see posts like this one.  In it, the author had written extensively about comments as a code smell.

Comments are a great example of something that seems like a Good Thing, but turn out to cause more harm than good.

For a while, I dismissed this heresy as an exception to the general right-thinking of the clean code movement.  I ignored it.  But it nagged at me nonetheless, and eventually, I had to confront it.

When I finally did, I realized that I had continued to double down on a practice simply because I had done it for so long.  In other words, the extensive commenting represented a ritual of diligence rather than something in which I genuinely saw value.

Down with Comments

Once the floodgates had opened, I did an about-face.  I completely stopped writing comments of any sort whatsoever, unless it was part of the standard of the group I was working with.

The clean coder rationale flooded over me and made sense.  Instead of writing inline comments, make the code self-documenting.  Instead of comments in general, write unit and acceptance tests that describe the desired behaviors.  If you need to explain in English what your code does, you have failed to explain with your code.
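As a minimal sketch of what “self-documenting” can look like, consider the Java below.  The shipping rule, the class, and every name in it are hypothetical, invented purely for illustration.  The idea is that named constants and small, well-named methods carry the information that inline comments would otherwise have to supply.

```java
import java.util.List;

// Hypothetical example: the shipping rule and all names are invented for illustration.
final class OrderTotals {
    // Named constants replace magic numbers that would otherwise need a comment.
    static final int FREE_SHIPPING_THRESHOLD_CENTS = 5_000;
    static final int FLAT_SHIPPING_FEE_CENTS = 499;

    private OrderTotals() {}

    // A comment-dependent version of this method might have read:
    //   int t = 0; for (int p : ps) t += p;   // add up prices
    //   if (t < 5000) t += 499;               // charge shipping on small orders
    public static int totalWithShippingCents(List<Integer> itemPricesCents) {
        int subtotalCents = sumOf(itemPricesCents);
        return qualifiesForFreeShipping(subtotalCents)
                ? subtotalCents
                : subtotalCents + FLAT_SHIPPING_FEE_CENTS;
    }

    // Adds up a list of amounts expressed in cents.
    public static int sumOf(List<Integer> amountsCents) {
        int sum = 0;
        for (int amount : amountsCents) {
            sum += amount;
        }
        return sum;
    }

    // True when the subtotal meets or exceeds the free-shipping threshold.
    public static boolean qualifiesForFreeShipping(int subtotalCents) {
        return subtotalCents >= FREE_SHIPPING_THRESHOLD_CENTS;
    }

    public static void main(String[] args) {
        System.out.println(totalWithShippingCents(List.of(1_000, 2_000)));
    }
}
```

Whether the extra methods are worth it is a judgment call, but the names cannot drift from the behavior quite as quietly as a comment can.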

Probably most compelling of all, though, was the tendency that I’d noticed for comments to rot.  I cannot begin to estimate how many times I dutifully wrote comments about a method, only to return a year later and see that the method had been changed while the comments had not.  My once-helpful comments now lied to anyone reading them, making me look either negligent or like an idiot.  Comments represented duplication of knowledge, and duplication of knowledge did what it always does: it got out of sync.

My commenting days were over.

Best of All Worlds

That still holds true to this day.  I do not comment my code in the traditional sense.  Instead, I write copious amounts of unit, integration and acceptance tests to demonstrate intent.  And, where necessary and valuable, I generate documentation.
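To make “tests that demonstrate intent” a little more concrete, here is a minimal sketch in plain Java.  It assumes no test framework (a real project would more likely use one), and the leap-year rule and all names are hypothetical stand-ins.  Each check states a behavior in words and fails loudly if the code stops honoring it, which is exactly what a stale comment cannot do.

```java
// Hypothetical example: a tiny hand-rolled "spec" with no test framework assumed.
final class LeapYearSpec {
    private LeapYearSpec() {}

    // The code under test: the Gregorian leap-year rule.
    public static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Fails with the plain-English behavior description when a check does not hold.
    public static void check(boolean condition, String behavior) {
        if (!condition) {
            throw new AssertionError("Broken behavior: " + behavior);
        }
    }

    public static void main(String[] args) {
        check(isLeapYear(2024), "years divisible by 4 are leap years");
        check(!isLeapYear(2023), "years not divisible by 4 are not leap years");
        check(!isLeapYear(1900), "century years are not leap years");
        check(isLeapYear(2000), "century years divisible by 400 are leap years");
        System.out.println("All described behaviors hold.");
    }
}
```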

Let’s not confuse documentation and commenting.  Commenting code targets maintenance programmers and team members as the intended audience.  Documenting, on the other hand, targets external consumers.  For instance, if I maintained a library at a large organization, and other teams used that library, they would be external consumers rather than team members.  In effect, they constitute customers.

If we think of API consumers as customers, then generating examples and documentation becomes critically important.  In a sense, this activity is the equivalent of designing an intuitive interface for end-users of a GUI application.  They need to understand how to quickly and effectively make the most of what you offer.

So if you’re like me, and you believe firmly in the tenets of the clean code movement, understand that comments and documentation are not the same thing.  Also understand that documentation has real business value and occupies an important role in what we do.  Documentation may take the form of actual help documents, files, or XML-doc style comments that appear in IntelliSense implementations.
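XML-doc comments are a .NET convention.  As a rough analogue, here is a sketch using Javadoc, which plays a similar role for Java APIs and typically surfaces in IDE tooltips and generated reference pages.  The class and method are hypothetical.  Note that the doc comment describes the contract an external consumer relies on (inputs, output, failure mode), not how the implementation works.

```java
// Hypothetical public API, used only to illustrate consumer-facing documentation.
final class Temperatures {
    /** Absolute zero in degrees Celsius; no valid temperature lies below it. */
    public static final double ABSOLUTE_ZERO_CELSIUS = -273.15;

    private Temperatures() {}

    /**
     * Converts a temperature from degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius the temperature in degrees Celsius; must not be below
     *                {@link #ABSOLUTE_ZERO_CELSIUS}
     * @return the equivalent temperature in degrees Fahrenheit
     * @throws IllegalArgumentException if {@code celsius} is below absolute zero
     */
    public static double celsiusToFahrenheit(double celsius) {
        if (celsius < ABSOLUTE_ZERO_CELSIUS) {
            throw new IllegalArgumentException(
                    "Temperature below absolute zero: " + celsius);
        }
        return celsius * 9.0 / 5.0 + 32.0;
    }

    public static void main(String[] args) {
        System.out.println(celsiusToFahrenheit(100.0));
    }
}
```

Because this text lives next to the signature and can be published by a documentation generator as part of the build, it is a candidate for the kind of automation described below rather than a hand-maintained duplicate.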

To achieve the best of all worlds, avoid duplication.  Make publishing documentation and examples a part of your process and, better yet, automate these activities.  Your code will stay clean and maintainable and your API users will be well-informed and empowered to use your code.

  • Comments about _what_ the code does are indeed pointless, but information about _why_ it does what it does is very often beyond what clean code allows you to express. And that information can at times be very helpful.

    • Most of the comments I’ve ever seen explaining the “why” of an implementation are more or less either excuses or apologies, both of which tend to make me sympathize with the author, but without finding the information particularly useful. YMMV. But if I find myself apologizing or explaining myself to a future programmer in the code, I generally stop and ask instead how I could change things to make apologies/explanations unneeded.

      • Sometimes it involves business constraints, sometimes it’s because of library or technology limitations, and sometimes it’s due to edge-cases which aren’t apparent at first (or even second) glance, which require the specifics of the current implementation. None of those classify as apologies.

        • Couldn’t edge cases and tech specifics be expressed with automated tests or code contracts? And couldn’t business constraints be expressed through the ALM/requirements tool?

          (I’m not being Socratic — I’m interested in why these things wouldn’t work where comments would)

          • They could be expressed in unit tests, in the sense that those tests make sure that the edge cases are covered, but that still does not tell the developers why those edge cases exist (which is something that a comment on the unit test could fix, but then why not add it to the code itself?).

            As to code contracts, either those live in the code and we’re giving ‘comments’ another name, or they live outside the code, meaning the developer has to know to go look at them in this specific case (meaning they already knew about the edge case somehow) or they always have to go read through them. Not to mention that any text outside the code is much more likely to lose consistency with what the code actually does.

            All in all, the main reason why I’d want ‘why’ comments on code (when useful) is because they are more likely to give the developer the needed info, even when time is short, when documentation or requirement spec is out of date or badly managed, and/or when you’re dealing with legacy code.

          • FWIW, what I meant about code contracts was automated enforcement. For instance, you could use an annotation/attribute (Java/C#, respectively, depending on your poison) to specify that a parameter cannot be null, and then choose an automated enforcement paradigm (static check, runtime exception, etc). Generally speaking, I always look for automatable constructs whenever possible.

            I see the convenience of having “why” documented inline versus on some Sharepoint site somewhere, but my experience has always been that these “why” comments tend to age just as badly as the out of band specs, unless it’s the same person who made the original comment always making changes.

            As a consumer of them, after people have left, I almost invariably chalk them up as noise. Reason being, I have to make a series of assumptions. “I’ll believe this comment, assuming the things that were true when written are still true, assuming that guy knew what he was talking about, assuming no one has changed the code in the interim, assuming… etc.”

          • Indeed, those tools and the fact regarding aging do mean that comments (even ‘why’ comments) need to be used judiciously, both in relation to adding them and to taking them at face value. That’s why I treat them in code review as being equally important as ‘other code’ (whether that is a team attitude or not will of course mean YMMV).

            In any case, since it might not have been apparent, thanks for the post. As per usual, it was an interesting read, as was this follow-up 🙂

  • If we were working on the same project, I would tell you to comment your &*☠# code. 😉

    I like your stance on “documentation” (I call them contract comments) but would go a little farther – giving each class an explanatory, high-level paragraph of what it aims to do really helps maintainers navigate a larger code base.

    I think you’re missing context comments (technical and historical), though. They largely address the why, something neither clean code nor tests can even begin to do. These are immensely helpful when debugging or extending code and age well as they usually describe a specific point in time (“A and B didn’t work out because of … so we went with C”), which means they do not start lying over time.

    • It seems like those wouldn’t rot until someone went in later and did B instead, after all. Then you’d have a comment about B explaining that A and B don’t work, so the implementation is really C. At least, until someone comes along and mercifully deletes all the comments because that would be incredibly confusing.

      It’s hard for me to speak in generalities on the subject, without looking at specific code (e.g. if you had an example on Github or something we could look at), but whenever I find myself looking at departed developers’ explanations of “why” I usually just add a bunch of characterization tests, ignore the comments, and make the requisite changes. Those things could theoretically be useful, but the reality is usually that there’s no way to know if the original assumptions are still true or if that person knew what he or she was doing. “I’m using recursion for this Fibonacci number calculator because Bill the architect thinks iterative algorithms are a tool of the devil.”

      It’s hard to talk about team norms in generalities, though. If people like to explain themselves in code with comments, I have no real objection. My eyes tend to scan over comments as noise, so it has little bearing on me. On the other end of it, I can’t recall being on any project where anyone requested comments for any reason other than conformance to a standard. If someone asked for this and made a compelling case for how they’d use it, I wouldn’t be opposed to doing it.

      • A recent example that made the rounds on Twitter: link.

        My opinion on context comments is that they’re not meant to always be true. They are much like commit messages or issue descriptions (but far easier to access) in that they reflect a truth at a certain point in time. They give you context of why a decision was made and of assumptions that were true at the time.

        So deciding whether or not the original assumptions (as well as conclusions by the way) are still true is very much the point of context comments. Unlike contracts they don’t promise anything, so them being wrong is much less of an issue.

        • I think it must be a matter of what you’re used to seeing or expect to see. For instance, when I look at the snippet in that tweet, my immediate thought is, “gah, who cares, where are the tests?” And I realize that even over the years where I added comments to code religiously, I didn’t really consume them as anything other than considering them a code smell.

          For instance, imagine a codebase we both recently looked at, and imagine the guy in charge of that codebase for a time leaving a bunch of (questionable) explanations of what he did and why. With stuff like that, unless I wanted to put together a psychoanalysis of that guy, I would just consider his writing to be noise.

          If you have a tight-knit team with a norm of communicating that way, I imagine that form of communication, essentially inline design documents, could be valuable. But I’ve spent most of my career modifying code written by long-departed developers, who often commented only because someone would crack their knuckles with a ruler if they didn’t. The broad effect of that experience was to cause code comments to register in my brain only as noise.

        • ratliffchrisb

          I’m not convinced code shouldn’t have comments, but this code would be much easier to read if it followed other clean code values, and it doesn’t really prove comments are required. Some obvious problems are:

          There are no returns in this block of code. The return conditions aren’t at the head of the code block. Even assuming this is to refer to only the ORACLE case this makes the visible block of code more confusing.

          I’m guessing this is an abbreviation for context. It can be seen the object has contextual metadata, but it also appears to be executing statements and returning results. This is clearly much more than context and horribly named as such. A fitting name alone would make this code much easier to read.

          e.getErrorCode() == 17283
          A define for ORACLE_ERROR_NO_RULESET would make things much clearer.

          I’m guessing this is here as a process requirement for having to explain bug fixes in comments, but from what I can see the inclusion of why they didn’t do something generates more confusion than its absence.

          There are a couple smaller issues, but I think in this case good code would not need comments. But of course, all of us write good code and agree what good code is. Similarly, I’ve seen some of the most ardent “No comments ever” supporters write incomprehensible code where comments would greatly improve readability.

          • Your comments regarding returns and ctx could be discussed but as far as I can tell they do not touch on the first of the three comments – it would still be needed.

            You are right about ORACLE_ERROR_NO_RULESET but that does not replace the rest of the comment, which is really helpful to guard against future programmers thinking “WTF, this can’t happen!”

            I understand the final comment to say, “executeQuery() would have been the obvious choice but it doesn’t work for XYZ (see #1232 where we got burned by this)”. That being said, there is only one guy working on that code base so any process he follows seems to provide value for him.

  • dave falkner

    We can do this the easy way: “Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’ Improve the code and then document it to make it even clearer.” ― Steve McConnell

    Or we can do this the hard way: “Every time you write a comment, you should grimace and feel the failure of your ability of expression.” ― Robert C. Martin

    • “There is some fiction in your truth, and some truth in your fiction. To know the truth, you must risk everything.” ― Neo

      Two ways to choose from but both lead astray.