Stories about Software


Language Basics from Unit Tests

Let’s say that in a green field code base someone puts together a type that conceptually is a collection of non-integer values. For the sake of discussion, let’s call it a graph. A graph object might store a series of two-element tuples or perhaps a series of some value type like “point.” The graph might then perform operations on this data, such as IncreaseX() or IncreaseY() or Invert() or Divide()–operations that iterate through the points and do things to them. The actual mechanics of this don’t matter a whole lot. It’s the concept that’s important.

Now let’s say that in the graph the internal representation of the points is a floating point data type such as, well, float. I’m going to save the nuance of floating point arithmetic for a future practical math post, but suffice it say that floats can exhibit some weird-seeming behavior when it comes to comparisons, truncation/rounding, certain kinds of casting and type representations, etc.

And let’s also say that the person responsible for authoring this graph class hasn’t read a practical math post about floating point arithmetic and is completely oblivious to these potential pitfalls.

And, finally, let’s say that this graph class becomes a mainstay of the business logic in a particular application. It’s modified, extended, and relied heavily upon without a whole lot of attention paid to its internal workings. At least until stuff mysteriously doesn’t work. But when that happens, the culprit isn’t immediately obvious, so strange work-arounds and cargo-cult, oddball solutions spring up when symptoms occur. Extension methods are written, and sometimes entirely different modules are added to the code base because the existing one is “tricky” or “not to be trusted.”

At the application level, this causes maintenance issues, a lot of heated and fruitless arguments, and voodoo approaches to code. From a user interface perspective, this causes quirky behavior. Occasionally a linear graph is completely displaced out of the graph and rendered on some menu somewhere, or the screen goes blank for a few seconds and then the display is restored. Defects and defect reports are created and developers dispatched to track down the issue, but after a few days of fruitless efforts, some project manager quietly sets the defect’s priority from “critical” to “cosmetic” and the software is shipped. It’s embarrassing, but whatcha gonna do. Ya know, computers have a mind of their own sometimes!


Catching it Early

What if, instead of doing things the old-fashioned but all-too-common way, the authors of this code had been writing unit tests and/or practicing TDD? Well, there’s a very good chance that the issue stemming from the graph library is caught immediately as its API methods are being fleshed out from a functionality perspective. There’s a good chance that someone is writing a test and gets to the point that we were at in the code sample above, where they are utterly dumbfounded as to why 1+1 does not equal 2 in float land.

And then, good things happen. The developer in question takes to google or stack overflow, or perhaps he talks to other, more experienced developers on his team. He then gets an explanation, learns something about the language, and leaves the code in a correct state. Contrast this with the non-tested approach of “code it up, build a bad house on the bad foundation, and then ship the result because it’s too late.”

And what if the TDD/unit tests don’t expose this issue? Well, what they’ll do in either case is decouple the code base. So when the issue eventually does crop up via weird GUI behavior, it will be much easier to isolate. When it’s isolated, it will be much easier for the unit-test-savvy developers to write a test that exposes the defect to learn the lesson and fix the issue. It’s still a win.

The point about unit tests helping catch errors and leading to a more decoupled design is hardly controversial. But the benefits go beyond that. Unit tests provide a fast feedback loop for all points in the code base, which lends itself very well to poking and prodding things and experimenting. And that, in turn, leads to better understanding of not only the code, but also the language. If you can execute and get feedback on code extremely quickly, you’re much more likely to ask questions like, “I wonder what happens if I do x…” and then to do it and see. And that sort of experimentation, much like immersion in natural language, leads much more quickly to fluency.