How Much Unit Testing is Enough?
Editorial note: I originally wrote this post for the TechTown blog. You can check out the original here, at their site. While you’re there, have a look at their training and course offerings.
I work as an independent consultant, and I also happen to have created a book and some courses about unit testing. So I field a lot of questions about the subject. In fact, companies sometimes bring me in specifically to help their developers adopt the practice. And those efforts usually fall into a predictable pattern.
Up until the moment that I arrive, most of the developers have spent zero time unit testing. But nevertheless, they have all found something to occupy them full time at the office. So they must now choose: do less of something else or start working overtime for the sake of unit testing.
This choice obviously inspires little enthusiasm. Most developers I meet in my travels are conscientious folks. They’ve been giving it their all, so they interpret this new thing that takes them away from their duties as a hindrance. And, in spite of signing a contract to bring me in, management harbors the same worry. While they buy into the long-term benefit, they fret over the short-term cost.
All of this leads to a very predictable question. How much unit testing is enough?
Clients invariably ask me this, and usually they ask it almost immediately. They want to understand how to spend the minimum amount of time required to realize benefits, but not a second longer. This way they can return to the pressing matters they’ve delayed for the sake of learning this new skill.
Defining Unit Testing
Now, before we go any further, let’s establish a working definition of unit testing. You might have something specific in mind when you hear this term, but it does cause a fair bit of confusion.
For instance, early in my career, I remember a dev manager with a curious (to me) definition of unit testing. He meant making changes to the application and then running it to see what happened. He reasoned that if you’d just, say, added a new text box to a form, that addition represented a “unit.” So next you would “unit test” by compiling and running the app and observing that it now, in fact, had your new text box. Given his definition of the word unit, it’s hard to argue with his semantics. Nevertheless, I do not plan to use this definition.
Instead, I’ll go with something much closer to Martin Fowler’s definition or what the folks at Industrial Logic call a microtest. The terminology gets confusing because of the vagueness of the term “unit.” Let’s address this by thinking of a unit as a method/function in your code. You use unit tests to test methods in your code.
Now, moving beyond that, we can describe their specific properties, so as not to confuse them with things like integration tests, system tests, and acceptance tests. First, we’ll only talk about automated tests, which knocks the aforementioned dev manager’s definition right out. In terms of scope, unit tests operate on pieces of code in your codebase in isolation, and they run quickly. If a test triggers a file write or the coordination of many classes in your codebase, that’s not a unit test. And, finally, done right, these things tend to be small: only a handful of lines of code each.
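To make that definition concrete, here’s a minimal sketch of what I mean, written in Python purely for illustration. The function and test names are hypothetical, not taken from any real codebase. Each test exercises a single function in memory, touches no files or other classes, and fits in a couple of lines.

```python
def is_leap_year(year):
    """Hypothetical production code: the "unit" under test."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Each test checks one behavior of one function, in isolation.
def test_century_not_divisible_by_400_is_not_leap():
    assert not is_leap_year(1900)


def test_year_divisible_by_400_is_leap():
    assert is_leap_year(2000)
```

A test runner such as pytest would pick up functions named this way, but nothing here depends on a particular framework.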
How Teams Quantify Their Unit Testing
In my experience, teams in the early stages of adoption try to quantify the testing effort in one of three basic ways. I’ll list those quickly and then describe them in more detail.
- “Amount” of test code (number of unit tests or lines of test code)
- Code coverage (usually statement or branch coverage)
- Time spent on the test suite
For the first consideration, teams decide to shoot for some amount of unit test code. I’ve seen more variants than I can really recall, but some common ones include raw number of tests, raw lines of test code, number of tests per class, and number of tests per method. Whatever the flavor, they chase some ratio of test/verification code to production code.
Second, some aspiring unit test practitioners pursue coverage. When people say that their test suite covers a given statement or branch in the codebase, they mean that some test in the suite causes that statement to execute or traverses that branch. Coverage tells you nothing but this.
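Here’s a small sketch of the difference between statement and branch coverage, again in Python and again with hypothetical names. The first test alone executes every statement in the function, yet it only ever takes the “true” side of the if. The second test is what traverses the other branch.

```python
def final_price_cents(price_cents, is_member):
    """Hypothetical example: members get 10% off, in whole cents."""
    if is_member:
        price_cents -= price_cents // 10
    return price_cents


def test_member_gets_ten_percent_off():
    # Runs every statement above, but only the branch where is_member is true.
    assert final_price_cents(1000, True) == 900


def test_non_member_pays_full_price():
    # Traverses the remaining branch, where the if body is skipped.
    assert final_price_cents(1000, False) == 1000
```

Note that coverage would report exactly the same numbers if you deleted the assert keywords and just called the function.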
And, finally, I’ll see them establish an amount of time to spend on the suite. Generally, this means a percentage of the team’s work week. “From now on, spend 90% of your time writing code and 10% working on unit tests.”
One or more of these things, they reason, will ensure that the team does “enough” unit testing.
Getting Past the Proxies
Let’s now get down to brass tacks. When management decides on some metric like this, they’re pursuing some kind of profit and loss goal. In other words, management doesn’t say, “gosh, I’d love to work at a company with 65% code coverage.” Instead, they say something more like this.
“Our reputation is taking an absolute beating lately with all of these bugs. We need to fix this, and I read on cio.com that unit testing can fix that. But we can’t just tell the developers to unit test — we need to measure adoption of the practice. We’ll know we’re in good shape when test coverage hits 65%.”
The unit test quantification thus represents a proxy for what the business actually needs. In this case, it needs to stop shipping code with defects to stem the tide of dissatisfied customers.
That tide translates into lost money. So the business suddenly becomes willing to hire some consultant to come in and to help stop the bleeding. The business views the unit test initiative as an investment. But the business struggles to understand when it realizes its return on this investment, so it guesses with proxies.
Cost-Benefit of a Test Suite
I’ll punt for now on the matter of “enough” unit testing. First, I want to establish that your test suite has benefits and it also has costs.
To drive this point home, consider an extreme example. Imagine an expansive test suite. The development team spent months implementing it. It involves tens of thousands of tests, and those tests cover 100% of all statements and branches in your entire codebase. But the test suite never asserts (tests) anything. That’s right. Its tests just call methods in the code and care not a whit what happens. This test suite hits all of the metrics, but it isn’t enough. How could it be enough when it confers zero benefit to the business?
This test suite does have a cost, though. The team spent months on it and they collect salary. As they change the code in the future, they’ll need to continue changing the tests, which takes more effort. And that doesn’t account for opportunity cost, either.
Now you won’t wind up implementing a test suite with cost and no benefit. But neither will (can) you implement one with all benefit and no cost. Instead, you can only implement a unit testing suite as efficiently as possible — trying to realize the most benefit for the least cost, ensuring that the thing pays for itself.
Enough Unit Testing
That leads back to the tricky question of how much represents enough. If you set up a proxy metric, you’ll hit it. I promise. Your team will write 1,000 tests or achieve 65% coverage, even if it has to write zero-benefit unit tests in order to hit those goals. I’ve seen that play out over and over again.
So forget the proxy goals and understand that it’s really hard to answer the core question here. Enough is going to vary a lot, depending on context, business needs, team skill, and many more factors. But what won’t vary is the business calculation involved. You have enough unit testing when adding additional tests costs you more than those tests will save you. This may mean simple dollars, or it might be a function of risk.
If that seems wishy-washy, I have good news. While you won’t ever arrive at the point where you can say, “okay, stop at unit test 3,882 because its marginal cost exceeds its marginal benefit,” you can develop an intuitive feel for this as a group. As your team upskills in unit testing, it becomes more efficient at it, bringing the cost way down. It also develops a sense for these diminishing returns. “I’ve covered all of the decision points in this method, so adding more tests now would be a waste.”
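As one illustration of that sense for diminishing returns, consider this sketch in Python. The function, the threshold, and the fee are all made up for the example. The method has a single decision point, and the two tests sit on either side of it.

```python
def shipping_fee_cents(order_total_cents):
    """Hypothetical rule: orders of 5000 cents or more ship free."""
    if order_total_cents >= 5000:
        return 0
    return 499


def test_order_at_threshold_ships_free():
    assert shipping_fee_cents(5000) == 0


def test_order_just_below_threshold_pays_fee():
    assert shipping_fee_cents(4999) == 499
```

A third test for, say, a 7500 cent order would walk the same path as the first one. It would add something to maintain whenever the rule changes while catching little that the existing two don’t, which is the kind of call a practiced team learns to make.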
Unit testing is a long play and a long-range investment, but I’ve rarely seen shops express regret for making it. So how much unit testing is enough? Get your team really good at doing it, and, using their knowledge of your business and code, they’ll be able to tell you far better than I can.