Intro to Unit Testing 8: Test Suite Management and Build Integration

It’s been over a month now since my last post in this series, and for that I sort of apologize. I think I’ve been channelling all of my instructive energy into my now-finished Pluralsight course, leaving the blog largely for opinions, screeds, and a random hiring announcement. So, let’s get back on track and wrap this thing up. I have this post and another one slated and then we can call it a day.

So far, I’ve talked quite a lot about how and when (and when not) to write unit tests. I’ve offered up some techniques for helping you isolate the classes that you want to test, including the use of test doubles. And finally, I offered some advice on how to get people to leave you alone and let you write tests. So now I’d like to turn and offer some advice beyond just writing the things. You need to live with them, manage them and leverage them over the course of time.

Managing the Suite

You’ve built them. So, now what? At some point, you’ll wonder exactly when you’re getting started. For the first few or even few dozen classes you test, you’ll alternate between some exasperation at spending extra time doing something new and satisfaction at, well, doing something new. But then, at some point, you’ll be sitting around and notice that your test suite has like 400 tests and think, “wow, that’s a lot of code… do I really want all this?”

That feeling will hit you even harder when you go to change something under a tight deadline and your real quick change makes a test go red. You’re pretty sure the test is broken because it was testing the old way of doing things, so you really just want to comment out the test and you wonder why it’s such a pain to change the code. Why do you have to waste so much time to change one line of code?

The answer to these questions lies in practice but also effective test suite management. If you let the unit test suite become a boat anchor, it will drag you down. Your frustration will be real and reasonable, rather than just a temporary product of you being in a hurry and unfamiliar with working in a code base under test. You need to take care to prevent this from happening, and I’m going to tell you how in this section.

Name Your Tests Clearly and Be Wordy

When you’re writing a unit test, you’re looking at code. But when you’re running your test suite, you aren’t most of the time, and when you’re trying to understand why a run or a build failed, you’re never looking at code. When the test suite is failing, you don’t want to waste time figuring out why. And having to open the IDE, navigate to the test, read the code and figure out the problem is a waste of time.

Don’t give your test methods names like “Test24″ or “CustomerTest” or something. Instead, give them names like “Customer_IsValid_Returns_False_When_Customer_SocialSecurityNumber_Is_Empty”. That method name may seem ridiculous, especially if you’re used to giving methods short names, but trust me, you’ll be thankful for it. When your build is failing, which of these method names would you rather see an X next to? Would you rather be saying “looks like test 24 is failing,” or would you rather be saying, “oh, I wonder why someone made it so that an empty SSN is now considered valid?” If you say the first one, you’re lying.

This may seem unimportant in the scheme of things, but it’s the difference between associating frustration and confusion with your test suite and viewing it as a warning system for potentially undesirable changes. The test suite needs to be communicating clearly to you what’s wrong. Descriptive test names help do that and they help you identify whether it’s your code or the test itself that needs to be changed in the face of changing requirements.

Make Your Test Suite Fast

Ruthlessly delete and cull out slow tests. I can’t say it more plainly than that. A good test suite runs in seconds, max. If yours starts to take minutes, or God forbid, hours, then it’s rotting and becoming useless to you. Think of it this way — if it takes several minutes to run the test suite, how often are you going to do it? Every time you make a change, or just when you check in? If it takes hours, will you ever run it voluntarily?

If your test suite takes a long time to run, nobody will run it. Short feedback loops are of paramount importance to developers, and we optimize for efficiency. If the unit test suite is inefficient, we’ll find other ways to get feedback. As such, it is incredibly important to ensure that your test suite always runs quickly. Treat it as if the rest of your team were waiting for any legitimate excuse not to use the test suite, and don’t let inefficiency be that excuse.

Test Code is First Class Code

A common mistake that I see among those relatively new to testing is test code that’s something of a mess. The code will be brittle, heavily duplicated, weird, and hard to read. In short, your tests and test classes will contain code that you wouldn’t be caught dead putting into production.

Don’t do that. Treat your test code as if it were any other code. Eliminate duplication. Factor common functionality out into methods. Be descriptive with naming and with the flow of the method. Keep that code clean. I get that there’s a desire when it comes to testing to make as much of a mess as possible in the “bug bash” sense of throwing chaos at the situation and proving that your code can handle it, but the chaos needs to be controlled, and you can control it by keeping your test code clean and maintainable. If the tests are clean and easy to maintain, people won’t mind going in periodically to make an adjustment. If they’re unruly, people will get annoyed and comment them out or stop running them.

Have a Single Assertion per Test

This is a subtle one, but it also goes toward maintainability. If you start writing tests that have 20 asserts in them, you may feel good that you’re exercising a whole section of the code, but really you’re making things hard for yourself later. If all 20 tests pass (or at least the first 19), then all will be executed. But if the first one fails, none of the rest get executed. This means that in test methods with lots of asserts, it’s not always clear where they’re failing, which means it’s not always clear what’s going wrong.

In order for your test suite to be an asset, it has to be a clear indicator of what’s going wrong. Which would you find more useful in your car: a series of many different lights with helpful diagrams that lit up to indicate a problem, or one unlabeled red light that came on whenever anything at all was wrong? If you had that latter light and it could mean anything from your gas being low to you being out of wiper fluid to imminent destruction of your transmission, I bet you’d just start ignoring it after a while.

Don’t Share State Between Your Tests

There is no more surefire way to drive yourself insane at some future date than by storing some kind of application state among unit tests being executed. What I mean is if you have some test A that declares sets a global counter variable to 1, and then you have another test B that depends on the global counter being set to 1 in order for it to succeed, you are in for a world of hurt.

The problem is that there is no guarantee that the unit test runner will execute the tests in any particular order. What’s likely to happen is that your tests get executed in a particular order whenever you run them on your machine, so everything goes fine. But when the build machine runs them they fail. Weird. So you check them on your friend Bob’s machine, and they pass there. But on Alice’s machine, they fail. If you didn’t already know why this was happening because I just told you, can you imagine how much of your hair you’d pull out? You’d probably be checking the IDE version on those machines, compiler information, OS settings, and God only knows what else. It’d be a wild goose chase.

And imagine if it worked on everyone’s machine initially and then six months later started failing occasionally on the build machine. Machine isn’t the only failing dimension — there’s also time. So please, whatever you do, do not have your unit tests depend on the execution of a previous test. This practice, more than any other, is likely to lead to a rage-quitting of unit testing as a practice where you simply take all of them out of the build.

Encourage Others and Keep Them Invested

This sounds like a strange one to round out the section, but it’s important. If you’re the only one fighting the good fight with unit tests, it becomes daunting and exasperating. Everyone else’s reaction to failing tests is annoyance and they’re waiting for excuses just to stop altogether. You wind up feeling that you’re in an adversarial relationship with the team (I speak from experience here). But if you get others to buy in, you’re not shouldering the burden alone and you have help keeping the suite healthy and helpful.

Build Integration

When you first start out unit testing, the tests will be sort of disorganized and haphazard. You’ll write a few to get the hang of it and then maybe discard them. After a bit of that, you’ll start checking them into your solution (unless you’re an incorrigible weirdo or a liar). You do that, and the suite grows and, ideally, everyone is running it locally to keep things clean and be notified of potential breaking changes.

But you have to take it beyond that at some point if you want to realize the full value of the unit tests. They can’t just be a thing everyone remembers to do locally on pain of nagging emails or because someone will buy the team donuts or some other peer-pressure-oriented demerit system. Failing unit tests have to have real (read: automated) consequences. And the best way to do this is to make it so that failing unit tests mean a failing build.

If you’re in a shop that’s not as formal, this may be difficult at first. One handicap may be that you’re reading this and saying “what do you mean by ‘the build?'” If what you do is write code and take some kind of executable out of your project’s output directory on your machine and push it to a server or to your users, you’ve got some work to do before you think about integrating unit tests. You need a build. A build is an automated process by which your source code is turned into a production-ready, deployable package. And it’s automated in the sense that it doesn’t involve you hitting Ctrl-Shift-B or Ctrl-F6 or whatever you do manually in your IDE to build. The Build, with a capital B, is a process that checks your code out of source control, builds it, runs checks and whatever else is necessary, perhaps increments the versioning of the executables, etc., and then spits out the final product that will be pushed to a server or burned onto a DVD or whatever. If you want to read more about build tools, you can google around about TeamCity, CruiseControl, TFS, FinalBuilder, Jenkins etc. And you don’t have to use a product like that — you can create your own using shell scripts or code if you choose.

Because of all the different options when it comes to programming languages, unit test technologies and build tools, I’m not going to offer a tutorial on how to integrate unit tests into your build. To be comprehensive, I’d need to give dozens of such tutorials. But what I will say is that your integration is going to take the same basic format no matter what tools you’re using. The build is a series of steps that passes if everything goes smoothly and the deliverables are ultimately generated. If a step in the build fails, then the build itself fails. What you need to do is add a step that involves running the unit tests. With this in place, you’re creating a situation where any failing unit test means that the entire build fails.

Conceptually, this is pretty straightforward. Unit test runners can be run in command line fashion and they’ll generate a return value of some kind. So the build tool needs to examine the test runner’s output for an error code. If it finds one, it puts the brakes on the whole operation.

It may seem extreme at first to torpedo the whole build because of a failing unit test, but when you think about it, what else should possibly happen? Why would you want a process that allowed you to ship code knowing that it was defective in a way that it didn’t used to be? That’s amateur hour. And, what’s more is that if your team starts understanding that failed unit tests mean a failed build they’ll be sure to run the tests before check-in so that they don’t fail. It will become a natural part of your process, and the quality of your software will be dramatically improved for it.

  • Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1471()

  • Andreas

    If possible, run tests in random order. This should help guard against tests sharing state.

  • http://www.danestuckel.com/ Dane Stuckel

    Until you get that randomly failing test. I’ve implemented this, and it turned into hell.

    This was right up there with the test that was failing because they were using the actual time for timestamps for some code under test. The test failed about 1/30 of the time, randomly.

    Instead of randomizing the tests, you’re better off with a standard that prevents any reuse of variable values between tests. That is, every variable that is touched in any set of unit tests is reset in either the teardown or setup. Always always always, even if your developers swear they’re being careful and that a certain variable won’t matter. The trick is that when someone ELSE writes a unit test, they won’t be as careful as that first programmer, and then you’re now in hell.

  • http://www.daedtech.com/blog Erik Dietrich

    Personally, I’d probably opt for a static analysis tool to guard against shared state. I mean, it’s possible to run the tests in a random order and still have them uphold some shared state dependency, but they can’t hide from static analysis :)

    If it works for you, I don’t have an issue with it, per se, but I prefer to just let the test runner do its default thing and guard against bad practices at code writing time.

  • Andreas

    I don’t get it. The randomly failing test is the entire point; it shows you that there is (probably) shared state somewhere.

  • Andreas

    That is of course the best solution, short of writing everything in a purely functional language. I guess I was speaking from my own experiences using primitive C++ and primitive tools.

  • http://www.danestuckel.com/ Dane Stuckel

    I suppose it depends on the scale of the project. If you have only a few people contributing, and you’re not relying on the unit tests as a commit hook (and a bastion of sanity), then it’s okay to use unit tests as a kind of smell check — that would be if the unit tests failed once out of every thirty times and you’re running it repeatedly. I have it set up that coverage must be at a certain level and that the tests have to pass to be able to get into the repository. Developers make too much money to block other projects just because someone committed broken code. : )

    The developers that I work with tend to run them only a few times as they’re either writing their tests or right before they push their commits. If the tests work often enough, they may not realize they’ve broken something or that they’re relying on shared state, or that the order of the tests matter. This is especially true if they’re writing all new tests for brand new code, because the unit tests will usually be closely related.