DaedTech

Stories about Software


Introduction to Unit Testing Part 6: Test Doubles

In the last two posts in this series, I talked about how to test new code in your code base and then how to bring your legacy code under test. Toward the end of the last chapter in this series, I talked a bit about the concept of test doubles. The example I showed was one in which I used polymorphism to create a “dummy” class that I used in a test to circumvent otherwise untestable code. Here, I’ll dive into a lot more detail on the subject, starting out with a much simpler example than that and building to a more sophisticated way to handle the management of your test doubles.

First, a Bit of Theory

Before we get into test doubles, however, let’s stop and talk about what we’re actually doing, including theory about unit tests. So far, I’ve shown a lot of examples of unit tests and talked about what they look like and how they work (for instance, here in post two where I talk about Arrange, Act, Assert). But what I haven’t addressed, specifically, is how the test code should interact with the production code. So let’s talk about that a bit now.

By far the most common case when unit testing is that you instantiate a class under test in the “arrange” part of your unit test, and then you do whatever additional setup is necessary before calling some method on that class. Then you assert something that should have happened as a result of that method call. Let’s return to the example of prime finder from earlier and look at a simple test:
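
A minimal sketch of such a test, assuming a PrimeFinder class exposing a boolean IsPrime(int) method and MSTest syntax, might look like this:

    [TestMethod]
    public void Returns_False_For_One()
    {
        // Arrange: instantiate the class under test
        var primeFinder = new PrimeFinder();

        // Act: feed the method an input and record its output
        bool result = primeFinder.IsPrime(1);

        // Assert: 1 is not prime, so we expect false
        Assert.IsFalse(result);
    }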

This should be reviewed from the perspective of “arrange, act, assert,” but let’s look specifically at the “act” line. Here is the real crux of the test; we’re writing tests about the IsPrime method and this is where the action happens. In this line of code, we give the method an input and record its output, so it’s the perfect microcosm for what I’m going to discuss about a class under test: its interactions with other objects. You see, unit testing isn’t about executing your code — you can do that with integration tests, console apps, or even just by running the application. Unit testing, at its core, is about isolating your classes and running experiments on them, as if you were a scientist in a lab. And this means controlling all of the inputs to your class — stimulus, if you will — so that you can observe what it puts out.

Controlling the inputs to the PrimeFinder class is simple, because I’m telling you that there are no invocations of global/static state (a point that will become an important theme as we proceed). You can see by looking at the unit test that the only input to the class under test (CUT) is the integer 1. This means that the only input/stimulus that we supply to the class is a simple integer, making it quite easy to make assertions about its behavior. Generally speaking, the simpler the inputs to a class, the easier that class is to test.

There are Inputs and There are Inputs

Omitting certain edge cases I can think of (and probably some that I’m not thinking of), let’s consider a handful of relatively straightforward ways that a class might get ahold of input information. There is what I did above — passing it into a method. Another common way to give information to a class is to use constructor parameters or setter methods/properties. I’ll refer to these as “passive collaboration” from the perspective of the CUT, since it’s simply being given the things that it needs. There is also what I’ll call “semi-passive collaboration,” which is when you pass a dependency to the CUT and the CUT interacts in great detail with that dependency, mutating its state and querying it. An example of this would be “Car theCar = new Car(new Engine())”, in which performing operations on Car related to starting and driving result in rather elaborate modifications to the state of Engine. It’s still passive in the sense that you’re handing the Engine class to the car, but it’s not as passive as simply handing it an integer. In general, passive input is input that the scope instantiating the CUT controls — constructor parameters, method parameters, setters, and even things returned from methods of objects passed to the CUT (such as the Car class calling _engine.GetTemperature() in the example in this paragraph).

In contrast, there is also “active collaboration,” which is when the CUT takes responsibility for getting its own input. This is input that you cannot control when instantiating the class. An example of this is a call to some singleton or public static method in the CUT. The only way that you can reassume control is by not calling the method in which it occurs. If static/singleton calls occur in the constructor, you simply cannot test or even instantiate this class without it doing whatever the static code entails. If it retrieves values from static state, you have no control over those values (short of mocking up the application’s global state).

A second form of active collaboration is the “new” operator. This is very similar to static state in that when you create the CUT, you have no control over this kind of input to the CUT. Imagine if Car new-ed up its own Engine and queried it for temperature. There would be absolutely no way that you could have any effect on this operation in the Car class short of not instantiating it. Like static calls, object instantiation renders your CUTs a non-negotiable, “take it or leave it” proposition. You can have them with all of their instantiated objects and global state or you can write your own, buddy.
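
To see why, picture a Car that looks something like this (an illustrative sketch, not a listing from earlier in the series):

    public class Car
    {
        private readonly Engine _engine;

        public Car()
        {
            // Active collaboration: Car picks its own Engine.
            // A test has no way to substitute anything else here.
            _engine = new Engine();
        }

        public int EngineTemperature
        {
            get { return _engine.TemperatureInFahrenheit; }
        }
    }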

Not all inputs to a class are created equal. There are a CUT’s passive inputs, in which the CUT cedes control to you. And then there are the CUT’s active inputs that it controls and on which it does not allow you to interpose in any way. As it turns out, it is substantially easier to test CUTs with exclusively passive collaboration/input and difficult or even impossible to test CUTs with active collaboration. This is simply because you cannot isolate actively collaborating CUTs.

Literals: Too Simple to Need Test Doubles

There’s still a little bit of work to do before we discuss test doubles in earnest. First, we have to talk about inputs that are too simple to require stand-ins: literals. The PrimeFinder test above is the perfect example of this. It’s performing a mathematical operation using an integer input, so what we’re interested in testing is known input-output pairs in a functional sense. As such, we just need to know what to pass in, to pass that value in, and then to assert that we get the expected return value.

In a strict sense, we could refer to this as a form of test double. After all, we’re doing a non-production exercise with the API, so the value we’re passing in is fake, in a sense. But that’s a little formal for my taste. It’s easier just to think in terms of literals almost always being too simple to require any sort of substitution of behavior.

An interesting exception to this is the null literal (of null type) or the default value of a non-nullable type. In many cases, you may actually want to be testing this as an input since null and 0 tend to be particularly interesting inputs and the source of corner cases. However, in some cases, you may be supplying what is considered the simplest form of test double: the dummy value. A dummy value is something you pass into a function to say, “I don’t care what this is and I’m just passing in something to make the compiler happy.” An example of where you might do this is passing null to a constructor of an object instance when you just want to make assertions as to what some of its property values initialize to.
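
For example, something like this, where CustomerOrder and OrderStatus are hypothetical names used purely for illustration:

    [TestMethod]
    public void Status_Initializes_To_Pending()
    {
        // The constructor argument is a dummy: we pass null just to satisfy
        // the compiler, since this test never exercises that dependency
        var order = new CustomerOrder(null);

        Assert.AreEqual(OrderStatus.Pending, order.Status);
    }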

Simple/Value Objects and Passing in Friendlies

Next up for consideration is the concept of a “test stub,” or what I’ll refer to in the general sense as a “friendly.”

Friendlies

Take a look at this code:
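
Something roughly like this, with the details approximated:

    public class Engine
    {
        // A simple, settable value: no I/O, no hidden state
        public int TemperatureInFahrenheit { get; set; }
    }

    public class Car
    {
        public int EngineTemperature { get; private set; }

        public Car(Engine engine)
        {
            // Car's only "input" here is whatever the Engine reports
            EngineTemperature = engine.TemperatureInFahrenheit;
        }
    }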

Here is an incredibly simple implementation of the Car-Engine pair I described earlier. Car is passed an Engine and it queries that Engine for a local value that it exposes. Let’s say that I now want to test that behavior. I want to test that Car’s EngineTemperature property is equal to the Engine’s temperature in Fahrenheit. What do you think is a good test to write? Something like this, maybe —
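
Approximately this, reusing the sketch above (the test name and the value 200 are illustrative):

    [TestMethod]
    public void EngineTemperature_Matches_What_Engine_Reports()
    {
        // Arrange: a "friendly" Engine whose state we fully control
        var engine = new Engine() { TemperatureInFahrenheit = 200 };

        // Act: hand the friendly to the class under test
        var car = new Car(engine);

        // Assert: Car should simply report what Engine told it
        Assert.AreEqual(200, car.EngineTemperature);
    }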

Here, we’re setting up the Engine instance in such a way that we control what it provides to Car when Car uses it. We know by inspecting the code for Car that Car is going to ask Engine for its TemperatureInFahrenheit value, so we set that value to a known commodity, allowing us to compare in the Assert. To put it another way, we’re supplying input indirectly to Car by setting up Engine and telling Engine what to give to Car. It’s important to note that this is only possible because Car accepts Engine as an argument. If Car instantiated Engine in its constructor, it would not be possible to isolate Car because any test of Car’s initial value would necessarily also be a test of Engine, making the test an integration test rather than a unit test.

Creating Bonafide Mocks

That’s all well and good, but what if the Engine class were more complicated or just written differently? What if the way to get the temperature was to call a method and that method went and talked to a file or a database or something? Think of how badly the testing for this is going to go:
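
Imagine an Engine along these lines (a sketch; the hard-coded path echoes the one mentioned below):

    public class Engine
    {
        public int TemperatureInFahrenheit
        {
            get
            {
                // File I/O buried in a property getter: poison for a unit test
                using (var reader = new System.IO.StreamReader(@"C:\whatever.txt"))
                {
                    return int.Parse(reader.ReadLine());
                }
            }
        }
    }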

Now, when we instantiate a Car and query its engine temperature property, suddenly file contents are being read into memory and, as I’ve already covered in this series, File I/O is a definite no-no in a unit test. So I suppose we’re hosed. As soon as Car tries to read Engine’s temperature, we’re going to explode — or we’re going to succeed, which is even worse because now you’ll have a unit test suite that depends on the machine it’s running on having the file C:\whatever.txt on it and containing an integer as its first line.

But what if we got creative the way we did at the end of the last episode of this series? Let’s make the TemperatureInFahrenheit property virtual and then declare the following class:
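
A sketch of what that class might look like (the name FakeEngine is suggested by the discussion of mocking below):

    public class FakeEngine : Engine
    {
        private readonly int _temperature;

        public FakeEngine(int temperature)
        {
            _temperature = temperature;
        }

        // No file I/O here: just hand back whatever the test asked for
        // (this assumes Engine.TemperatureInFahrenheit is now virtual)
        public override int TemperatureInFahrenheit
        {
            get { return _temperature; }
        }
    }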

This class is test-friendly because it doesn’t contain any file I/O at all and it inherits from Engine, overriding the offending methods. Now we can write the following unit test:
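
Something in this vein, continuing the sketches above:

    [TestMethod]
    public void EngineTemperature_Matches_What_Fake_Engine_Reports()
    {
        // Arrange: the fake stands in for the real, file-reading Engine
        var engine = new FakeEngine(200);

        // Act
        var car = new Car(engine);

        // Assert: Car never knows (or cares) where the 200 came from
        Assert.AreEqual(200, car.EngineTemperature);
    }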

If this seems a little weird to you, remember that our goal here is to test the Car class and not the engine class. All that the Car class knows about Engine is that it wants its TemperatureInFahrenheit property. It doesn’t (and shouldn’t) care how or where this comes from internally to Engine — file I/O, constructor parameter, secret ink, whatever. And when testing the Car class, you certainly don’t care. Another way to think of this is that you’re saying, “assuming that Engine tells Car that the engine temperature is 200, we want to assert that Car’s EngineTemperature property is 200.” In this fashion, we have isolated the Car class and are testing only its functionality.

This kind of test double and testing technique is known as a Fake. We’re creating a fake engine to stand-in for the real one. It’s not simple enough to be a dummy or a stub, since it’s a real, bona-fide different class instead of a doctored version of an existing one. I realize that the terminology for the different kinds of test doubles can be a little confusing, so here’s a helpful taxonomy of them.

Mocking Frameworks

The last step in the world of test doubles is to get to actual mock objects. If you stop and ponder the fake approach from the last section a bit, a problem might occur to you. The problem has to do with long-term maintenance of code. I remember, many moons ago when I discovered the power of polymorphism for creating fake objects, that I thought it was the greatest thing under the sun. Obviously there was at least one fake per test class with a dependency, and sometimes there were multiple dependencies. And I didn’t always stop there — I might define three or four different variants of the fake, each having a method that behaved differently for the test in question. In one fake, TemperatureInFahrenheit would return a passed-in value, but in another, it would throw an exception. Oh, there were so many fakes — I was swimming in fakes for classes and fakes for interfaces.

And they were awesome… until I added a method to the interface they implemented or changed behavior in the class they inherited. And then, oh, the pain. I would have to go and change dozens of classes. And then there was also the fact that all of this faking took up a whole lot of space. My test classes were littered with nested classes of fakes. It was fun at first, but the maintenance became a drudgery. But don’t worry, because my gift to you is to spare you that pain.

What if I told you that you could implement interfaces and inherit from classes anonymously, without actually creating source code that did this? I’d be oversimplifying a bit, but once you got past that, you’d probably be pretty excited. I say this because, as you start to grasp the concept of mocking frameworks, this kind of “dynamic interface implementation/inheritance” is the easiest way to reason about what it’s doing, from a practical perspective, without getting bogged down in more complicated concepts like reflection and direct work with byte-code and other bits of black magic.

As an example of this in action, take a look at how I go about testing the Car and Engine with the difficult dependency. The first thing that I do is delete the Fake class because there’s no need for it. The next thing I do is write a unit test, using a framework called JustMock by Telerik (this is currently my preferred mocking framework for C#).
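
Sketched roughly, the test looks something like this (it assumes using directives for Telerik.JustMock and Telerik.JustMock.Helpers, which supplies the Arrange extension method, and it still relies on TemperatureInFahrenheit being virtual):

    [TestMethod]
    public void EngineTemperature_Matches_What_Mock_Engine_Reports()
    {
        // Arrange: a dynamically generated stand-in for Engine; no FakeEngine class needed
        var engine = Mock.Create<Engine>();
        engine.Arrange(e => e.TemperatureInFahrenheit).Returns(200);

        // Act
        var car = new Car(engine);

        // Assert
        Assert.AreEqual(200, car.EngineTemperature);
    }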

Notice that instead of instantiating an engine, I now invoke a static method on a class called Mock that takes care of creating my dynamic inheritor for me. Mock.Create() is what creates the equivalent of FakeEngine. On the next line, I invoke an (extension) method called Arrange that creates an implementation of the property for me as well. What I’m saying, in plain English, is “take this mock engine and arrange it such that the TemperatureInFahrenheit property returns 200.” I’ve done all of this in one line of code instead of adding an entire nested class. And, best of all, I don’t need to change this mock if I decide to change some behavior in the base class or add a new method.

Truly, once you get used to the concept of mocking, you’ll never go back. It will become your best friend for the purposes of mocking out dependencies of any real complexity. But temper your enthusiasm just a bit. It isn’t a good idea to use mocking frameworks for simple dependencies like the PrimeFinder example. The lite version of JustMock that I’ve used and many others won’t even allow it, and even if they did, that’s way too much ceremony — just pass in real objects and literals, if you can reasonably.

The idea of injecting dependencies into classes (what I’ve called “passive” and “semi-passive” collaboration) is critical to mocking and unit testing. All basic mocking frameworks operate on the premise that you’re using this style of collaboration and that your classes are candidates for polymorphism (either interfaces or overridable classes). You can’t mock things like primitives and you can’t mock sealed/final classes.

There are products out there called isolation frameworks that will grant you the ability to mock pretty much everything — primitives, sealed/final classes, statics/singletons, and even the new operator. These are powerful (and often long-running, resource-intensive) tools that have their place, but that place is, in my opinion, at the edges of your code base. You can use this to mock File.Open() or new SqlConnection() or some GUI component to get the code at the edge of your application under test.

But using it to test your own application logic is a path that’s fraught with danger. It’s sort of like fixing a broken leg with morphine. Passively collaborating CUTs have seams in them that allow easy configuration of behavior changes and a clear delineation of responsibilities. Actively collaborating CUTs lack these things and are thus much more brittle and difficult to separate and modify. The fact that you can come up with a scheme allowing you to test the latter doesn’t eliminate these problems — it just potentially masks them. I will say that isolating your coupled, actively collaborating code and testing it is better than not testing it, but neither one is nearly as good as factoring toward passive collaboration.


Seeing the Value in Absolutes

The other day, I told a developer on my team that I wouldn’t write methods with more than three parameters. I said this in a context where many people would say, “don’t write code with more than three parameters in a method,” in that I am the project architect and coding decisions are mine to make. However, I feel that the way you phrase things has a powerful impact on people, and I believe code reviews that feature orders to change items in the code are creativity-killing and soul-sucking. So, as I’ve explained to people on any number of occasions, my feedback consists neither of statements like “that’s wrong” nor statements like “take that out.” I specifically and always say, “that’s not what I would do.” I’ve found that people listen to this the overwhelming majority of the time and, when they don’t, they often have a good reason I hadn’t considered. No barking of orders necessary.

But back to what I said a few days ago. I basically stated the opinion that methods should never have more than three parameters. And right after I had stated this, I was reminded of the way I’ve seen countless conversations go in person, on help sites like Stack Overflow, and in blog comments. Does this look familiar?

John: You should never have more than three parameters in a method call.
Jane: Blanket statements like that tend to be problematic. Three method parameters is really, technically, more of a “code smell” than necessarily a problem. It’s often a problem, but it might not be.
John: I think it’s necessarily a problem. I can’t think of a situation where that’s desirable.
Jane: How about when someone is holding a gun to your head and telling you to write a method that takes four parameters?
John: (Rolls his eyes)
Jane: Look, there’s probably a better example. All I’m saying is you should never use absolutes, because you never know.
John: “You should never use absolutes” is totally an absolute! You’re a hypocrite!
Both: (Devolves into pointless bickering)

A lot of times during debates, particularly when you have smart and/or exacting participants, the conversation is derailed by a sort of “gotcha” game of one-upsmanship. It’s as though they are at an impasse as to the crux of the matter, so the two begin sniping at one another about tangentially-related or totally non-related minutiae until someone makes a slip, and this somehow settles the debate. Of course, it’s an exercise in futility because both sides think their opponent is the first to slip up. Jane thinks she’s won this argument because John used an absolute qualifier and she pointed out some (incredibly preposterous and contrived) counter-example, and John thinks he won with his ad hominem right before the end about Jane’s hypocrisy.

In this debate, they both lose, in my opinion. I agree with John’s premise but not his justification, and the difference matters. And Jane’s semantic nitpicking doesn’t get us to the right justification (counter or pro), either. Prescriptive matters of canon when it comes to programming are troubling for the same reason that absolutes are often troubling in our day-to-day lives. Even the most clear-cut seeming things, like “it’s morally reprehensible to kill people,” wind up having many loopholes in their application (“it’s morally reprehensible to kill people — unless, of course, it’s war, self-defense, certain kinds of revenge for really bad things, accidental, state-sanctioned execution, etc., etc.”). So for non-important stuff like the number of parameters to a method, aren’t we kind of hosed and thus stuck in a relativistic quagmire?

I’d argue not, and furthermore, I’d argue that the fact of the rules is more important than the rules themselves. It’s more important to have a restriction like “don’t have more than three parameters to a method” than it is to have that specific restriction. If it were “don’t have more than two method parameters” or “don’t have more than four method parameters,” we’d still be sitting pretty. Why, you ask? Well, a man named Barry Schwartz wrote a book called “The Paradox of Choice: Why More Is Less.” Restrictions limit choice, which is merciful.

Developers are smart, and they want to solve problems — often hard problems. But, really, they want to solve directed problems efficiently. To understand what I mean, ask yourself which of these propositions is more appealing to you: (1) make a website that does anything in any programming language with any framework or (2) use F# to parse a large text file and have the running process use no more than 1 gig of memory. The first proposition makes your head hurt while the second gets your mental juices flowing as you decide whether to try to solve the problem algorithmically or to cheat and write interim results to disk.

Well, the same thing happens with a lot of the “best practice” rules that surround us in software development. Don’t make your classes too big. Don’t make your methods too big. Don’t have too many parameters. Don’t repeat your code. While they can seem like (and be, if you don’t understand the purpose behind them) cargo-cult mandates if you simply focus on the matter of relativism vs absolutes, they’re really about removing (generally bad) options so that you can be creative within the context remaining, as well as productive and happy. Developers who practice DRY and who write small classes with small methods and small method signatures don’t have to spend time thinking “how many parameters should this method have” or “is this class getting too long?” Maybe this sounds restrictive or draconian to you, but think of how many options have been removed from you by others: “does the code have to compile,” or “is the code allowed to wipe out our production data?” If you’re writing code for any sort of business purposes, the number of things you can’t do dwarfs the number of things you can.

Of course, just having rules for the sake of rules is the epitome of dumb cargo cult activity. The restrictions have to be ones that contribute overall to a better code base. And while there may be some debate about this, I doubt that anyone would really argue with statements like “favor small methods over large ones” and “favor simple signatures over complex ones.” Architects (or self-organizing teams) need to identify general goals like these and turn them into liberating restrictions that remove paralysis by analysis while keeping the code base clean. I’ve been of the opinion for a while now that one of the core goals of an architect should be providing a framework that prevents ‘wrong’ decisions so that the developers can focus on being creative and solving problems rather than avoiding pitfalls. I often see this described as “making sure people fall into the pit of success.”


Going back to the “maximum of three parameters rule,” it’s important to realize that the question isn’t “is that right 99% of the time or 100% of the time?” While Jane and John argue over that one percent, developers on their team are establishing patterns and designs predicated upon methods with 20 parameters. Who cares if there’s some API somewhere that really, truly, honestly makes it better to use four parameters in that one specific case? I mean, great — you proved that on a long enough timeline, weird aberrations happen. But you’re missing out on the productivity-harnessing power of imposing good restrictions. The developers in the group might agree, or they might be skeptical. But if they care enough to be skeptical, it probably means that they care about their craft and enjoy a challenge. So when you present it to them as a challenge (in the same way speeding up runtime or reducing memory footprint is a challenge), they’ll probably warm to it.


Designs Don’t Emerge

I read a blog post recently from Gene Hughson that made me feel a little like ranting. It wasn’t anything he said — I really like his post. It reminded me of some discussion that had followed in my post about trying too hard to please with your code. Gene had taken a nuanced stand against the canned wisdom of “YAGNI.” I vowed in the comments to make a post about YAGNI as an aphorism, and that’s still in the works, but here is something tangentially related. Now, as then, I agree with Gene that you ignore situational nuance at your peril.

But let’s talk some seriously divisive politics and philosophy first. I’m talking about the idea of creationism/intelligent design versus evolutionary theory and natural selection. The former conceives of life in our world as the deliberate work of an intelligent being. The latter conceives of it as an ongoing process of change governed by chance and coincidence. In the context of this debate, there is either some intelligent force guiding things or there isn’t, and the debate is often framed as one of omnipotent, centralized planning versus incremental, steady improvement via dumb process and chance. The reason I bring this up isn’t to weigh in on this or turn the blog into a political soapbox. Rather, I want to point out a dichotomy that’s ingrained in our collective conversation in the USA and perhaps beyond that (though I think that the creationist side of the debate is almost exclusively an American one). There is either some kind of central master planner, or there are simply the vagaries of chance.

I think this idea works its way into a lot of discussions that talk about “emergent design” and “big up front design,” which in the same way puts forth a pretty serious false dichotomy. This is most likely due, in no small part, to the key words “design,” “emergent” and especially “evolution” — words that frame the coding discussion. It turns into a blueprint for silly strawman arguments: “Big design” proponents scoff and say things like, “oh yeah, your architecture will just figure itself out magically” while misguided practitioners of agile methodologies (perhaps “no design” proponents) accuse their opponents of living in a coding universe lacking free will — one in which every decision, however small, must be pre-made.

But I think the word “emergent,” rather than “evolution” or “design,” is the most insidious in terms of skewing the discussion. It’s insidious because detractors are going to think that agile shops view design as something that just kind of winks into existence like some kind of friendly guardian angel, and that’s the wrong idea about agile development. But it’s also insidious because of how its practitioners view it: “Don’t worry, a good design will emerge from this work-in-progress at some point because we’re SOLID and DRY and we practice YAGNI.”

Now, I’m not going for a “both extremes are wrong and the middle is the way to go” kind of argument (absent any other reasoning, this is called the middle ground fallacy). The word “emergent” itself is all wrong. Good design doesn’t ‘emerge’ like a welcome ray of sunshine on a cloudy day. It comes coughing, sputtering, screaming and grunting from the mud, like a drowning man being pulled from quicksand, and the effort of dragging it laboriously out leaves you exhausted.


The big-design-up-front (BDUF) types are wrong because of the essential fallacy that all contingencies can be accounted for. It works out alright for God in the evolution-creation debate context because of the whole omniscient thing. But, unfortunately, it turns out that omniscience and divinity are not core competencies for most software architects. The no-design-up-front (NDUF) people get it wrong because they fail to understand how messy and laborious an activity design really is. In a way, they both get it wrong for the same basic reason. To continue with the Judeo-Christian theme of this post, both of these types fail to understand that software projects are born with original sin.

They don’t start out beautifully and fall from grace, as the BDUF folks would have you believe, and they don’t start out beautifully and just continue that way, emerging as needed, as the NDUF folks would have you believe. They start out badly (after all, “non-functional” and “non-existent” aren’t words which describe great software) and have to be wrangled to acceptability through careful, intelligent and practiced maintenance. Good design is hard. But continuously knowing the next, feasible, incremental step toward a better design at absolutely any point in a piece of software’s life — that’s really hard. That takes deliberate practice, debate, foresight, adaptability, diligence, and a lot of reading and research. It doesn’t just kinda ‘emerge.’

If you’re waiting on me to come to a conclusion where I give you a score from one through ten on the NDUF to BDUF scale (and it’s obviously five, right?), you’re going to be disappointed with this post. How much design should you do up front? Dude, I have no idea. Are you building a lunar rover? Probably a lot, then, because the Sea of Tranquility is a pretty unresponsive product owner. Are you cobbling together a minimum viable product and your hardware and business requirements may pivot at any moment? Well, probably not much. I can’t settle your design decisions and timing for you with acronyms or aphorisms. But what I can tell you is that to be a successful architect, you need to have a powerful grasp on how to take any design and make it slightly better this week, slightly better than that next week, and so on, ad infinitum. You have to do all of that while not catastrophically breaking things, keeping developers productive, and keeping stakeholders happy. And you don’t do that “up-front” or “ex post facto” — you do it always.


Understanding Degrees of Code Flexibility

In some projects I’ve been managing of late, I’ve noticed a recurring question cropping up: How flexible should we make the different parts of the system? I’m currently working with a bright crew of people, so they’re picking up on this quickly, but I thought I’d do a bit of a write-up to help the process along. And, as long as I’m doing write-ups like this, I might as well post them.

In discussions of software, there is a large issue that gets lost in the shuffle. You frequently hear people argue the merits of different styles of or approaches to programming. Unit testing or not? TDD? IoC or inline new? What’s the appropriate size of a method? ORM or inline SQL? SQL at all or NoSQL? You get the idea. But one thing that I find is often glossed over is the idea of thinking about system changes in terms of how easy or hard those changes are to make. In other words, if a user comes a-hollering and says, “I want, nay, demand the ability to do X,” how hard is it to make that happen and to verify the results?

And by “hard,” I don’t mean, “do you write code for a day or for three weeks?” I mean, “what do the changes look like in terms of risk and deliverables?” In other words, can you make that happen by changing a configuration file or does it require code changes? Will you need to re-deploy or can you somehow patch on the fly through a plugin architecture? And is it testable? Can you verify 99% of the changes by swapping out a configuration setting, or do you have radically different production and test setups?

So I’m going to define some concepts to flesh out an idea. This isn’t exactly a formalized theory or anything. It’s rather just a working lexicon of how I think about my application. This is a scale of system flexibility for a given future change. Or, put another way, here is a way of assessing how much effort it will take the entire development/operations group to do X for the aforementioned user, from least to most significant.

  1. Users can do it themselves.
  2. An IT-level change is required (e.g., changing a config file, swapping out images, etc.).
  3. An architect/dev change to configuration is required (e.g., XML for an IoC container).
  4. A non-compiled source code change is required (e.g., you update the markup for a site but not the underlying code).
  5. An Open/Closed Principle-compliant source code change is required (basically, adding new code).
  6. A localized tweak to existing code is required.
  7. A substantial change to existing code is required, spanning various modules.

Now when considering this list, it makes sense to assess your change-set in terms of the furthest-down thing it requires. So maybe you need to change a logo on your website, which is easy, but there is an unwieldy switch statement somewhere that swaps it out in certain circumstances, and that statement now needs to change too. This is probably going to be a 6 rather than a 2. A given change is going to be as rigid as the most rigid link in the chain, so to speak.

Here are the kinds of changes described in more detail.

Users do it themselves.

There are some sorts of changes to the system that need not involve anyone from your team/staff/company. These are things that users do, through the application. An obvious example is a banking or commerce website in which users can change their passwords. “Password” has nothing to do with the business logic of commerce, so this is functionally a meta-piece of administrivia that you’re entrusting to users.

A dev-ops or operations person changes meta-data in production.

This is something that you (hopefully) don’t trust a user to do but that doesn’t require any actual knowledge of the code base. A good example of this might be a desktop application that has an XML configuration file (or an INI file, if you’re willing to show your age with me). This file might be modified to have the application point to a different database or log to a different file or something. This is not something the average user could or should do, but it’s a relatively lightweight change in that it requires only minimal training and no re-deployment of any kind.

An architect or developer changes meta-data in production.

The next step down in operational rigidity is a meta-data change that cannot be performed without an understanding of the code base. The best example here is the configuration of an IoC Container that has been extracted to XML. Figuring out which service is used by which ViewModel is not something that anyone without sophisticated knowledge of your source code can do, but, on the plus side, it’s still just a change to a setting in production.

Someone changes source code that does not involve re-compilation.

This is what happens when a developer logs on to the web server and starts editing HTML or CSS or even a server-side script like PHP in the files. This is really not a good idea for a variety of reasons, but it is possible and may be something you have to do in a pinch, so it’s worth noting.

Someone makes a code change that more or less just involves adding code and very little modification.

Now we’re down deep enough into rigid territory that a new deployment/install is required in order to push the changes. From here on, this cannot be done in production, so if you’re doing this you’re going to incur all of the overhead of building/running automated tests (hopefully), quality assurance, creating a deployment and deploying (your process may vary). But on the plus side, this is pretty low risk as far as code changes go. Adding things is generally both easy to verify for correctness of functionality and unlikely to mess up existing code.

Someone makes a code change that involves lightweight changes to existing code.

The most common scenario here is probably a bug fix, though it may be new functionality too, depending on how flexible your architecture is. This is a higher risk proposition than adding new source code because you’re now creating a risk of regressions. You still have all of the same considerations about build and deployment, but the risk of problems is higher. Your testing and verification overhead should also be higher. This is a heavier change.

Significant work on the code base is required.

This is what happens when a code base that models a company and implements Office as a singleton suddenly needs to accommodate the new office you opened up in Texas. You designed your code under the assumption that there could only ever be one office location, and you were right about that — right up until you weren’t. Oops. Now things get ugly because management comes to you and says, “we’re opening a new office, so the application is going to need to handle that” and your response is, “that’s completely out of the question. Why, even the thought is preposterous!” You tell them that substantial rework is going to be required.

Making Sense of your Options

So why did I list all of these out? Well, I did it because I feel it’s important to know what your options are when you’re designing and that it’s important to anticipate, rather than react knee-jerk style, to changes. Before you start putting together a code base, sit down, think about what users might want, and then go through the exercise of figuring out which number on the scale each change would require. If you find likely changes that would be 6s or 7s (and there is certainly a sliding scale at the 6-7 level), that’s a problem that you should start addressing now. If you find extremely unlikely changes that are 1s and 2s, that’s not necessarily a problem. But it is a point of design flexibility that you could probably do without, and it may be that you have pointless abstractions and complexity (though I’d be a lot more hesitant to introduce rigidity because you think flexibility is unneeded than vice-versa).

Another interesting exercise is to consider categories of these things. For instance, 5-7 are all things that require compiled code changes and 1-4 are all things that do not. This is an interesting way to split up your functionality, and it’s obviously the backbone of this post, but you can divvy these up in other ways as well. For instance, if you’re writing software that for some reason has no field or ops support, then 1, 5, 6 and 7 are your options, and 2-4 are basically non-starters. Or, if you’re considering things in which source control is an issue, then 4-7 are in a category and 1-3 are in a different category (most likely, as I’d think that you’d favor generating meta-data files as part of your deployment rather than source controlling different configurations).

None of this is even remotely comprehensive, but my goal here is really just to encourage people to understand at design time the difficulty of changing something at production time. It seems quite often to be the case that people don’t really think about this, simply because no one has ever pointed it out to them. Your mileage may vary on the number of categories in the list and your preference for certain options, but at the core of this is a basic and incredibly important idea: you should always play “what if” when it comes to changes that users might request and understand how much of a headache it will be for you if the “what if” comes true. Oh, and also try to minimize the number of headaches. But hopefully that goes without saying.


Throw Out Your Code

Weird as it is, here’s human nature at work. Let’s say that I have a cheeseburger and you’re hungry. I tell you that I’ll sell you the cheeseburger for $10. You say, “pff, no way — too expensive.” Oh well, I eat the cheeseburger and call it a day. But I’ve learned my lesson. The next day at lunch, to execute my master cheeseburger selling plan, I slide the cheeseburger over in front of you and tell you that you can have it: “you can have this cheeseburger…” Just as you’re about to take a bite, however, I cruelly say “…for ten dollars!” You grumble, get out your wallet and hand me a ten dollar bill.

This is called “The Endowment Effect,” and it’s a human cognitive bias that causes us to value what we have disproportionately. I blogged about it here previously in the context of why we think that our code is so good we should SPAM it all over the place with control-V. But even if you don’t do that (and, really, please don’t do that), you still probably get overly attached to your code. I do. After all, we, as humans, have a hard time defying our own natural instincts.

I’m certainly no anthropologist, but I suspect that our ancestry as nervous, opportunistic scavengers on the African Serengeti has everything to do with this. Going and snatching a morsel that a hungry lion is eyeing is a pretty bad idea. But if you already have the morsel, what the hey, you might as well take it with you as you run away. But, however we’re wired, we’re capable of learning and conditioning our own responses. After all, we don’t go bolting away from the deli counter after the guy there hands us our two pounds of salmon. We’ve learned that this is a consequence-free transaction.

It’s time to teach yourself that lesson as it relates to your code. It’s not so much that deleting functional code is consequence free (it isn’t). But deleting it isn’t nearly as big of a deal as you probably think it is. When it comes to code that you’ve spent two weeks writing, I’m pretty willing to bet that if you trashed it all and started from scratch (no peeking at source control history), you could rewrite it all in about two days. If that sounds crazy, ask yourself whether the majority of the time you spend programming is spent furiously typing as if you were taking a words-per-minute test or if most of it is spent drawing things on scratch-paper, squinting at your screen, pushing code around unit tests, muttering to yourself, and tapping a pen on your desk. I’m betting it’s the latter, and, when you rewrite, it’s activities from the latter that you don’t do nearly as much. You’ve already blazed a trail for yourself and now you’re just breezing through for a second trip.

Write some code and throw it out. Do a code kata with the stipulation that the code is deleted, never to be recovered. Then try it again the next day and the day after that. Or create a copy of your production code at work, engage in some massive, high-risk, high-wire-act refactoring, and then just delete it. With either of these things, I promise you that you’ll learn a lot about efficient coding and your code base, respectively. But you’ll also learn a subtle lesson: the value you’re creating as you code can be found more in the knowledge and experience you’re acquiring as you do it than the bits sitting in source control.

Practice throwing out your code so that you stop neurotically overvaluing it. Practice throwing out your code because it’ll probably happen by accident at some point anyway. Practice throwing out your code because your first crack at things usually kind of sucks. And practice throwing out your code because end users and the world are cruel, and not everything that you write is going to make it gift-wrapped into production. The more you learn to let go, the happier and more productive you’re going to be as a programmer.


Easy Deployment: the Alpha and the Omega

A bit of housekeeping…you may have noticed that the social media buttons look a bit different if you’re not accessing through RSS. The old plugin that I was using seems not to be supported anymore, and the Facebook button vanished for a bit. I tried out a replacement and liked it, so I kept it. My thanks to Active Bits for the Social Sharing Toolkit.

Wrong But Fast

There’s a pretty good chance that your deployment process is both too painful and not painful enough. But before I return to that cryptic statement, let me talk a bit about something I’ve observed in developers — especially ones that are newer to the industry. Here’s an example of a series of exchanges that has become pretty familiar to me:

User: It would be nice if the profile screen had a way I could change my password.

Young Buck: That’ll take like, literally, two seconds. I’ll be right back!

(Fifteen minutes later)

Young Buck: Okay, I pushed it out to the server, so you can change your password now.

User: Wow! It’s live already? That’s really cool! Thank you! Let me try it out. Let’s see… oh, hmmm. When I try to log in it looks like it crashes.

Young Buck: That doesn’t seem right. I mean, the only thing I changed was…. oh! I know exactly what happened. Give me three minutes, and I’ll be back.

(30 minutes later)

Young Buck: Alright, should be good.

User: Well, I can get in again, and there’s the change password button, but when I click, nothing happens.

Young Buck: That’s not possible. You must have forgotten to clear your cache. Unless…wait, I think I know what’s going on here. Give me 10 minutes.

User: Uh, tell ya what — just don’t worry about it. Maybe I’ll try it again next week.

Young Buck: (Crestfallen) But it’ll only take 10 minutes and I know exactly what the problem is.

User: (With pity) I’m sure you do, but I’ve got a lot of things to get done today.

This is not professional behavior. Imagine if you took your car in for repairs somewhere and things went this way. You’d probably have the Better Business Bureau on the phone shortly or at least be headed to another mechanic. The young buck is sloppy because he’s brash and arrogant. Right?

Or does it just come off that way a little because of how sure he seems when really he’s just eager to please the user and prove himself? Personally, I find that in the overwhelming majority of cases this is really what’s going on. People often get into programming because they like solving problems. And many programmers were some of the smartest kids in their classes growing up — the ones waving their hands frantically, demanding that the teacher give them a chance to show that they know the answer.


For entry-level programmers, the school game is all that they know. It’s a balance between answering quickly (teacher calling on students, timed exams, cramming in homework, etc.) and answering correctly, with speed often winning. In college, most programming assignments are evaluated by programs that allow you to submit your code as often as you want and get immediate feedback. There are also office hours, so students who visit professors and TAs the most frequently and submit the most work tend to get the best grades and the most feedback. Computer Science students are used to a paradigm where ideas are valued over execution.

Welcome to the Real World, Grasshopper

In the professional world, however, execution reigns supreme. Ideas are cheap. You may be able to rattle off the quick-sort algorithm in pseudo-code faster than anyone around you, but that’s not going to win you any startup capital. With software, even the intellectual property system (USPTO, anyway) is a joke seemingly designed to let Apple, Microsoft and Google endlessly sue each other and occasionally to swat little guys like bugs. Having the idea first and/or quickly is not as important as getting the idea right in the end.

It takes people some time to learn this upon entering the work force, and exchanges like the one I mention above are common. Users more or less say, “Go away and come back to me when you have something that makes my life better and not a second before. I don’t care if you thought of it in five seconds or if it took you five minutes or if it would have taken you five minutes except that the database was actually not normalized to BCNF and blah, blah, blah. Whatever.” Some people figure this out quickly, and some never figure it out. But whatever the speed may be and however much your group may or may not have come to terms with this, there needs to be structure in place to stop the madness.

In other words, a lot of the developers in your group are going to be eager to please. This is especially true if they regularly interact with their users. There is going to be pressure on them to say things like, “sure, I’ll have that for you in 10 minutes.” But they need not to say things like that. If they can say things like that and they can successfully (attempt to) make them happen, your deployment process is not painful enough.

Hurts so Good

Deployment is not to be taken lightly, especially if there is a release and the users are going to be seeing the result of the work. If you can deploy effortlessly in minutes, there’s a very good chance your process is not painful enough. The situation I’ve been describing above suffers from this very problem — it’s too easy for eager crowd-pleasers to deploy and thus it’s too easy for them to depend on users to be their fast feedback mechanism.

Developers are smart and often opportunistic. Getting fast feedback is extremely important to them, and they’ll naturally seek out ways to procure it. If you let them, they’ll use end users as their feedback mechanism (and, in a tone-deaf sense that ignores end-user perception, this is actually optimal), but you can’t permit this. Rather than following that path of least resistance, or at least familiarity from school/hobbyist days, you need to choke off that path and force them to carve a new riverbed. You need to make deployment more painful.

Now there’s “antiseptic on a cut” painful and “shark gnawing on your leg” painful. You want to gun for the former. A lot of deployment processes that enable developers to SPAM end users with non-functional updates are the product of amateur hour: xcopying files to the server, putting an executable on a share drive, zipping things up and emailing them, etc. These things tend to be both easy to do and easy to botch, so simply setting policies in place that prevent developers from doing them is antiseptic on the cut.

Deployments and especially releases to end users need to follow some process, and coding is simply one stop along the way — not the entirety of it. Ideally, there should be the coding and then developer testing, but from there, automated unit and integration tests, code reviews, static analysis, exploratory/manual testing by QA and observation by a UX group can all be part of the mix. These good practices serve to improve quality in and of themselves, but they also serve to prevent spurious, sloppy releases. If you know you can make a change in five minutes and have it in front of a user in six minutes, you’re going to do that. But if you know that you can make the change in five minutes and it’ll be days of going through the entire release process before you hear back from the end user, you’ll start finding other ways to get fast feedback (such as running unit tests, asking other developers to take a look and working closely with QA). By making deployment more painful you ensure that a lot more care goes into it.

But Hurts Too Much

If you’ve been reading along and grinding your teeth in objection to my premise that your deployment process needs to hurt more, I understand. Things shouldn’t be painful, and deliberately hurting yourself is a form of madness. If your deployment process hurts, it’s too painful, even though I just told you it’s not painful enough. But you have to prevent pain in the right way. Xcopy deployment is like being a boxer addicted to morphine — your process is horrible but pain free. Now imagine that you realize that being addicted to drugs is a problem, so you cut back and start feeling the pain each time a professional puncher unloads on you. In one sense, you’re feeling too much pain because it hurts to get punched in the face, but at the same time you’re not feeling enough pain because the amount you’re feeling isn’t causing you to consider another vocation.

The analogy here may not line up exactly, but the idea is similar. A lot of development and deployment processes are problematically painful in that they’re error-prone and difficult, but not painful enough in that they don’t prevent over-eager deploys and bad decisions. The solution? Get off the junk and stop letting yourself get pummeled by human wrecking balls. Or, in software terms, have a process that makes bad deployments hard and good ones very easy.

Now, getting to this release nirvana is not, itself, simple, but life is simple once you get there. The path to it generally involves a lot of automated testing and good planning. It involves a predictable release cadence, such as a sprint, and a commitment to following the process, not making exceptions nor cramming things in at the 11th hour or pushing back release dates. It involves continuous integration rather than periodic, nightmarish feature branch integrations. It involves a resistance to patching frantically when you make mistakes (you’ll learn a valuable lesson for next time). It involves a simple, fully-automated, easily repeatable build process. It involves a single click/button push deployment process. Summed up, it means that every time someone on your team checks in code, a series of automated tests and checks ensure that the code would be a good candidate for production or it is rejected from checkin until it would be. It means that your code could be shipped with a reasonably high degree of confidence on every checkin and that whether or not to actually ship is a business decision — not a technical one.

I encourage you to take a look at your build process and deployment process. Is it easy because Jim in accounting could do it? If so, it’s too easy and it’s definitely causing you problems. Is it hard enough that you do a lot of checking beforehand because you won’t want to do it again if things go wrong? That’s an improvement because it forces your hand for early testing and vetting, but it’s still a time-wasting problem. First, think about putting obstacles in place to guard against careless deployment, then think about refining those obstacles into process-helping practices, and finally, think about smoothing over the obstacles in the form of complete automation, leaving nothing but good, easy process.


Intro to Unit Testing 5: Invading Legacy Code in the Name of Testability

If, in the movie Braveheart, the Scots had been battling a nasty legacy code base instead of the English under Edward Longshanks, the conversation after the battle at Stirling between Wallace and minor Scottish noble MacClannough might have gone like this:

Wallace: We have prevented new bugs in the code base by adding new unit tests for all new code, but bugs will still happen.

MacClannough: What will you do?

Wallace: I will invade the legacy code, and defeat the bugs on their own ground.

MacClannough (snorts in disbelief): Invade? That’s impossible.

Wallace: Why? Why is that impossible? You’re so concerned with squabbling over the best process for handling endless defects that you’ve missed your God-given right to something better.


Goofy as the introduction to this chapter of the series may be, there’s a point here: while unit testing brand new classes that you add to the code base is a victory and brings benefit, to reap the real game-changing rewards you have to be a bit of a rabble-rouser. You can’t just leave that festering mass of legacy code as it is, or it will generate defects even without you touching it. Others may scoff or even outright oppose your efforts, but you’ve got to get that legacy code under test at some point or it will dominate your project and give you unending headaches.

So far in this series, I’ve covered the basics of unit testing, when to do it, and when it might be too daunting. Most recently, I talked about how to design new code to make it testable. This time, I’m going to talk about how to wrangle your existing mess to start making it testable.

Easy Does It

A quick word of caution here before going any further: don’t try to do too much all at once. Your first task after reading the rest of this post should be to select something small in your code base to try this on and, if you want to target production code, to get it approved by an architect or lead, if that’s required. Another option is just to create a playpen version of your codebase to throw away and thus earn yourself a bit more latitude, but either way, I’d advise small, manageable stabs before really bearing down. What specifically you try to do is up to you, but I think it’s worth proceeding slowly and steadily. I’m all about incremental improvement in things that I do.

Also, at the end of this post I’ll offer some further reading that I highly recommend. And, in fact, I recommend reading it before or as you get started working your legacy code toward testability. These books will be a great help and will delve much further into the subjects that I’ll cover here.

Test What You Can

Perhaps this goes without saying, but let’s just say it anyway to be thorough. There will be stuff in the legacy code base you can test. You’ll find the odd class with few dependencies or a method dangling off somewhere that, for a refreshing change, doesn’t reference some giant singleton. So your first task there is writing tests for that code.

But there’s a way to do this and a way not to do this. The way to do it is to write what’s known as characterization tests that simply document the behavior of the existing system. The way not to do this is to introduce ‘corrections’ and cleanup as you go. The linked post goes into more detail, but suffice it to say that modifying untested legacy code is like playing Jenga — you never really know ahead of time which brick removal is going to cause an avalanche of problems. That’s why legacy code is so hard to change and so unpleasant to work with. Adding tests is like adding little warnings that say, “dude, not that brick!!!” So while the tower may be faulty and leaning and of shoddy construction, it is standing and you don’t want to go changing things without putting your warning system in place.

So, long story short, don’t modify — just write tests. Even if a method tells you that it adds two integers and what it really does is divide one by the other, just write a passing test for it. Do not ‘fix’ it (that’ll come later when your tests help you understand the system and renaming the method is a more attractive option). Iterate through your code base and do it everywhere you can. If you can instantiate the class to get to the method you want to test and then write asserts about it (bearing in mind the testability problems I’ve covered like GUI, static state, threading, etc), do it. Move on to the next step once you’ve done the easy stuff everywhere. After all, this is easy practice and practice helps.
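As a sketch of what that might look like (class and method names here are hypothetical, and I’m using MSTest syntax), here’s a characterization test for a method whose name says “add” but whose behavior is division:

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical legacy class: the method's name lies, but we document what it actually does.
public class LegacyCalculator
{
    public int AddTwoNumbers(int first, int second)
    {
        return first / second;
    }
}

[TestClass]
public class LegacyCalculatorTests
{
    [TestMethod]
    public void AddTwoNumbers_Returns_3_When_Passed_6_And_2()
    {
        var calculator = new LegacyCalculator();

        // Characterization: assert on the existing (wrong-looking) behavior -- no 'fixing' yet.
        Assert.AreEqual(3, calculator.AddTwoNumbers(6, 2));
    }
}
```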

Go searching for extractable code

Now that you have a pretty good handle on writing testable code as you’re adding it to the code base and getting untested-but-testable code under test, it’s time to start chipping away at the rest. One of the easiest ways to do this is to hunt down methods in your code base that you can’t test, but not because of what’s actually in them. Here are two examples that come to mind:
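The original snippets aren’t reproduced here, so here’s a reconstruction of the two kinds of classes I have in mind, with hypothetical names and a stand-in singleton:

```csharp
// Hypothetical singleton standing in for whatever global state the real code touches.
public class GlobalSettings
{
    private static readonly GlobalSettings _instance = new GlobalSettings();
    public static GlobalSettings Instance { get { return _instance; } }

    public int SomeValue { get; set; }
    public void Initialize() { /* imagine side effects here */ }
    public void RattleSomeCages() { /* imagine more side effects here */ }
}

// Untestable because the constructor runs off and fiddles with global state.
public class Untestable1
{
    public Untestable1()
    {
        GlobalSettings.Instance.Initialize();
    }

    public int AddTwoNumbers(int first, int second)
    {
        return first + second;
    }
}

// Untestable because the interesting method isn't public.
public class Untestable2
{
    private int AddTwoNumbers(int first, int second)
    {
        return first + second;
    }
}
```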

The first class is untestable because you can’t instantiate it without kicking off global state modification and who knows what else. But the AddTwoNumbers method is eminently testable if you could remove that roadblock. In the second example, the AddTwoNumbers method is testable once again, in theory, but with a roadblock: it’s not public.

In both cases, we have a simple solution: move the method somewhere else. Let’s put it into a class called “BasicArithmeticPerformer” as shown below. I do realize that there are other solutions to make these methods testable, but we’ll talk about them later. And I’ll tell you what I consider to be a terrible solution to one of the testability issues that I’ll talk about now: making the private method public or rigging up your test runner with gimmicks to allow testing of private methods. You’re creating an observer effect with testing when you do this — altering the way the code would look so that you can test it. Don’t compromise your encapsulation design to make things testable. If you find yourself wanting to test what’s going on in private methods, that’s a strong, strong indicator that you’re trying to test the wrong thing or that you have a design flaw.
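A minimal sketch of that extracted class:

```csharp
public class BasicArithmeticPerformer
{
    public int AddTwoNumbers(int first, int second)
    {
        return first + second;
    }
}
```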

Now that’s a testable class. So what do the other classes now look like?
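Continuing the sketch, the original classes now just delegate to the new one:

```csharp
public class Untestable1
{
    public Untestable1()
    {
        GlobalSettings.Instance.Initialize(); // still icky, but no longer hiding testable logic
    }

    public int AddTwoNumbers(int first, int second)
    {
        return new BasicArithmeticPerformer().AddTwoNumbers(first, second);
    }
}

public class Untestable2
{
    private int AddTwoNumbers(int first, int second)
    {
        return new BasicArithmeticPerformer().AddTwoNumbers(first, second);
    }
}
```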

Yep, it’s that simple. In fact, it has to be that simple. Modifying this untestable legacy code is like walking a high-wire without a safety net, so you have to change as little as possible. Extracting a method to another class is very low risk as far as refactorings go, since the most likely problem that could possibly occur (particularly if you’re using an automated tool) is code that doesn’t compile. There’s always some risk, but getting legacy code under test is lower risk in the long run than allowing it to continue rotting, and the risk of this particular approach is minimal.

On the other side of things, is this a significant win? I would say so. Even ignoring the eliminated duplication, you now have gone from 0 test coverage to 50% in these classes. Test coverage is not a goal in and of itself, but you can now rest a little easier knowing that you have a change warning system in place for half of your code. If someone comes along later and says, “oh, I’ll just change that plus to a minus so that I can ‘reuse’ this method for my purposes,” you’ll have something in place that will throw up a big red X and say, “hey, you’re breaking things!” And besides, Rome wasn’t built in a day — you’re going to be going through your code base building up a test suite one action like this at a time.

Code that refers to no class fields is easy when it comes to extracting functionality to a safe, testable location. But what if there is instance-level state in the mix? For example…
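Again, a reconstruction rather than the original snippet, reusing the hypothetical GlobalSettings singleton from above:

```csharp
public class Untestable3
{
    private int _someField;

    public Untestable3()
    {
        GlobalSettings.Instance.RattleSomeCages();      // who knows what this does
        _someField = GlobalSettings.Instance.SomeValue; // value plucked from global state
    }

    public int AddToGlobal(int valueToAdd)
    {
        return _someField + valueToAdd; // refers to instance state, so not trivially extractable
    }
}
```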

That’s a little tougher because we can’t just pull _someField into a new, testable class. But what if we made a quick change that got us onto more familiar ground? Such as…
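One way that quick change might look; these two methods replace AddToGlobal inside the Untestable3 sketch above:

```csharp
public int AddToGlobal(int valueToAdd)
{
    return Add(_someField, valueToAdd); // the field is now just an input...
}

private int Add(int first, int second)
{
    return first + second;              // ...and this method refers to no fields at all
}
```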

Aha! This looks familiar, and I think we know how to get a testable method out of this thing now. In general, when you have class fields or local variables, those are going to become arguments to methods and/or constructors of the new, testable class that you’re creating and instantiating. Understand going in that the more local variables and class fields you have to deal with, the more of a testing headache the thing you’re extracting is going to be. As you go, you’ll learn to look for code in legacy classes that refers to comparably few local variables and especially fields in the current class as a refactoring target, but this is an acquired knack.

The reason this is not especially trivial is that we’re nibbling here at an idea in static analysis of object-oriented programs called “cohesion.” Cohesion, explained informally, is the idea that units of code that you find together belong together. For example, a Car class with an instance field called Engine and three methods, StartEngine(), StopEngine(), and RestartEngine(), is highly cohesive. All of its methods operate on its field. A class called Car that has an Engine field and a Dishwasher field and two methods, StartEngine() and EmptyDishwasher(), is not cohesive. When you go sniping for testable code that you can move to other classes, what you’re really looking for is low-cohesion additions to existing classes. Perhaps some class has a method that refers to no instance variables, meaning you could really put it anywhere. Or perhaps you find a class with three methods that refer to a single instance variable that none of the other 40 methods in the class refer to, because they all use some other fields on the class. Those three methods and the field they use could definitely go in another class that you could make testable.
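In code form, the non-cohesive version of that Car might look something like this (types stubbed out purely for illustration):

```csharp
public class Engine { public void Start() { } }
public class Dishwasher { public void Empty() { } }

// Low cohesion: the dishwasher member has nothing to do with the rest of the class,
// so it (and EmptyDishwasher) could move to its own, more testable home.
public class Car
{
    private readonly Engine _engine = new Engine();
    private readonly Dishwasher _dishwasher = new Dishwasher();

    public void StartEngine()
    {
        _engine.Start();
    }

    public void EmptyDishwasher()
    {
        _dishwasher.Empty();
    }
}
```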

When refactoring toward testability, non-cohesive code is the low-hanging fruit that you’re looking for. If it seems strange that poorly designed code (and non-cohesive code is a characteristic of poor design) offers ripe refactoring opportunities, we’re just making lemonade out of lemons. The fact that someone slammed unrelated pieces of code together to create a franken-class just means that you’re going to have that much easier of a time pulling them apart where they belong.

Realize that Giant Methods are Begging to be Classes

It’s getting less and less common these days, but do you ever see object-oriented code that you can tell the author meandered his way over to from writing C back in the one-pass compiler days? If you don’t know what I mean, it’s code that has this sort of form:
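If that description doesn’t ring a bell, here’s a sketch of the shape I mean (not any particular real method):

```csharp
public class OrderProcessor
{
    public void ProcessOrders()
    {
        // Every local declared up front, C89-style.
        int orderCount = 0;
        int index = 0;
        bool isValid = false;
        string customerName = string.Empty;
        decimal runningTotal = 0m;

        #region Validate input
        // dozens of lines...
        #endregion

        #region Calculate totals
        // dozens more lines...
        #endregion

        #region Write results
        // and dozens more...
        #endregion
    }
}
```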

C programmers wrote code like this because in old standards of C it was necessary to declare variables right after the opening brace of a scope before you started doing things like assignment and control flow statements. They’ve carried it forward over the years because, well, old habits die hard. Interestingly, they’re actually doing you a favor. Here’s why.

When looking at a method like this, you know you’re in for a doozy. If it has this many local variables, it’s going to be long, convoluted, and painful. In the C# world, it probably has regions in it that divide up the different responsibilities of the method. This is also a problem, but a lemons-to-lemonade opportunity for us. The reason is that these C-style programmers are actually telling you how to turn their giant, unwieldy method into a class. All of those variables at the top? Those are your class fields. All of those regions (or comments, in languages that don’t support regioning)? Method names.

In one of the resources I’ll recommend, “Uncle” Bob Martin said something along the lines of “large methods are where classes go to hide.” What this means is that when you encounter some gigantic method that spans dozens or hundreds of lines, what you really have is something that should be a class. It’s functionality that has grown too big for a method. So what do you do? Well, you create a new class with its local variables as fields, its region names/comments as method titles, and class fields as dependencies, and you delegate the responsibility.
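Applied to the sketch above, that extraction might look something like this:

```csharp
public class OrderProcessor
{
    public void ProcessOrders()
    {
        // The giant method now just delegates to the new class.
        new OrderProcessing().Execute();
    }
}

public class OrderProcessing
{
    // The old locals become fields...
    private int _orderCount;
    private bool _isValid;
    private string _customerName = string.Empty;
    private decimal _runningTotal;

    public void Execute()
    {
        ValidateInput();
        CalculateTotals();
        WriteResults();
    }

    // ...and the old regions/comments become focused, individually meaningful methods.
    private void ValidateInput() { /* ... */ }
    private void CalculateTotals() { /* ... */ }
    private void WriteResults() { /* ... */ }
}
```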

In this example, there are no fields in the untestable class that the method is using, but if there were, one way to handle this is to pass them into the constructor of the extracted class and have them as fields there as well. So, assuming this extraction goes smoothly (and it might not be that easy if the giant method has a lot of temporal coupling, resulting from, say, recycled variables), what is gained here? Well, first of all, you’ve slain a giant method, which will inevitably be good from a design perspective. But what about testability?

In this case, it’s possible that you still won’t have testable methods, but it’s likely that you will. The original gigantic method wasn’t testable. They never are. There’s really way too much going on in them for meaningful testing to occur — too many control flow statements, loops, global variables, file I/O, etc. Giant methods are giant because they do a lot of things, and if you cram enough into one method you’re going to start running up against the bounds of testability. But the new methods are going to be smaller and more focused, and there’s a good chance that at least one of them will be testable in a meaningful way. Plus, with the extracted class, you have control over the new constructor that you’re creating, whereas you didn’t with the legacy class, so you can ensure that the class can at least be instantiated. At the end of the day, you’re improving the design and introducing a seam that you can get at for testing.

Ask for your dependencies — don’t declare them

Another change you can make that may be relatively straightforward is to move dependencies out of the scope of your class — especially icky dependencies. Take a look at the original version of Untestable3 again.

When instantiated, this class goes and rattles some global state cages, doing God-knows-what (icky), and then retrieves something from global state (icky). We want to get a test around the AddToGlobal method, but we can’t instantiate this class. For all we know, to get the value of “someField” the singleton gets the British Prime Minister on the phone and asks him for a random number between 1 and 1000 — and we can’t automate that in a test suite. Now, the earlier option of extracting code is, of course, viable, but we also have the option of punting the offending code out of this class. (This may or may not be practical depending on where and how this class is used, but let’s assume it is.) Say there’s only one client of this code:
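Say it looks something like this (the client class name is hypothetical):

```csharp
public class Untestable3Client
{
    public int DoSomething(int value)
    {
        var untestable = new Untestable3();
        return untestable.AddToGlobal(value);
    }
}
```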

All we really want out of the constructor is a value for “_someField”. All of that stuff with the singleton is just noise. Because of the nature of global variables, we can do the stuff Untestable3’s constructor was doing anywhere. So what about this as an alternative?
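Here’s a sketch of that alternative: the constructor simply asks for the value, and the global-state noise gets punted up to the client:

```csharp
public class Untestable3
{
    private readonly int _someField;

    // No global state in sight -- the class simply asks for what it needs.
    public Untestable3(int someField)
    {
        _someField = someField;
    }

    public int AddToGlobal(int valueToAdd)
    {
        return _someField + valueToAdd;
    }
}

public class Untestable3Client
{
    public int DoSomething(int value)
    {
        // Because it's global state, this noise can happen anywhere -- so let it happen here.
        GlobalSettings.Instance.RattleSomeCages();
        var nowTestable = new Untestable3(GlobalSettings.Instance.SomeValue);
        return nowTestable.AddToGlobal(value);
    }
}
```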

This new code is going to do the same thing as the old code, but with one important difference: Untestable3 is now a liar. It’s a liar because it’s testable. There’s nothing about global state in there at all. It just takes an integer and stores it, which is no problem to test. You’re an old pro by now at unit testing that’s this easy.

When it comes to testability, the new operator and global state are your enemies. If you have code that makes use of these things, you need to punt. Punt those things out of your code by doing what we did here: perform the state-altering operations before your constructors/methods are called, and ask for the things that would have come from global state or from new as constructor/method parameters. This is another pretty low-impact way of altering a given class to make it testable, particularly when the only problem is that the class is instantiating untestable classes or reaching out into global state.

Ruthlessly Eliminate Law of Demeter Violations

If you’re not familiar with the idea, the Law of Demeter, or Principle of Least Knowledge, basically demands that methods refer to as few object instances as possible in order to do their work. You can look at the link for more specifics on what exactly this “law” says, and what exactly is and is not a violation, but the most common form you’ll see is strings of dots (or arrows in C++) where you’re walking an object graph: Property.NestedProperty.NestedNestedProperty.You.Get.The.Idea. (It is worth mentioning that the existence of multiple dots is not always a violation of the Law of Demeter — fluent interfaces in general and Linq in the C# world specifically are counterexamples). It’s when you’re given some object instance and you go picking through its innards to find what you’re looking for.

One of the most immediately memorable ways of thinking about why this is problematic is to consider what happens when you’re at the grocery store buying groceries. When the clerk tells you that the total is $86.28, you swipe your Visa. What you don’t do is wordlessly hand him your wallet. What you definitely don’t do is take off your pants and hand those over so that he can find your wallet. Consider the following code, bearing in mind that example:
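The original example isn’t reproduced here, so what follows is a reconstruction of the kind of train wreck in question, with all type names hypothetical and stubbed out so the snippet stands alone:

```csharp
// Hypothetical supporting types, stubbed out for illustration.
public class PersonalInfo { public string Ssn { get; set; } }
public class Customer { public PersonalInfo PersonalInfo { get; set; } }
public class CustomerOrder { public Customer Customer { get; set; } }

public class HardToTest
{
    public string PrepareSsnMessage(CustomerOrder customerOrder)
    {
        // Handing over the pants: digging through the order's innards for one string.
        return "Social Security Number: " + customerOrder.Customer.PersonalInfo.Ssn;
    }
}
```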

The method in this class just prepends an explanatory string to a social security number. So why on earth do I need something called a customer order? That’s crazy — as crazy as handing the store clerk your pants. And from a testing perspective, this is a real headache. In order to test this method, I have to create a customer, then create an order and hand that to the customer, then create a personal info object and hand that to the customer’s order, and then create an SSN and hand that to the customer’s order’s personal info. And that’s if everything goes well. What if one of those classes — say, Customer — invokes a singleton in its constructor? Well, now I can’t test the “PrepareSsnMessage” method in HardToTest because the Customer class uses a singleton. That’s absolutely insane.

Let’s try this instead:
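A sketch of the testable version, which asks only for what it actually needs:

```csharp
public class HardToTest
{
    public string PrepareSsnMessage(string ssn)
    {
        return "Social Security Number: " + ssn;
    }
}
```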

Ah, now that’s easy to test. And we can test it even if the Customer class is doing weird, untestable things, because those things aren’t our problem. What about clients, though? They’re used to passing customer orders in, not SSNs. Well, tough — we’re making this class testable. They know about the customer order and its SSN, so let them incur the Law of Demeter violation and figure out how to clean it up. You can only make your code testable one class at a time. That class and its Law of Demeter violation are tomorrow’s project.

When it comes to testing, the more stuff your code knows about, the more setup and potential problems you have. If you don’t test your code, it’s easy to write train wrecks like the “before” method in this section without really considering the ramifications of what you’re doing. The unit tests force you to think about it — “man, this method is a huge hassle to test because problems in classes I don’t even care about are preventing me from testing!” Guess what. That’s a design smell. Problems in weird classes you don’t care about aren’t just impacting your tests — they’re also impacting your class under test, in production, when things go wrong and when you’re trying to debug.

Understand the significance of polymorphism for testing

I’ll leave off with a segue into the next chapter in the series, which is going to be about a concept called “test doubles.” I will explain that concept then and address a significant barrier that you’re probably starting to bump into in your testing travels. But that isn’t my purpose here. For now I’ll just say that you should understand the attraction of using polymorphic code for testing.

Consider the following code:
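The snippet isn’t reproduced here, so here’s a stand-in that captures the situation; the singleton-backed database is hypothetical:

```csharp
using System;

// Hypothetical singleton standing in for whatever untestable thing the real property touches.
public class CustomerDatabase
{
    private static readonly CustomerDatabase _instance = new CustomerDatabase();
    public static CustomerDatabase Instance { get { return _instance; } }

    public string GetFirstNameFor(Customer customer)
    {
        throw new InvalidOperationException("Imagine a database call that blows up in a unit test.");
    }
}

public class Customer
{
    public string FirstName
    {
        get { return CustomerDatabase.Instance.GetFirstNameFor(this); }
    }
}

public class CustomerPropertyFormatter
{
    public string FormatFirstName(Customer customer)
    {
        return "Customer first name: " + customer.FirstName;
    }
}
```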

Here you have a class, CustomerPropertyFormatter, that should be pretty easy to test. I mean, it just takes a customer and accesses some string property on it for formatting purposes. But when you actually write a test for this, everything goes wrong. You create a customer to give to your method and your test blows up because of singletons and databases and whatnot. You can write a test with a null argument and amend this code to handle null gracefully, but that’s about it.

But, never fear — polymorphism to the rescue. If you make a relatively small modification to the Customer class, you set yourself up nicely. All you have to do is make the FirstName property virtual. Once you’ve done that, here’s a unit test that you can write:
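Here’s a sketch of what that test might look like, in MSTest syntax and assuming FirstName has been made virtual as just described:

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CustomerPropertyFormatterTests
{
    // Test double: a benign Customer whose FirstName we control completely.
    private class DummyCustomer : Customer
    {
        public override string FirstName
        {
            get { return "Alice"; }
        }
    }

    [TestMethod]
    public void FormatFirstName_Includes_The_Customers_First_Name()
    {
        var formatter = new CustomerPropertyFormatter();

        var result = formatter.FormatFirstName(new DummyCustomer());

        Assert.IsTrue(result.Contains("Alice"));
    }
}
```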

Notice that there is a class, DummyCustomer, declared inside of the test class, that inherits from the Customer class. DummyCustomer is an example of a test double. You’ll notice that I’ve created a scenario here where I define a version of FirstName that I can control — a benign version, if you will. I effectively bypass that database-singleton thing and create a version of the class that exists only in the test project, allowing me to substitute a simple, friendly value that I can test against.

As I said, I’ll dive much more into test doubles next time, but for the time being, understand the power of polymorphism for testability. If the legacy code has methods in it that are hard to test, you can create much more testable situations through interface implementation, inheritance, and the virtual keyword. Conversely, you can make testing a nightmare by using keywords like final and sealed (in Java and C#, respectively). There are valid reasons to use these, but if you want a testable code base, you should favor liberal support of inheritance and interface implementation.

A Note of Caution

In the sections above, I’ve talked about refactorings that you can do on legacy code bases and mentioned that there is some risk associated with doing so. It is up to you to assess the level of risk of touching your legacy code, but know that any change you make to legacy code without first instrumenting it with unit tests can be a breaking change, even a small one guided by automated refactoring tools. There are ways to ‘cheat,’ and tips and techniques to get a method under test before you refactor it, such as temporarily making private fields public or promoting local variables to public fields. The Michael Feathers book below talks extensively about these techniques for truly minimizing the risk.

The techniques that I’m suggesting here would be ones that I’d typically undertake when requirements changes or bugs were forcing me to make a bunch of changes to the legacy code anyway, and the business understood and was willing to undertake the risk of changing it. I tend to refactor opportunistically like that. What you do is really up to your discretion, but I don’t want to be responsible for you doing some rogue refactoring and torpedoing your production code because you thought it was safe. Changing untested legacy code is never safe, and it’s important for you to understand the risks.

More Information

As mentioned earlier, there are some excellent resources for more information on working with and testing legacy code bases, most notably Michael Feathers’ Working Effectively with Legacy Code, the Feathers book referenced in the note of caution above.

By

Slow Posting Week

This is just a brief announcement that I’m probably not going to be posting this coming week — I forgot to mention it last week. Readers in the USA and perhaps others know that Thursday was US Independence Day, so between the long weekend for me and the fact that I’m currently out of town (pleasure, not business) until late next week, I’m not really planning to write any blog posts. All I have with me are an Android tablet and an Ubuntu Netbook Remix machine, neither of which has a serious IDE, so if I do get time it’ll probably be philosophical more than technical.

Assuming that the blog is quiet this coming week, there is still stuff to look forward to. I’m continuing to write for the unit testing series, and the Expert Beginner E-Book is in its last phase before release — illustrations and format cleanup. So stay tuned for those and other random thoughts.

By

Proposal: A Law of Performance Citation

I anticipate this post being fairly controversial, though that’s not my intention. I imagine that if it wanders its way onto r/programming it will receive a lot of votes and zero points as supporters and detractors engage in a furious, evenly-matched arm-wrestling standoff with upvotes and downvotes. Or maybe three people will read this and none of them will care. It turns out that I’m actually terrible at predicting which posts will be popular and/or high-traffic. And I’ll try to avoid couching this as flame-bait because I think I actually have a fairly non-controversial point on a potentially controversial subject.


To get right down to it, the Law of Performance Citation that I propose is this:

If you’re going to cite performance considerations as the reason your code looks the way it does, you need to justify it by describing how a stakeholder will be affected.

By way of an example, consider a situation I encountered some years back. I was pitching in to help out with a bit of programming for someone when I was light on work, and the task I was given amounted to “copy-paste-and-adjust-to-taste.” This was the first red flag, but hey, not my project or decision, so I took the “template code” I was given and made the best of it. The author gave me code containing, among other things, a method that looked more or less like this (obfuscated and simplified for example purposes):
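I can’t reproduce the actual code, so here’s a stand-in with the same basic shape: read everything from a file, walk the collection with flag variables, then write to the database. All of the helper methods are stubbed placeholders:

```csharp
using System.Collections.Generic;

public class Foo { }

public class FooProcessor
{
    public void ProcessFoos()
    {
        List<Foo> foos = ReadFoosFromFile(); // the whole (small) collection comes back at once

        // Flag-based, single-pass looping logic.
        bool allValid = true;
        bool anyImportant = false;
        for (int i = 0; i < foos.Count; i++)
        {
            if (!IsValid(foos[i]))
            {
                allValid = false;
            }
            if (IsImportant(foos[i]))
            {
                anyImportant = true;
            }
        }

        if (allValid && anyImportant)
        {
            WriteFoosToDatabase(foos);
        }
    }

    // Placeholders standing in for the real network I/O and evaluation logic.
    private List<Foo> ReadFoosFromFile() { return new List<Foo>(); }
    private bool IsValid(Foo foo) { return true; }
    private bool IsImportant(Foo foo) { return false; }
    private void WriteFoosToDatabase(List<Foo> foos) { }
}
```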

I promptly changed it to one that looked like this for my version of the implementation:
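My version, in the same stand-in terms, collapsed the loop-and-flags into a couple of Linq calls (this replaces the method in the FooProcessor sketch above and needs a using System.Linq directive):

```csharp
public void ProcessFoos()
{
    var foos = ReadFoosFromFile();

    // Two short, declarative passes instead of one flag-laden loop.
    if (foos.All(foo => IsValid(foo)) && foos.Any(foo => IsImportant(foo)))
    {
        WriteFoosToDatabase(foos);
    }
}
```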

I checked this in as my new code (I wasn’t changing his existing code) and thought, “he’ll probably see this and retrofit it to his old stuff once he sees how cool the functional/Linq approach is.” I had flattened a bunch of clunky looping logic into a compact, highly-readable method, and I found this code to be much easier to reason about and understand. But I turned out to be wrong about his reaction.

When I checked on the code the next day, I saw that my version had been replaced by a version that mirrored the original one and didn’t take advantage of even the keyword foreach, to say nothing of Linq. Bemused, I asked my colleague what had prompted this change and he told me that it was important not to process the foos in the collection a second time if it wasn’t necessary and that my code was inefficient. He also told me, for good measure, that I shouldn’t use var because “strong typing is better.”

I stifled a chuckle, ignored the var comment, and went back to look at the code in more detail, fearful that I’d missed something. But no, not really. The file-reading method read in the entire foo collection at once (that method was in another assembly and not mine to modify anyway), and the average number of foos was in the single digits. The foos were pretty lightweight objects once read in, and the methods evaluating them were minimal and straightforward.

Was this guy seriously suggesting that possibly walking an extra eight or nine foos in memory, worst case, sandwiched between a file read over the network and a database write over the network was a problem? Was he suggesting that it was worth a bunch of extra lines of confusing flag-based code? The answer, apparently, was “yes” and “yes.”

But actually, I don’t think there was an answer to either of those questions in reality because I strongly suspect that these questions never crossed his mind. I suspect that what happened instead was that he looked at the code, didn’t like that I had changed it, and looked quickly and superficially for a reason to revert it. I don’t think that during this ‘performance analysis’ any thought was given to how external I/O over a network was many orders of magnitude more expensive than the savings, much less any thought of a time trial or O-notation analysis of the code. It seemed more like hand-waving.

It’s an easy thing to do. I’ve seen it time and again throughout my career and in discussing code with others. People make vague, passing references to “performance considerations” and use these as justifications for code-related decisions. Performance and resource consumption are considerations that are very hard to reason about before run-time. If they weren’t, there wouldn’t be college-level discrete math courses dedicated to algorithm runtime analysis. And because it’s hard to reason about, it becomes so nuanced and subjective in these sorts of discussions that right and wrong are matters of opinion and it’s all really relative. Arguing about runtime performance is like arguing about which stocks are going to be profitable, who is going to win the next Super Bowl, or whether this is going to be a hot summer. Everyone is an expert and everyone has an opinion, but those opinions amount to guesses until actual events play out for observation.

Don’t get me wrong — I’m not saying that it isn’t possible to know by compile-time inspection whether a loop will terminate early or not, depending on the input. What I’m talking about is how code will run in complex environments with countless unpredictable factors, and whether any of these considerations have an adverse impact on system stakeholders. For instance, in the example here, the (more compact, maintainable) code that I wrote will probably perform ever-so-slightly worse than the code it replaced. But no user will notice losing a few hundred nanoseconds between operations that each take seconds. And what’s going on under the hood? What optimizations and magic does the compiler perform on each of the pieces of code we write? What does the .NET framework do in terms of caching or optimization at runtime? How about the database or the file read/write API?

Can you honestly say that you know without a lot of research or without running the code and doing actual time trials? If you do, your knowledge is far more encyclopedic than mine and that of the overwhelming majority of programmers. But even if you say you do, I’d like to see some time trials just the same. No offense. And even time trials aren’t really sufficient because they might only demonstrate that your version of the code shaves a few microseconds off of a non-critical process running headlessly once a week somewhere. It’s for this reason that I feel like this ‘law’ that I’m proposing should be a thing.

Caveats

First off, I’m not saying that one shouldn’t bear efficiency in mind when coding or that one should deliberately write slow or inefficient code. What I’m really getting at here is that we should be writing clear, maintainable, communicative and, above all, correct code as a top priority. When those traits are established, we can worry about how the code runs — and only then if we can demonstrate that a user’s or stakeholder’s experience would be improved by worrying about it.

Secondly, I’m aware of the aphorism that “premature optimization is the root of all evil.” What I’m proposing is a little broader and less strident about avoiding optimization. (I’m not actually sure that I agree about premature optimization, and I’d probably pick knowledge duplication in a system as the root of all evil, if I were picking one.) I’m talking about how one justifies code more than how one goes about writing it. I think it’s time for us to call people out (politely) when they wave off criticism about some gigantic, dense, flag-ridden method with assurances that it “performs better in production.” Prove it, and show me who benefits from it. Talk is cheap, and I can easily show you who loses when you write code like that (hint: any maintenance programmer, including you).

Finally, if you are citing performance reasons and you’re right, then please just take the time to explain the issue to those to whom you’re talking. This might include someone writing clean-looking but inefficient code or someone writing ugly, inefficient code. You can make a stakeholder-interest case, so please spend a few minutes doing it. People will learn something from you. And here’s a bit of subtlety: that case can include saying something like, “it won’t actually affect the users in this particular method, but this inefficient approach seems to be a pattern of yours and it may well affect stakeholders the next time you do it.” In my mind, correcting/pointing out an ipso facto inefficient programming practice of a colleague, like hand-writing bubble sorts everywhere, definitely has a business case.

By

I Don’t Know What X is on Line 48 And I Don’t Care

I was at a users’ group recently here in Chicago, and there were two excellent presenters giving two very well done presentations: one about Xamarin and one about the C# language, entitled “Underestimated C# Language Features,” by John Michael Hauck. The latter was a very polished and appealing talk in which he wove together some important differentiators for the C# language, including closures, anonymous methods, and deferred execution. Given that he’s presented it at some conferences as well, I wasn’t surprised by the high caliber of the presentation.

The format was one in which he presented a series of increasingly difficult problems and asked what different variables were at different points of the program’s execution. For example, here’s something I just made up that would have looked at home in one of his many examples:
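His actual snippets aren’t reproduced here, but here’s a stand-in in the same spirit, mixing closures with deferred execution:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var actions = new List<Action>();

        for (int i = 0; i < 3; i++)
        {
            actions.Add(() => Console.WriteLine(i)); // each lambda closes over the loop variable
        }

        foreach (var action in actions)
        {
            action(); // So... what gets printed here?
        }
    }
}
```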

With each example like this, he’d ask the audience what they thought the output would be. And then the fun would begin. There’d always be several different answers and heated disagreement. After all, bragging rights and programmer cred were at stake. This was public gamification at its finest, and it reminded me vaguely of being in a sports bar, listening to people heatedly debating what the next play call would be in the Monday night football game:

Person 1: Third and 7. They have to pass!
Person 2: I think they’re going to call a draw.
Person 3: You’re nuts! That’s stupid! It’ll be a screen pass!
Person 1: Come on, you’ve been wrong every time!!!


I was entertained by this. John was an engaging presenter and the material interested me, but I discovered I had no interest in actually guessing, even to myself, to say nothing of out loud. At one point, a friend of mine that I was sitting with said, “who knows — write a unit test to figure it out,” and I certainly agreed with that point. But that wasn’t it. I think it was just that I was more interested in what the answer would teach me about the language and its nuance than I was in somehow testing myself. Frankly, this seemed like trivia to me and getting the right answer didn’t seem as important as taking the opportunity to hone my critical thinking skills.

Figuring this out, I realized it explained why I was impatient with all of the guessing that people were doing. John would ask people what they thought, and they would guess but then also start explaining their reasoning and arguing with one another. This struck me as boring and inefficient. “Just shut up and let him hit F10, and we’ll have the answer,” I thought. In that moment, I also realized that a very mild pet peeve of mine has always been code reviews or other places where people argue over runtime behavior. I think to myself, “You know what’s better than all of us at being the .NET runtime? The .NET runtime! So let’s ask it.”

After the presentation, I went out for a bite to eat with some friends. There, one of them made an excellent point. When we were discussing the idea of sitting around and figuring out what code did, he said (paraphrasing), “if I show code to a handful of competent developers and they can’t agree on what it will do at runtime, then that code needs to be re-written.” I thought this was perfect and couldn’t agree more. The combination of leaving it to the runtime to figure out what things will be at runtime and the idea that well-written code shouldn’t make this a mystery really drove home why this whole exercise seemed to me (and really is) purely an academic thought-drill.

By all means, exercise your brain and solve riddles involving programming. But I don’t think that this type of activity should be the centerpiece of work-place conversations, evaluations of code, or especially interviews. If I’m interviewing people and asking them what X is on line 48, before even the guy who gets it right, I’m going to hire the guy that says, “let’s write a unit test and find out.”
