DaedTech

Stories about Software


Please Don’t Recycle Local Variables

I think there’s a lot of value to the conservation angle of the green movement. In general, it’s a matter of efficiency–if you can heat/light/whatever your house with the same quality of life, using less energy and fewer resources, that’s a win for everyone. This applies to a whole lot of things beyond just eco-concerns, however. Conserving heat when you’re cold, conserving energy when you’re running a marathon, conserving your dollars when making a budget–all good ideas. Cut down, conserve, reuse when you can.


Except please don’t do it with your local variables. For example:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Here, we initialize a local variable, count, and use it to keep track of the results of some processing of customers. When we’re done, we reset count and use it to keep track of the apparently unrelated concept of machines. What I’m saying is that there shouldn’t be just one count, but rather customerCount and machineCount.
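
Concretely, that means the same method, but with each loop getting its own counter–something like this:

public void DoSomeStuff()
{
    int customerCount = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            customerCount++;
    }

    Console.WriteLine(customerCount);

    int machineCount = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            machineCount++;
    }

    Console.WriteLine(machineCount);
}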

Does this seem like nitpicking? You could certainly make that argument, but this code is not going to age well. First of all, this method should clearly be two methods, so we’re starting right off the bat with a bit of technical debt. It would be cleaner if each loop had its own method.

But an interesting thing happens if we use the refactoring tools to try to do that–the refactoring tool wants return values or input parameters. Yikes, that was unexpected, so we just move on. Later, when the time comes to iterate over movies, we see that there’s a ‘design pattern’ in place, so we modify the code to look like this:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var movie in Movies)
    {
        DoSomeStuffToMovie(movie);
        if (movie.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Now this thing should really be split up, so we start selecting parts of it to see what we can refactor. Ew, now we’re getting ref parameters to boot. This thing is getting even more painful to try to refactor, and we’re in a hurry, so no time for that. And to make matters worse, if you add in a few other aggregator variables this way, you’ll start to have all kinds of barriers in place when you want to pull this thing apart, such as crazy sets of out parameters. I’ve posted before about how I feel about ref and out.

All of this mounting technical debt could easily be avoided by giving each loop its own count variable. Having the loops recycle the same one creates a compile-time dependency between what’s going on in each loop and what happened in the loop before, even though there are no other similar dependencies in evidence. In other words, recycling this local variable is the only thing that’s creating a coupling in your code–there’s no logical reason to do it.

This is the height of procedural programming and baking in temporal dependencies that I cautioned you to avoid here. It’s a completely useless dependency that will inhibit refactoring and dirty up your code in a hurry. It may not seem like much yet, but this will be a huge pain point later as the lines of code in this method balloon from the dozens to the hundreds, and you rely heavily on automated tools to help with cleanup. Flag variables used over and over in sequence throughout a method are like pebbles in your shoe when you’re trying to refactor.

So my advice is to avoid this practice completely. There’s really no advantage to coding this way and the potential downside is enormous.


Introduction to Unit Testing Part 4: Design New Code For Testability

In the last post in this series, I covered what I think of as an important yet seldom discussed subject: how not to overwhelm yourself and get discouraged when you’re starting to unit test. In the post before that, I showed the basics of writing a unit test. But this leaves something of a gap. You can now write a unit test in a vacuum for an extremely simple class, and, when looking at your legacy code base, you know what to avoid. But you don’t necessarily know how to write a non-trivial application with unit tests.

You might understand now how to write tests, assuming that you have some disconnected new class in your code base without dependencies and barriers to testing, but you wonder how to get to that point in the first place. I mean, your application doesn’t seem to need a prime number finder or a bowling score calculator. It needs you to add lines of code to existing methods or new methods to existing classes. It doesn’t really seem to need new classes, so the initial momentum and resolution you’ve built reading these first three posts to go and be a unit tester sort of fizzles anticlimactically when you stare at your code base.

What’s going on here?

Recognizing Inhospitable Terrain

The first thing to understand is that your code base probably wasn’t written with testability in mind. There’s nothing wrong with you for not being able to see where unit testing fits in, because it doesn’t. Last time I talked about things you’ll see that will torpedo your efforts to test a particular method or piece of code, but let me speak to some entire architectures and patterns that don’t lend themselves to testability. It’s not that code written with these technologies and patterns can’t be tested–it’s just that it won’t be easy for you. As you read through this, you might feel like I’ve read your code base’s mind.

  1. Active Record as an architectural pattern
    This is a pattern in which you create classes that are in-memory representations of database tables, views, or stored procedures. If you see in your code base a class called “Customer” that has methods like “GetById()”, “Update()” and “MoveNext()”, you’ve got yourself an Active Record architecture. This architecture tightly couples your database to your domain logic and your domain logic to the rules for navigating through domain objects. You can’t test any of these objects since any operation you perform on them sends them scurrying off to create database connections and parameterized queries and all manner of other untestable stuff. And since decomposition and decoupling are the path toward unit testing, this sort of tight coupling of everything in your code is the path away from it.
  2. Winforms
    Winforms in the .NET world are tried and true when it comes to rapidly cranking out functional little applications, but you have to work really, really hard to make code that uses them testable. Q&A sites are littered with people trying to understand how to make Winforms testable, which should tell you that making them testable is not trivial. If you have Winforms and Active Record both in the same code base, at least the architecture is split into two concerns. But it’s split into two thoroughly untestable ones.
  3. Webforms
    See Winforms. Webforms is very similar in terms of framework testability, and for pretty similar reasons. Webforms is arguably even harder to test, however, because it’s predicated on spewing out reams and reams of HTML, CSS, and Javascript while allowing you to pretend you’re writing a desktop app. I’ve talked about my opinion of this technology before.
  4. Wizard/markup-reliant code
    Do you use the Webforms grid wizard thing to generate your grids? Do you define object data sources, such as DB connections or files, in the markup? Practices like these are the epitome of quick and dirty, rapid-prototyping implementations that hopelessly cross couple your applications beyond all testability. If this is something that’s done in your group/code base, testing is basically a non-starter until you go in a different architectural direction.
  5. Everything in your application is in a user control/form
    I’ve seen this called “Smart UI” and it basically means that there’s absolutely no separation of concerns in your code. The UI elements create database connections, write to files, implement business rules–they do everything. Code like this is impossible to unit test.

If any of this is sounding familiar, your task might be daunting. I have my own preferences, but I’m trying not to offer a value judgment here as much as I’m letting you know what you’re up against. I’m like a mortgage broker that’s saying to you, “if you want to own a home, that’s a great goal. But if you are eight months behind on your rent and have no personal savings, you’re going to have some work to do first.” If I’ve described your code base in the list above, you face different challenges than a green field developer. And since you’ve presumably been contributing to these code bases, you’re probably very used to implementation techniques that don’t result in testable code. You’re going to need to change your thinking and your coding practices in order to start writing testable code.

Once we’ve discussed how to get you writing testable code, I’ll come back to these macroscopic concerns and give some pointers for how to improve the situation. But, for now, on to a revised approach to coding. The following holds true whether you’re banging away at some legacy Winforms/Active Record application or starting a brand new MVC 4 site.

Add New Methods and Classes First, Ask Questions Later

First thing to abide by is to favor adding new things to the code over modifying existing things in the code. You may have heard this before in the context of the “Open/Closed Principle”, but that’s a guide for how to write your classes. (Basically, it admonishes you to write classes that others can extend and override rather than change.) I’m talking about how to deal with existing code bases. To put it simply and bluntly, it’s a lot easier to both code and test brand spankin’ new classes than to write and test changes to existing ones. We all know this. It’s at the heart of why we as developers always lean toward rewriting others’ code instead of understanding and working with it.

Now, this might not win you friends. In shops where people tend to write procedural code (you know, the kind you’re trying to get away from writing), they seem to have some weird fear of creating too many classes and masochistic attachment to monolithic structures. You might have to compromise or practice on your own if you run afoul of the project’s architect, but the exercise is invaluable. It’s going to propel you toward decoupling as a default rather than an exception. Doing new things? Time for a new class and some unit tests.

I know what you’re thinking: “But what if the thing that needs to be done has to be done in the middle of some method somewhere?” Well, instantiate your new class at that point and use it. “But what if it needs a bunch of fields from the class it’s in and variables from the method?” Pass them in through the constructor or method call. “But won’t that make my design bad?” It already is bad, but at least now you’re making part of it testable. When your code is testable and under test, everything is easier to fix later.
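
To make that concrete, here is a minimal sketch of the idea. The names (OrderTotaler, Order, the fields being passed) are invented for illustration–the point is just that the new behavior lives in a new class and gets handed what it needs:

public void SomeGiantLegacyMethod()
{
    // ...hundreds of lines of existing code...

    // New functionality goes into a new, testable class instead of more inline code.
    // Whatever it needs from this class's fields or this method's locals is passed in.
    var totaler = new OrderTotaler(_taxRate, currentOrders);
    decimal total = totaler.CalculateTotal();

    // ...the rest of the existing code can use 'total' as before...
}

public class OrderTotaler
{
    private readonly decimal _taxRate;
    private readonly IEnumerable<Order> _orders;

    public OrderTotaler(decimal taxRate, IEnumerable<Order> orders)
    {
        _taxRate = taxRate;
        _orders = orders;
    }

    public decimal CalculateTotal()
    {
        return _orders.Sum(order => order.Subtotal) * (1 + _taxRate);
    }
}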

You’ll have to use some discretion, obviously, but shift your attitude here. Don’t look at a project that has nothing but .aspx files and their code behind and hide classes in there for fear of breaking with tradition. Boldly add pure .cs files to the code base. Add unit tests for those .cs files you’re creating. (Apologies to Java readers, but this has no real Java equivalent that I can think of having used. Plus, the Java stack in general seems not to have the same level of untestable cruft built into frameworks.) It’s a lot harder for someone, even the project architect, to give you a hard time if your new way of doing things is covered by unit tests. Even if they’re hostile Expert Beginners, they’re hard pressed not to sound silly if they say, “we don’t do that here.” You’ll at least have a better chance of pushing this change through with the unit tests than without them.

Ask Questions in the Right Order: What, How, When

Now that you’re creating a lot of new classes and instantiating them in the old untestable ones, it’s time to start working on what kind of code you write. If you’ve practiced and come back, I suspect that you’re starting to be able to write a few useful tests but are perhaps still struggling. And I bet it’s because the line between where the old class ends and your new one begins is a little hazy. Maybe they share some common fields. Maybe when you instantiate the class you’re testing, you hand it a “this” reference so that it can go picking through the properties on the untestable behemoth from which you’re escaping. This is the next thing we need to tighten up–stop doing that. A clear, concise division of labor between the classes is necessary, and it’s not possible if they share all of the same fields, properties, state, etc. That’s like a break up where you two continue to live together, share a car, and go to the movies on weekends.

The best way to achieve this clean split is with good abstraction, and the best way to do that is to remember “what, how, when.” When it’s time to change the code base, remember that you want to favor creating a new class. But before you do that, ask yourself “what?” but not the other two questions. What should I name this class? What should it do? What should it expose as its public methods? Don’t start thinking about how those methods or classes should work and don’t you dare start thinking about when anything at all should happen–just think about what. Give it a good name that defines a clear purpose, and then give it a good set of methods and properties that draw attention to why it’s a different concept than the class that will be using it. Ask yourself what the boundaries between the two classes will be so that you can minimize the amount of shared information. And now, start stubbing out those methods with no implementation and start stubbing out some test methods with names that say what the methods will do.
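
To make the “what” step concrete, the output at this point might be nothing but stubs–a sketch with invented names:

// "What" only: a name, a purpose, and a public surface. No implementation yet.
public class OverdueInvoiceFinder
{
    public IEnumerable<Invoice> FindOverdue(IEnumerable<Invoice> invoices, DateTime asOfDate)
    {
        throw new NotImplementedException();
    }
}

// Test names that spell out what the class will do, also stubbed out for now.
[TestMethod]
public void FindOverdue_Returns_Empty_When_Given_No_Invoices() { }

[TestMethod]
public void FindOverdue_Excludes_Invoices_Due_After_The_As_Of_Date() { }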

At this point, you’re ready for “how.” Start actually implementing the “what” and testing that your implementation works in the unit tests. Believe me, it’s much, much easier to implement methods this way. If you think about “what” and “how” at the same time, you start writing confusing code that not even you have faith in when you’re done. Implementation alone is much easier when you have a clear picture of “what,” and unit testing is a breeze.

Once everything is implemented, you can start thinking about “when.” When should you instantiate the class and when should you call its methods? But don’t spend too much time with “when” in your head because it’s dangerous. Does that sound weird? Let me explain.

“When Code” is Monolithic Code

Picture two methods. One is a 700 line juggernaut with more control flow statements than you can keep track of without a spreadsheet. The other is five lines long, consisting only of an initialize statement, a foreach, and a return statement. With which would you rather work? I imagine the response is unanimous here. Even if you tend to crank out these kinds of large methods, when you step through the debugger, looking for the cause of a bug and finding yourself in some huge method, your heart sinks and you settle in with snacks and caffeine because it’s going to be a long day.

Now with these two methods in mind, imagine if we were pair programming together and I simply asked “when?” With the tiny method, you’d probably say, “What do you mean by ‘when’–I mean, you initialize before the loop and you return when you find the record you’re looking for. What a weird question!” With the other method, you’d probably affect a thousand-yard stare and say, “Man, I don’t even know where to begin.” But the answer to that question would fill pages. Books. Because you set the first loop counter j equal to the third loop counter k about twenty lines before the fourth try-catch and fifteen lines before you set the middle loop counter j equal to four. Unless, of course, you threw that exception up on line 2090, in which case j might never have been initialized. Er, wait, I think that happened somewhere near the fifth while loop in that else condition up there. Oh, there’s so much “when,” but it’s all slammed together in a method where you can’t possibly test any of it. Lots of thinking about “when” breeds huge methods like a Petri dish for bacteria.

“When” code is procedural at its core, and procedural, “when” code is an anathema to object-oriented unit testing, which is all about “what” and “how.” Remember earlier in the series when I said that multi-threaded code was really, really hard to test? Well, that’s just a subset of an idea called “temporal coupling,” and what we’re talking about here also falls under that umbrella. Temporal coupling is what happens when things have to be executed in a specific order or else they do not work.

Imagine that you’re coding up a model for someone’s day. When you think of how to do this, do you think, “first he gets up, then he brushes his teeth, then he showers, then he puts on his clothes, then… then he comes home, then he eats dinner, then he watches TV, then he goes to bed?” Do you code this up with constructs like:

public void GoAboutMyDay()
{
    bool wokeUp = WakeUp();
    if (wokeUp == true)
    {
        bool showered = Shower();
        if (showered == true)
        {
            bool putOnClothes = PutOnClothes();
            if(putOnClothes)
            {
                //You get the idea
            }
        }
    }
}

This method is all about “when.” It’s entirely procedural, and it’s going to be horrifying when it’s complete. When you get into the fortieth nested if condition, maybe someone will come along and flatten it out with a bunch of inverted early returns. Or maybe not, because maybe some of the if clauses start sprouting else conditions with loops in them. And maybe the methods being called start communicating with one another via boolean flag fields in the class. Who knows–this thing is on the precipice of becoming unstoppable. It might just achieve sentience at some point, so that when you try to start deleting conditionals, it says, “I can’t let you do that, Dave,” and puts them back.

The root problem behind it all is the “when” and the procedural thinking because you’re orienting the implementation around the order of the activities rather than the nature of the activities. Unit testing is all about deconstructing things into their smallest possible chunks and asserting things about those chunks. Temporal coupling and “when” logic is all about chaining and fusing things together.

If you were thinking about “what” first here, you would form a much different mental model of a person’s day. You’d say things to yourself like, “well, during the course of a person’s day, he probably wakes up, gets dressed, eats breakfast–well, actually eats one or more meals–maybe works if it’s a weekday, goes to bed at some point,” etc. Whereas in the procedural “when” modeling you were necessarily building a juggernaut method, here you’re dreaming up the names of methods and/or classes that can be unit tested separately and in isolation. It’s no reach to say, “okay, let’s have a Meal class that will have the following methods…”
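
A sketch of that “what”-first model (again, names are just illustrative) looks nothing like the nested-if juggernaut above:

// Each concept in the day gets its own small class that can be tested in isolation.
// Nothing here says anything about the order in which the day unfolds.
public class Meal
{
    public void Eat() { /* ... */ }
}

public class MorningRoutine
{
    public void WakeUp() { /* ... */ }
    public void Shower() { /* ... */ }
    public void GetDressed() { /* ... */ }
}

public class Workday
{
    public bool OccursOn(DayOfWeek day)
    {
        return day != DayOfWeek.Saturday && day != DayOfWeek.Sunday;
    }
}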

Only at the end will you decide “when.” You’ll decide it after you’ve stubbed things out with the “what” and implemented/unit tested them with the “how.” “When” is a detail that you should allow yourself to figure out at any point down the line. If you nail down what and how, you will have testable, modular, and manageable code as you create your classes.

Other Design Considerations for Your New Classes

I’ll wrap up here with a few additional tips for creating testable designs when adding code to code bases:

  1. Avoid using fields to communicate between your methods by setting flags and tracking state. Favor having methods that can be executed at any time and in any order.
  2. Don’t instantiate things in your constructor. Favor passing them in (we’ll talk about this in detail in a future post in the series).
  3. Similarly, don’t have a lot of code or do a lot of work in your constructor. This will make your class painful to setup for test.
  4. In your methods, accept parameters that are as decomposed as possible. For instance, don’t accept a Customer object if all you do with it is read its SSN property. In that case, just ask for the SSN (there’s a small sketch of this and point 2 right after this list).
  5. Avoid writing public static methods. These are easy enough to test (often), but they start introducing testability problems when you write code that uses them. (This might be hard to swallow at first, but mull over the idea of simply not using static methods anymore.)
  6. The earlier you start writing your unit tests, the better. If you find that you’re having a hard time testing your new code, it’s more likely a problem with the code than with unit testing it, and if you write tests early, you’ll discover these problems before you get too far and fix them.
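
Here is a small sketch of points 2 and 4 in action (SsnValidator and ISsnFormatRule are made-up names for illustration):

public class SsnValidator
{
    private readonly ISsnFormatRule _formatRule;

    // Point 2: the collaborator is passed in rather than instantiated in the constructor.
    public SsnValidator(ISsnFormatRule formatRule)
    {
        _formatRule = formatRule;
    }

    // Point 4: the method asks for the SSN string it needs, not a whole Customer.
    public bool IsValid(string ssn)
    {
        return !string.IsNullOrEmpty(ssn) && _formatRule.Matches(ssn);
    }
}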

This post has covered ways to write unit tests “from here forward” and ways to stop adding untested code to code bases. In the next post, I’ll talk about how to start getting the legacy code under test.

Addendum: Mitigating the Hostile Test Environments

Finally, as promised, here are ways to accommodate testing in the less-than-ideal architectures mentioned above, if you’re curious or want to do some more research:

  1. Instead of Active Record, look at some kind of ORM solution like NHibernate or Entity Framework. These are tools that generate all of the code for you to access the database so that you don’t have to worry about testing that code and you can focus on writing only your (testable) domain code. Barring that, try to separate the three concerns of Active Record objects: modeling the database, connecting to the database, and modeling a domain object. The first concern adds no value, and the second two can be broken out into separate objects where the only thing hard to test is the actual database access.
  2. Instead of Winforms, favor WPF when possible. If that isn’t possible, see if you can use the Model-View-Presenter (MVP) pattern to move as much logic out of the untestable code-behind as possible.
  3. To be blunt, from a testing/decoupling perspective, Webforms is a disaster. You can have some limited success by adopting a more passive binding model and moving as much code out of the code-behind as possible, but it’s all pretty awkward. Webforms really seems more about rapid-prototyping and Microsoft-Accessing web development than producing scalable, sophisticated architectures.
  4. If you’re using wizards to generate your application’s architecture, cut it out. If you’re defining implementation details in markup, cut it out. Markup is for layout, not unit-testable business logic or state logic. If you depend on definitions in markup to drive your application’s behavior, you’re relying exorbitantly on a third-party framework, which is always extremely brittle from a testability perspective.
  5. To fix Smart UI, you just have to factor toward a more decoupled architecture. Start pulling different concerns out of the user controls and forms and finding a home for them.


Born to Exclude: Beware of Monoculture

Understanding the Idea of Monoculture

One morning last week, I was catching up on backlogged podcasts in my Doggcatcher feed on the way to work and was listening to an episode of Hanselminutes where Scott Hanselman interviewed a front end web developer named Garann Means. The subject of the talk was what they described as “developer monoculture,” and they talked about the potentially off-putting effect on would-be developers of asking them to fill a sort of predefined, canned “geek role.”

In other words, it seems as though there has come to be a standard set of “developer things” that developers are expected to embrace: Star Trek, a love of bacon (apparently), etc. Developers can identify one another in this fashion and share some common ground, in a “talk about the weather around the water cooler” kind of way. But this policy seems to go beyond simply inferring a common set of interests and into the realm of considering those interests table stakes for a seat at the geek table. In other words, developers like fantasy/sci-fi, bacon, that Big Bang Theory show (I tried to watch this once or twice and found it insufferable), etc. If you’re a real developer, you’ll like those things too. This is the concept of monoculture.

The podcast had weightier issues to discuss than the simple “you can be a developer without liking Star Trek,” though. It discussed how the expected conformance to some kind of developer archetype might deter people who didn’t share those interests from joining the developer community. People might even suppress interests that are wildly disparate from what’s normally accepted among developers. Also mentioned was that smaller developer communities such as Ruby or .NET trend even more toward their own more specific monoculture. I enjoyed this discussion in sort of an abstract way. Group dynamics and motivation is at least passingly interesting to me, and I do seem to be expected to do or like some weird things simply because I write software for a living.


At one point in the discussion, however, my ears perked up as they started to discuss what one might consider, perhaps, a darker side of monoculture–that it can be deliberately instead of accidentally exclusionary. That is, perhaps there’s a point where you go from liking and including people because you both like Star Trek to disliking and excluding people because they don’t. And that, in turn, leads to a velvet rope situation: not only are outsiders excluded, but even those with similar interests are initially excluded until they prove their mettle–hazing, if you will. Garann pointed out this dynamic, and I thought it was insightful (though not specific to developers).

From there, they talked about this velvet-roping existing as a result of people within the inner sanctum feeling that they had some kind of ‘birthright’ to be there, and this is where I departed from what had just been general, passive agreement with the points being made in the podcast. To me, this characterization was inverted–clubhouse sitters don’t exclude and haze people because of a tribal notion that they were born into an “us” and the outsiders are “them.” They do it out of insecurity, to inflate the value of their own experiences and choices in life.

A Brush with Weird Monoculture and what it Taught Me

When I first met my girlfriend, she was a bartender. As we started dating, I would go to the bar where she worked and sit to have a beer, watch whatever Chicago sports team was playing at the time, and keep her company when it was slow. After a while, I noticed that there was a crowd of bar flies that I’d see regularly. Since we were occupying the same space, I made a few half-hearted efforts to be social and friendly with them and was rebuffed (almost to my relief). The problem was, I quickly learned, that I hadn’t logged enough hours or beers or something to be in the inner circle. I don’t think that this is because these guys felt they had a birthright to something. I think it’s because they wanted all of the beers they’d slammed in that bar over the course of decades to count toward something besides cirrhosis. If they excluded newbies, youngsters, and non-serious drinkers, it proved that the things they’d done to be included were worth doing.

So why would Star-Trek-loving geeks exclude people that don’t know about Star Trek? Well, because they can, with safety in numbers. Also because it makes the things that they enjoy and for which they had likely been razzed in their younger days an asset to them when all was said and done. Scott talked about being good at programming as synonymous with revenge–discovering a superpower ala Spiderman and realizing that the tables had turned. I think it’s more a matter of finding a group that places a radically different value on the same old things that you’ve always liked doing and enjoying that fact. It used to be that your skills and interests were worthless while those of other people had value, but suddenly you have enough compatriots to reposition the velvet rope more favorably and to demonstrate that there was some meaning behind it all. Those games of Dungeons and Dragons all through high school may have been the reason you didn’t date until twenty, but they’re also the reason that you made millions at that startup you founded with a few like-minded people. That’s not a matter of birthright. It’s a matter of desperately assigning meaning to the story of your life.

Perhaps I’ve humanized mindless monoculture a bit here. I hope so, because it’s essentially human. We’re tribal and exclusionary creatures, left to our baser natures, and we’re trying to overcome that cerebrally. But while it may be a sympathetic position, it isn’t a favorable or helpful one. We can do better. I think that there are two basic kinds of monoculture: inclusive, weather-conversation-like inanity monoculture (“hey, everyone loves bacon, right?!?”) and toxic, exclusionary self-promotion. In the case of the former, people are just awkwardly trying to be friendly. I’d consider this relatively harmless, except for the fact that it may inadvertently make people uncomfortable here and there.

The latter kind of monoculture dovetails heavily into the kind of attitudes I’ve talked about in my Expert Beginner series of posts, where worth is entirely identity-based and artificial. I suppose I perked up so much at this podcast because the idea of putting your energy into justifying why you’ve done enough, rather than doing more, is fundamental to the yet-unpublished conclusion of that series of posts. If you find yourself excluding, deriding, hazing, or demanding dues-paying of the new guy, ask yourself why. I mean really, critically ask yourself. I bet it has everything to do with you (“I went through it too,” and, “it wouldn’t be fair for someone just to walk in and be on par with me”) and nothing to do with him. Be careful of this, as it’s the calling card of small-minded mediocrity.


Introduction to Unit Testing Part 3: Unit Testing Sucks

I don’t know about you, but I remember desperately wanting to be able to drive right up until I was fifteen years old and I got my learner’s permit. I thought about it a lot–how fun it would be, how much freedom I would have, how my trusty old bike would probably get rusty from disuse. About a month after getting my permit, I desperately wanted my license and to drive on my own without supervision. But I’m omitting a month there, during which an unexpected thing happened. I realized that driving was stupid and awful and it sucked and I hated it and I’d never do it, so just forget it!

It was in that month that the abstraction of operating a car and having freedom became the reality of hitting the gas when I meant to hit the brake pulling out of my driveway or not knowing when I was supposed to go after stopping at a stop sign. It was a weird mix of frustration, anger, and fear that tends to accompany new activities–even ones that you know will benefit you. And that’s why the title of this post isn’t simple link bait. I did that not to satirize a position, but to empathize. Like many things when you’re new to them, starting to unit test quite frankly sucks. It’s frustrating, foreign, and hard to get right. Accordingly, it’s easy to abandon it when you have deadlines to meet.


This post is about minimizing frustration and barriers to adoption by staying focused and setting reasonable expectations. I would argue that if you’re new to writing tests, writing a few and enjoying localized success without high coverage is a lot more important than suddenly becoming a TDD (Test-Driven Development) expert with 100% test coverage right out of the gate (or at least trying to become one). Incremental progress is good.

Don’t Try TDD Just Yet

I’m a little torn as I write this, but the first thing that I’ll suggest is that you not try TDD if you have no experience unit testing. Some might disagree with this suggestion, but I think that you’re going to be trying to learn too many new things all at once and will be a lot more likely to get frustrated. Unit tests are simply pieces of code that you write, as covered in more detail in the last post in the series. It’s a new kind of code to be writing, but you’re just learning about new methods to call and attributes (or annotations, in Java) to use. You’ll get there.

But TDD is an entirely new way of writing code. It’s a discipline in which you do not write any production code until you have written a unit test that fails. Then you get that test and all other tests to pass and refactor the code as needed. Does that sound crazy (if you discount the fact that a number of developers you respect probably do it)? Exactly. Probably not for you right now. It’s a bridge too far, and you’re more likely to throw up your arms in disgust and quit if you try to learn both things right now. I speak from experience, as, years ago, I was introduced to unit testing and TDD at the same time. I was overwhelmed until I just went back to figuring out the whole unit testing thing alone first. Maybe that wouldn’t happen to you, but I’d caution you to be wary of learning these two things simultaneously.

So let’s stick to learning what unit tests are and how to write them.

Test New Classes Only

In my Pluralsight course, I use the example of a method that identifies numbers as prime or not, and in a series of posts I did last fall on TDD, I use the example of something that calculates a bowling score. I’ve also done other code katas and exercises like these in the past to show people both the mechanics of unit testing and TDD.

When I do this, one of the things people frequently say is something along the lines of “pff…sure, when you’re writing something stupid and easy like a prime number finder, but there’s no way that would work on our code base.” I then surprise these people by agreeing with them. I’m sure it wouldn’t work on your code base. Why? Well, because unit tests don’t just magically spring up like mushrooms after a few days of rain. They’re more like roses–you have to plan for them from the start and carefully cultivate an environment in which they can thrive.

Some years back, I saw an excellent talk on “The Deep Synergy Between Testability and Good Design,” by Michael Feathers. I highly suggest watching this talk if you haven’t seen it, but to summarize, he states (and I agree) that well-designed and factored code goes hand in hand with testability. You’re much more likely to find that code written to be testable is good code and, conversely, code written without unit tests in mind is not the greatest. And so if you’re deeply invested in a code base that has never been covered by unit tests, it doesn’t surprise me to hear that you don’t think unit testing would work on your code. I imagine it wouldn’t.

But don’t throw out unit testing because it looks like it wouldn’t work in your code base. Just resolve to do it on new classes that you create. As you go along and get better at unit testing, you’ll start to understand how to write testable classes. It will thus get easier and easier to test all new additions to the code, and you’ll start to get the hang of it with relatively minimal impact on your existing code, your process, or your time. Starting to unit test doesn’t mean that you’re suddenly responsible for testing every line of code in history, nor does it mean you must test every single new line. Just start out by writing a few that you think will help.

Test Existing Code by Extracting Little Classes

Once you get the feel for adding unit tests for new classes/code that you add to the code base, it’s a good time to start taking baby steps toward getting tests in place for your legacy (non-tested) code. Now, some procedural, monolithic mass of code that wasn’t testable a month ago when you started out isn’t magically testable now because you have some practice. It’s still a problem.

You’re going to have to chip away at it. And you’re going to have to do this by developing a new skill: identifying pieces of functionality that you can pull out into new classes and test. Go look through methods and classes and find things that don’t have a lot of dependencies on class fields or (yuck) global/static variables. Excellent candidates for this are methods with pure in-memory operations and ones that deal largely with primitives. Do you have some gigantic method that has a whole region buried in it that does nothing but cobble together a string to be used later in the method? Pull that out into a new class, and write unit tests that make assertions about the string it returns.
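
For instance, the extraction might be as small as this (ReportHeaderBuilder is a made-up name–the real one would come from whatever that buried region of code actually does):

// Pulled out of the middle of some giant legacy method: pure in-memory string work.
public class ReportHeaderBuilder
{
    public string Build(string reportName, DateTime runDate)
    {
        return string.Format("{0} - {1:yyyy-MM-dd}", reportName, runDate);
    }
}

[TestMethod]
public void Build_Puts_The_Report_Name_Before_The_Date()
{
    var builder = new ReportHeaderBuilder();

    string header = builder.Build("Quarterly Sales", new DateTime(2013, 6, 1));

    Assert.AreEqual("Quarterly Sales - 2013-06-01", header);
}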

As you practice this, you’ll get a better and better feel for what you can pull out with a minimum of friction. You’ll find yourself not only getting more of your codebase under test, but also that you’re improving its design and modularity.

Know When to Fold ‘Em

This is another one that’s hard to type, but you really have to learn to look at code and just say, “nope, not happening.” There are classes and methods that you simply are not going to be able to test unless you come back with a green belt in unit testing–or pair with someone who has hers. And, even then, the prognosis may be that you need to rewrite the legacy class/method altogether to make it testable. Here is a quick list of things that, early in your unit-testing career, you should consider to be deal-breakers and simply move on from to avoid frustration. As a beginner, avoid testing code (class methods and properties) that:

  1. Calls static methods. At best, a static method is functional and returns something that depends only on its inputs. If this is the case (such as functions like Math.Pow() or Math.Abs()), the code is still testable, but a far more common case, especially if the static methods are ones in your own code base, is that they manipulate some kind of global state. Global state is testability kryptonite. I’ll explain more later, but for now, please take my word for it.
  2. Invokes singletons. The singleton design pattern is used almost universally as a politically correct way to hide your global variables in plain sight. For what this means to testability, see the last bullet. If it calls singletons, forget it, move on.
  3. Dispatches background workers or manages threading. When unit tests are run, the unit test runner is responsible for managing threading and it will run your tests in parallel. If you’re trying to make sure your threads and thread management are in one state for production and another for testing, you are about to ruin your day and probably your week. It’s not worth it–don’t try.
  4. Accesses files, connects to databases, calls web services, etc. I mentioned this in the first post in the series, but that was in the context of saying that these things aren’t considered unit tests. Well, another issue here is that they’re also relatively brittle and long running. If you write tests that do these things, they’re going to fail at weird times and in unpredictable ways. You’ll be used to all of your tests passing and suddenly one fails and then passes again, and it turns out it’s because Bill from accounting bumped into the database server and its Nic card is a little “tricky.” If you have unit tests that fail for borderline-inconceivable reasons beyond your control, you will become discouraged.
  5. Code that triggers any of the above anywhere in the call stack. You don’t escape the problems of threading, global state, or externalities by not using them directly. If you trigger them, it’s the same difference.
  6. Classes that require crazy amounts of instantiation. If you want to test a method, but it has forty-five parameters, most of which are classes that are difficult or complex to create, forget it. That code sorely needs reworking, and creating massive, brittle tests for it this early in your career will be a world of pain. Chip away at making the design better before you tackle it.

Don’t worry–I’m not suggesting that you give up on a long timeline, and I’ll continue on with this series and discuss strategy for addressing these things later. But for now, just consider them signals that this code is out of bounds for testing. If you don’t, there’s a high likelihood that you’ll spin your wheels and get angry, frustrated, and irritable, making it more likely that you’ll give up. I can’t eliminate the frustration of being new at something like driving, but I can at least steer you away from six-way traffic lights and three-lane roundabouts.


Don’t Write Code You Don’t Need

I was reviewing some code the other day, and I saw a quick-and-dirty logger implementation that looked something like this (I’m re-creating from memory and modifying a bit for illustrative purposes in this post):
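
Roughly, it was along these lines (the details here are approximated from the issues discussed below):

public class Logger
{
    public string Path { get; set; }

    public Logger()
    {
    }

    public Logger(string path)
    {
        Path = path;
    }

    public void Log(string message)
    {
        // Spins up a brand new StreamWriter against Path on every single call.
        using (var writer = new StreamWriter(Path, true))
        {
            writer.WriteLine(message);
        }
    }
}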

A few things probably jump out at you, depending on your C#, OOP, and code reviewing chops. I’d imagine that, at the very least, you think, “you could just delete the default constructor.” You might also wonder why the other constructor exists since you could just set Path whenever you wanted. There’s the lack of null/valid check on path before creating a stream writer with it, and the unnecessary, potentially problematic, and definitely not threadsafe creation of the stream writer over and over. These things are all valid to point out, but I’d say that they’re also all symptoms of two larger root causes that I want to talk about.

Overeager to Please


You can offer too much code. By this I mean something subtly but critically different from “you can write too much code,” as when you are needlessly complicated or verbose. Offering too much code means that you’re giving users of the public interface of your classes too many options. Before my TDD days, this was something with which I constantly struggled. To understand what I mean, consider the thinking that probably went into the creation of this class:

Well, let’s see. A file needs a path, so a file logger should probably also take a path as input. From there, it should handle the details of streams and all that other stuff so that all we have to do is give it stuff to put in the file.

So far, so good. This is pretty clever as far as abstractions go, and whoever wrote this class has a good grasp on how to create abstractions that provide value. But here’s what probably came next.

So, for path, let’s make that a public property in case the user of the class wants to change the path later. For convenience, let’s add a constructor that lets the user specify the path, but let’s also let him know that he doesn’t have to.

That thinking is considerate and helpful, but it’s offering too much code. You need to take ownership of your abstractions and be a little forceful and unyielding with them. You provide what you provide, and if users don’t like your class, they can create an inheritor or write their own or something: “this is my class, take it or leave it.”

In this case, you have to make concrete decisions about how and when the path of the logger is set:
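
Something like this (again, a sketch rather than the verbatim code):

public class Logger
{
    // Path can be read by anyone, but only the constructor ever sets it.
    public string Path { get; private set; }

    public Logger(string path)
    {
        Path = path;
    }

    public void Log(string message)
    {
        using (var writer = new StreamWriter(Path, true))
        {
            writer.WriteLine(message);
        }
    }
}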

In this class, the path is set at instantiation time and only instantiation time. As long as you live with my logger, you will obey my rules!

Notice that we’ve already eliminated the problem of the pointless constructor by making this decision. We’re also well poised to handle the null/valid checking of the path since we can just do it in the constructor instead of doing it in the constructor and in the setter for path and worrying about duplication, when exactly to validate and how, what to do on invalid set when the last path was valid, etc. It’s also going to be very easy to take care of the issue with the needless creation of StreamWriter instances. Now that you can’t change the path once the instance is created, you can simply create the writer in the constructor and reuse it for the lifetime of the object by storing it as a private field. Now making this threadsafe becomes a pretty manageable task. In general, eliminating one conceptual option eliminates a lot of internal implementation complexity.
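
Following that through (still just a sketch), the class ends up with one validated path and one writer for its whole lifetime:

public class Logger
{
    private readonly StreamWriter _writer;

    public string Path { get; private set; }

    public Logger(string path)
    {
        if (string.IsNullOrEmpty(path))
            throw new ArgumentException("A valid path is required.", "path");

        Path = path;
        _writer = new StreamWriter(Path, true); // created once, reused for the lifetime of the object
    }

    public void Log(string message)
    {
        _writer.WriteLine(message);
    }
}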

But what about the users who want to set the path at some point later in the object instance’s lifetime? Psst…that’s not a real common use case. And you know what? If they really want to do that, they can just instantiate a second logger. Don’t make your life really complicated because you think users of your code might want flexibility. Because you know what they want more than extra flexibility? Code that works. And it’s a lot harder to offer them code that works if you’re bending over backwards to handle every conceivable thing they might want to do.

Pointless or Speculative Mutability

The last section touches on this obliquely, but I’d like to bring this to the fore. Mutable, state-based code is a lot more prone to problems than immutable or functional code that has fixed or no state, respectively. Consider the refactoring path I’ve proposed here. In the last section, I chose to eliminate the public setter for Path instead of the constructor that took Path as an argument. Did that seem like a coin flip and pick one to you? Well, it wasn’t.

With the constructor injection of path, we have to do validity checking and initialization only once, to establish preconditions for the object’s existence. With the public setter for path, we have to maintain object invariants, which tends to be more complex. A good example of this is how we handle the transition from a valid path to the user setting an invalid or null path. Do we throw an exception? Revert to the previous path? With constructor injection, you don’t have that discussion at all, because there’s never a previously valid path to worry about.

This is just an example of a broader idea, which is that mutability creates state transition management tasks and that these tasks are harder to get right. To understand what I mean, imagine that you’re implementing the API for an array. You tell your users, “alright, when you create the array, you specify how many elements it will have, and then you can set and access the elements as you please by index.” The users come back and say, “well, we want to change the size of the array on the fly,” to which you respond, “oh, crap,” as you start to wonder what you’ll do if they want to resize it smaller than what they’ve used or if they try to make it negative or too big or something. You can feel the number of edge cases multiplying.

If you come from a background where you write a lot of procedural, script-based, or hacked-together utility code, this may seem like a weird thing to think about. You’re probably used to doing things like declaring boolean ‘flags’ somewhere in a method and setting them much later in that same method in order to keep track of something further down the line. And if you do this, you’re probably used to a lot of tweaking, guessing, and hoping for the best. But as it turns out, you’re doing things the hard way.

The easy way is to program without any state at all. That means no assignment and no variables, the way you might do if asked to write a method that took x and y and returned 4x + 2y. There’s no need to do any of that: just have “return 4x + 2y” and be done with it. This way of doing things is called functional programming, and its calling card is that output is always a pure function of the input and is entirely predictable.

If the functional paradigm isn’t an option (and it often isn’t in the entirety of an application), the next easiest thing to deal with is immutable objects. These are objects that have state, but that state is not modifiable after creation time. An example of an object that can easily be immutable is something like an address. An address is just a handful of strings bound together with a class definition, so why have it be mutable? If you want a different one, just let the one you have go out of scope and create a new one. Then you don’t have to worry about questions like “what if I set a new address1, but not a new address2–that would never happen, right?” In immutable land, that’s simply not a consideration that you bother with.
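
A bare-bones version of that immutable address (just a sketch) shows how little there is to manage:

public class Address
{
    public string Address1 { get; private set; }
    public string Address2 { get; private set; }
    public string City { get; private set; }
    public string State { get; private set; }

    public Address(string address1, string address2, string city, string state)
    {
        Address1 = address1;
        Address2 = address2;
        City = city;
        State = state;
    }

    // No setters anywhere. If you want a different address, construct a new one.
}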

The hardest way of doing things is with state transition management or mutable state. That’s the nature of the original logger at the top. Anything goes. The object has only transient state, so it either has to constantly manage all permutations of state changes to check for accuracy or it has to risk being wrong in an invalid state. This is not a great choice and should be avoided if at all possible.

As you develop, you’ll find it’s possible a lot more often than you think. Objects that perform a service, such as a logger, generally have little reason to store mutable state, particularly in a nicely decoupled application. Even a lot of data objects, which represent the very core of what we’d think of as mutable, can be immutable a lot more often than you’d think (e.g. address). Mutability in your code is often best left in places where it’s unavoidable. If you don’t have the luxury of pass-through interaction with a database, you will generally maintain an in-memory domain model that needs to have mutable state because it’s representing the database which is, almost by definition, a whole gigantic system of files full of mutable state. A lot of objects that help you manage user interface will have mutable state (e.g. some class that stores things like “is checkbox X currently checked”). But unavoidable as it may be, you can certainly minimize and isolate it. Whatever you do, don’t introduce it where you don’t need it because you’re creating needless complexity, which means extra things that can go wrong.

Take-Away

The upshot of this exhaustive code-review that I’ve conducted is just advice to consider carefully where your classes fall on the functional-immutable-mutable spectrum and to keep them as far toward the functional side as possible. I’m not suggesting that you run out and rewrite existing code this minute or even that you vow to change how you do things. I’m just suggesting you be aware that mutable objects come with a heavy cost in terms of complexity and difficulty. This will help you in your programming travels in general, and especially if you’re looking to delve into things like architecture, test-driven development, or concurrent (multi-threaded) programming.


Introduction to Unit Testing Part 2: Let’s Write a Test

In the last post in this series, I covered the essentials of unit testing without diving into the topic in any real detail. The idea was to offer sort of a guerrilla guide to what unit testing is for those who don’t know but feel they should. Continuing on that path and generating material for my upcoming presentation, I’m going to continue with the introduction.

In the previous post, I talked about what unit testing is and isn’t, what its actual purpose is, and what some best practices are. This is like explaining the Grand Canyon to someone that isn’t familiar with it. You now know enough to say that it’s a big hole in the earth that provides some spectacular views and that you can hike down into it, seeing incredible shifts in flora and fauna as you go. You can probably convince someone you’ve been there in a casual conversation, even without having seen it. But, you won’t really get it until you’re standing there, speechless. With unit testing, you won’t really get it until you do it and get some benefit out of it.

So, Let’s Write that Test

Let’s say that we have a class called PrimeFinder. (Anyone who has watched my Pluralsight course will probably recognize that I’m recycling the example class from there.) This class’s job is to determine whether or not numbers are prime, and it looks like this:

public class PrimeFinder
{
    public bool IsPrime(int possiblePrime)
    {
        return possiblePrime != 1 && !Enumerable.Range(2, (int)Math.Sqrt(possiblePrime) - 1).Any(i => possiblePrime % i == 0);
    }       
}

Wow, that’s pretty dense looking code. If we take the method at face value, it should tell us whether a number is prime or not. Do we believe the method? Do we have any way of knowing that it works reliably, apart from running an entire application, finding the part that uses it, and poking at it to see if anything blows up? Probably not, if this is your code and you needed my last post to understand what a unit test was. But this is a pretty simple method in a pretty simple class. Doesn’t it seem like there ought to be a way to make sure it works?

I know what you’re thinking. You have a scratchpad and you copy and paste code into it when you want to experiment and see how things work. Fine and good, but that’s a throw-away effort that means nothing. It might even mislead when your production code starts changing. And checking might not be possible if you have a lot of dependencies that come along for the ride.

But never fear. We can write a unit test. Now, you aren’t going to write a unit test just anywhere. In Visual Studio, what you want to do is create a unit test project and have it refer to your code. So if the PrimeFinder class is in a project called Daedtech.Production, you would create a new unit test project called DaedTech.Production.Test and add a project reference to Daedtech.Production. (In Java, the convention isn’t quite as cut and dry, but I’m going to stick with .NET since that’s my audience for my talk). You want to keep your tests out of your production code so that you can deploy without also deploying a bunch of unit test code.

Once the test class is in place, you write something like this, keeping in mind the “Arrange, Act, Assert” paradigm described in my previous post:

[TestMethod]
public void Returns_False_For_One()
{
    var primeFinder = new PrimeFinder(); //Arrange

    bool result = primeFinder.IsPrime(1); //Act

    Assert.IsFalse(result); //Assert
}

The TestMethod attribute is something that I described in my last post. This tells the test runner that the method should be executed as a unit test. The rest of the method is pretty straightforward. The arranging is just declaring an instance of the class under test (commonly abbreviated CUT). Sometimes this will be multiple statements if your CUTs are more complex and require state manipulation prior to what you’re testing. The acting is where we test to see what the method returns when we pass it a value of 1. The asserting is the Assert.IsFalse() line where we instruct the unit test runner that a value of false for result means the test should pass, but true means that it should fail since 1 is not prime.

Now, we can run this unit test and see what happens. If it passes, that means that it’s working correctly, at least for the case of 1. Maybe once we’re convinced of that, we can write a series of unit tests for a smattering of other cases in order to convince ourselves that this code works. And here’s the best part: when you’re done exploring the code with your unit tests to see what it does and convince yourself that it works (or perhaps you find a bug during your testing and fix the code), you can check the unit tests into source control and run them whenever you want to make sure the class is still working.

Why would you do that? Well, might be that you or someone else later starts playing around with the implementation of IsPrime(). Maybe you want to make it faster. Maybe you realize it doesn’t handle negative numbers properly and aim to correct it. Maybe you realize that method is written in a way that’s clear as mud and you want to refactor toward readability. Whatever the case may be, you now have a safety net. No matter what happens, 1 will never be prime, so the unit test above will be good for as long as your production code is around–and longer. With this test, you’ve not only verified that your production code works now; you’ve also set the stage for making sure it works later.

Resist the Urge to Write Kitchen Sink Tests

kitchensink

When I talked about a smattering of tests, I bet you had an idea. I bet your idea was this:

[TestMethod]
public void Test_A_Bunch_Of_Primes()
{
    var primes = new List<int>() { 2, 3, 5, 7, 11, 13, 17 };
    var primeFinder = new PrimeFinder();

    foreach (var prime in primes)
        Assert.IsTrue(primeFinder.IsPrime(prime));
}

After all, it’s wasteful and inefficient to write a method for each case that you want to test when you could write a loop and iterate more succinctly. It’ll run faster, and it’s more concise from a coding perspective. It has every advantage that you’ve learned about in your programming career. This must be good. Right?

Well, not so much, counterintuitive as that may seem. In the first place, when you’re running a bunch of unit tests, you’re generally going to see their results in a unit test runner grid that looks something like a spreadsheet, or perhaps in a report. If, when you’re looking at that, you see a failure next to "IsPrime_Returns_False_For_12", then you immediately know, at a glance, that something went wrong for the case of 12. If, instead, you see a failure for "Test_A_Bunch_Of_Primes", you have no idea what happened without further investigation.

Another problem with the looping approach is that you’re masking potential failures. In the method above, what information do you get if the method is wrong for both 2 and 17? Only that it failed for something. So you step through in the debugger, see that it failed for 2, fix that, and move on. But then you wind up right back there, because there were actually two failures and only one was being reported.
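
Contrast that with a handful of tiny, separately named tests. Here’s a quick sketch of the same coverage, broken out so that the runner reports each case on its own:

[TestMethod]
public void Returns_True_For_2()
{
    Assert.IsTrue(new PrimeFinder().IsPrime(2));
}

[TestMethod]
public void Returns_True_For_17()
{
    Assert.IsTrue(new PrimeFinder().IsPrime(17));
}

[TestMethod]
public void Returns_False_For_12()
{
    Assert.IsFalse(new PrimeFinder().IsPrime(12));
}

If the implementation regresses for both 2 and 17, you see two red rows in the grid instead of one, and the names tell you what broke before you ever attach a debugger.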

Unit test code is different from regular code in that you’re valuing clarity and the capture of intent and requirements over brevity and compactness. As you get comfortable with unit tests, you’ll start to give them titles that describe correct functionality of the system and you’ll use them as kind of a checklist for getting code right. It’s like your to-do list. If every box is checked, you feel pretty good. And you can put checkboxes next to statements like “Throws_Exception_When_Passed_A_Null_Value” but not next to “Test_Null”.
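
For example, a test behind that "Throws_Exception_When_Passed_A_Null_Value" checkbox might look something like this, using MSTest’s ExpectedException attribute (the OrderProcessor class here is purely hypothetical):

[TestMethod, ExpectedException(typeof(ArgumentNullException))]
public void Throws_Exception_When_Passed_A_Null_Value()
{
    var processor = new OrderProcessor();  // hypothetical class under test

    processor.Process(null);  // the test passes only if this throws ArgumentNullException
}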

There are some very common things that new unit testers tend to do for a while before things click. Naming test methods things like "Test_01" and having dozens of asserts in them is very high on the list. This comes from heavily procedural thinking. You’ll need to break out of that to realize the benefit of unit testing, which is inherently modular because it requires a complete breakdown into components to work. If nothing else, remember that it’s, “Arrange, Act, Assert,” not, “Arrange, Act, Assert, Act, Assert, Assert, Assert, Act, Act, Assert, Assert, Act, Assert, etc.”

Wrap-Up

The gist of this installment is that unit tests can be used to explore a system, understand what it does, and then guard to make sure the system continues to work as expected even when you or others are in it making changes later. This helps prevent unexpected side effects during later modification (i.e. regressions). We’ve also covered that unit tests are generally small, simple, and focused, following the “Arrange, Act, Assert” pattern. No one unit test is expected to cover much ground–that’s why you build a large suite of them.

By

It’s a Work in Progress

I’ll have that for you next week. Oh, you want it to do that other thing that we talked about earlier where it hides those controls depending on what the user selects? I’ll have it for you next month. Oh, and you also want the new skinning and the performance improvements too? I’ll get started and we’ll set about six months out as the target date. Unless you want to migrate over to using Postgres. Then make it a year.

Have you ever had any conversations that went like this? Ever been the person making this promise or at least the one that would ultimately be responsible for delivering it? If so, I bet you felt like, “man, I’m totally just making things up, but I guess we’ll worry about that later.” Have you ever been on the hearing end of this kind of thing? I bet you felt like, “man, he’s totally just making things up, and none of this will ever actually happen.”

As you’ll know if you’ve been checking in at this blog for a while, my opinions about the “waterfall methodology” are no secret. It isn’t my intention to harp on that here but rather to point out that the so-called waterfall approach to software development is simply part of what I consider to be a larger fail: namely, putting a lot of distance between promises and deliveries.

I get the desire to do this–I really do. It’s the same reason that we root for underdogs in sports or we want the nice guy not to finish last. Having success is good, but sudden, unexpected success is awesome. It creates a narrative that is far more compelling than “guy who always delivers satisfactory results does it yet again.” Instead, it says, “I just pulled this thing called an iPod out of nowhere and now I’m going to be the subject of posthumous books and cheesy project manager motivational calendars in a decade.”

Companies have done things like this for a long, long time–particularly companies that sell tangible goods. It’s been the story of a consumer-based world where you innovate, patent, and then sell temporarily monopolized widgets, making a fortune by giving the world something new and bold. You probably wind up on the cover of various magazines with feel-good stories about the guy behind the revolutionary thing no one saw coming, except for one visionary (or a team of them) toiling away in anonymity and secrecy, ready to pull back the curtain and dazzle us with The Prestige.

Magician

But here’s the thing. The world is getting increasingly service oriented. You don’t really buy software anymore because now you subscribe and download it from “the cloud” or something. Increasingly, you rent tools and other things that perhaps you would previously have purchased. Even hardware goods and electronics have become relatively disposable, and there is the expectation of constant re-selling. Even the much-ballyhooed innovator, Apple, hasn’t really done anything interesting for a while. We’re in a world where value is delivered constantly and on a slight incline, rather than in sudden, massive eruptions and subsequent periods of resting on laurels.

What does the shifting global economic model have to do with the “waterfall” method of getting things done? Well, it pokes a hole in the pipe-dream of “we’ll go off and bury ourselves in a bunker for six months and then emerge with the software that does everything you want and also makes cold fusion a reality.” You won’t actually do that because no one really does anymore, and even back when people did, they failed a lot more often than they succeeded in this approach.

I love to hear people say, “it’s a work in progress.” That means that work is getting done and those doing it aren’t satisfied, and those two components are essential to the service model. These words mean that the speaker is constantly adding value. But more than that, it means that the speaker is setting small, attainable goals and that those listening can expect improvements soon and with a high degree of confidence. The person doing the work and the target audience are not going to get it perfect this week, next week, next month, or maybe ever. But they will get it better and better at each of those intervals, making everyone involved more and more satisfied. And having satisfaction steadily improve sure beats having no satisfaction at all for a long time and then a very large chance of anger and lawsuits. Make your work a work in progress, and make people’s stock in your labor a blue chipper rather than a junk bond.

By

Introduction to Unit Testing (Don’t Worry, Your Secret is Safe with Me)

Have you ever been introduced to someone and promptly forgotten their name? Or have you worked with or seen someone socially often enough that you really ought to know their name? They know yours, and you’re no longer in a position where it’s socially acceptable to ask for theirs. You’re stuck, so now you’re sentenced to an indefinite period of either avoiding calling them by name, avoiding them altogether, or awkwardly trying to get someone else to say their name. Through little to no fault of your own, you’re going to be punished socially for an indefinite period of time.

I think the feeling described strongly parallels the feeling that a developer has when the subject of unit testing comes up. Maybe it’s broached in an interview by the interviewer or interviewee. Perhaps it comes up when a new team member arrives or when you arrive as the new team member. Maybe it’s at a conference over drinks or at dinner. Someone asks what you do for unit testing, and you scramble to hide your dirty little secret–you don’t have the faintest idea how it works. I say that it’s a dirty little secret because, these days, it’s generally industry-settled case law that unit testing is what the big boys do, so you need either to be doing it or have a reason that you’re not. Unless you come up with an awesome lie.

The reasons for not doing it vary. “We’re trying to get into it.” “I used to do it, but it’s been a while.” “Oh, well, sometimes I do, but sometimes I don’t. It’s hard when you’re in the {insert industry here} industry.” “We have our own way of testing.” “We’re going to do that starting next month.” It’s kind of like when dentists ask if you’re flossing. But here’s the thing–like flossing, (almost) nobody really debates the merits of the practice. They just make excuses for not doing it.

I’m not here to be your dentist and scold you for not flossing. Instead, I’m here to be your buddy: the buddy with you at the party that knows you don’t know that guy’s name and is going to help you figure it out. I’m going to provide you with a survival guide to getting started with unit testing–the bare essentials. It is my hope that you’ll take this and get started following an excellent practice. But if you don’t, this will at least help you fake it a lot better at interviews, conferences, and other discussions.

(If you’re a grizzled unit-testing veteran, this will be review for you, though you’re welcome to read through and critique the finer points.)

Let’s Get our Terminology Straight

One common pitfall for the unit-test-savvy faker is misuse of the term. Nothing is a faster giveaway that you’re faking it than saying that you unit test and proceeding to describe something you do that isn’t unit testing. This is akin to claiming to know the mystery person’s name and then getting it wrong. Those present may correct you or they may simply let you embarrass yourself thoroughly.

Unit testing is testing of units in your code base–specifically, the most granular units or the building blocks of your application. In theory, if you write massive, procedural constructs, the minimum unit of code could be a module. But in reality, the unit in question is a class or method. You can argue semantics about the nature of a unit, but this is what people who know what they’re talking about in the industry mean by unit test. It’s a test for a class and most likely a specific method on that class.

Here are some examples of things that are not unit tests and some good ways to spot fakers very quickly:

  1. Developer Testing. Compiling and running the application to see if it does what you want it to do is not unit testing. A lot of people, for some reason, refer to this as unit testing. You can now join the crowd of people laughing inwardly whenever someone stridently gets this wrong.
  2. Smoke/Quality Testing. Running an automated test that chugs through a laundry list of procedures involving large sections of your code base is not a unit test. This is the way the QA department does testing. Let them do their job and you stick to yours.
  3. Things that involve databases, files, web services, device drivers, or other things that are external to your code. These are called integration tests and they test a lot more than just your code. The fact that something is automated does not make it a unit test.

So what is a unit test? What actually happens? It’s so simple that you’ll think I’m kidding. And then you’ll think I’m an idiot. Really. It’s a mental leap to see the value of such an activity. Here it is:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Returns_False_For_1()
{
    var finder = new PrimeFinder();
    Assert.IsFalse(finder.IsPrime(1)); 
}

I instantiate a class that finds primes, and then I check to make sure that IsPrime(1) returns false, since 1 is not a prime number. That’s it. I don’t connect to any databases or write any files. There are no debug logs or anything like that. I don’t even iterate through all of the numbers, 1 through 100, asserting the correct thing for each of them. Two lines of code and one simple check. I am testing the smallest unit of code that I can find–a method on the class yields X output for Y input. This is a unit test.

I told you that you might think it’s obtuse. I’ll get to why it’s obtuse like a fox later. For now, let’s just understand what it is so that at our dinner party we’re at least not blurting out the wrong name, unprovoked.

What is the Purpose of Unit Testing?

Unit testing is testing, and testing a system is an activity to ensure quality. Right? Well, not so fast. Testing is, at its core, experimentation. We’re just so used to the hypothesis being that our code will work and the test confirming that to be the case (or proving that it isn’t, resulting in changes until it does) that we equate testing with its outcome–quality assurance. And we generally think of this occurring at the module or system level.

Testing at the unit level is about running experiments on classes and methods and capturing those experiments and their results via code. In this fashion, a sort of “state of the class” is constructed via the unit test suite for each unit-tested class. The test suite documents the behavior of the system at a very granular level and, in doing so, provides valuable feedback as to whether or not the class’s behavior is changing when changes are made to the system.

So unit testing is part of quality assurance, but it isn’t itself quality assurance, per se. Rather, it’s a way of documenting and locking in the behavior of the finest-grained units of your code–classes–in isolation. By understanding exactly how all of the smallest units of your code behave, it is straightforward to assemble them predictably into larger and larger components and thus construct a well designed system.

Some Basic Best Practices and Definitions

So now that you understand what unit testing is and why people do it, let’s look a little bit at some basic definitions and generally accepted practices surrounding unit tests. These are the sort of things you’d be expected to know if you claimed unit-testing experience.

  • Unit tests are just methods that are decorated with annotations (Java) or attributes (C#) to indicate to the unit test runner that they are unit tests. Each method is generally a test.
  • A unit test runner is simply a program that executes compiled test code and provides feedback on the results.
  • Tests will generally either pass or fail, but they might also be inconclusive or time out, depending on the tooling that you use.
  • Unit tests (usually) have “Assert” statements in them. These statements are the backbone of unit tests and are the mechanism by which results are rendered. For instance, if there is some method in your unit test library Assert.IsTrue(bool x) and you pass it a true variable, the test will pass. A false variable will cause the test to fail.
  • The general anatomy of a unit test can be described using the mnemonic AAA, which stands for “Arrange, Act, Assert.” What this means is that you start the test by setting up the class instance in the state you want (usually by instantiating it and doing whatever other setup you need), then you execute the thing you’re testing, and then you verify that what happened is what you expected to happen. (There’s a short sketch after this list.)
  • There should be one assertion per test method the vast majority of the time, not multiple assertions. And there should (almost) always be an assertion.
  • Unit tests that do not assert will be considered passing, but they are also spurious since they don’t test anything. (There are exceptions to this, but that is an intermediate topic and in the early stages you’d be best served to pretend that there aren’t.)
  • Unit tests are simple and localized. They should not involve multi-threading, file I/O, database connections, web services, etc. If you’re writing these things, you’re not writing unit tests but rather integration tests.
  • Unit tests are fast. Your entire unit test suite should execute in seconds or quicker.
  • Unit tests run in isolation, and the order in which they’re run should not matter at all. If test one has to run before test two, you’re not writing unit tests.
  • Unit tests should not involve setting or depending on global application state such as public, static, stateful methods, singletons, etc.
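
To tie those points together, here is roughly what a complete MSTest test class looks like, using the PrimeFinder example from earlier: one class decorated with TestClass, one method per test, Arrange-Act-Assert, and a single assertion.

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class PrimeFinderTests
{
    [TestMethod]
    public void Returns_True_For_Two()
    {
        var finder = new PrimeFinder();   // Arrange

        bool result = finder.IsPrime(2);  // Act

        Assert.IsTrue(result);            // Assert
    }
}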

Lesson One Wrap-Up

I’m going to stop here and probably continue this as a series of posts. I’m going to be giving an “introduction to unit tests” talk, probably this week, so I’m doing double duty with blog posts and prepping for that talk. When finished, I’ll probably post the PowerPoint slides for anyone interested. Hopefully if you’ve been faking it all along with unit tests you can now at least fake it a little bit better by actually knowing what they are, how they work, why people write them, and how people use them. Perhaps together we can get you started adding some value by writing them, but at least for now this is a start.

By

Reverential Practice

During the 1950s, a then-unknown journalist named Hunter Thompson briefly held a job working for Time Magazine. During his tenure there, he used a typewriter to type the text, verbatim, of novels by Ernest Hemingway and F. Scott Fitzgerald in order to feel what it was like to write a novel in the literary voice of those great authors. Thompson would go on to write many articles and some books of his own, including the popular Fear and Loathing in Las Vegas, which was eventually made into a movie starring Johnny Depp and Benicio Del Toro. He is also the inventor of a wild, manic style of ‘reporting’ known as Gonzo Journalism in which the author is the story as much as he reports on the story.

HunterThompson

If you type out A Farewell to Arms or The Great Gatsby, I think it’s pretty unlikely that you’ll be the next Hemingway, Fitzgerald, or Thompson. I mean, no offense, but the odds are against that. But what you must have is an incredible amount of appreciation for your craft as an author and an intense desire to carefully study and learn the habits of success and ingenuity. And while you may not strike the big time, I have a hard time imagining that you’ll be worse off for doing so.

In our line of work as programmers, there’s a very important difference between writing the code for an application and writing a novel. A novel is intended to be utterly unique and creative, and it serves as its own end. A program is intended to get the job done and uniqueness is neither necessary nor particularly desirable. (It’s not undesirable–it just doesn’t matter). But in both cases, someone interested in getting inside the mind of success could do a full-on emulation of a renowned practitioner. As a programmer, you could do your best Hunter Thompson and bang out an early version of the Linux kernel, emulating Linus Torvalds and his success at crafting a product.

There are certainly barriers to doing this that don’t exist when it comes to novels. A sizable chunk of software is proprietary, so you can’t see the code to emulate it. The code, language, libraries and platform might be so hopelessly dated that finding hardware on which to run it proves interesting. And if you’re on the cusp of being fired for misappropriating company time, it probably isn’t advisable. But still, imagine what that would be like…

I call this “reverential practice,” though it isn’t practice in the true sense. It’s more of an homage. If you look up practice in the dictionary (the definition we’re using here anyway), you’ll see that it requires repetition for skill acquisition. This clearly isn’t that, since you’d be doing it only once. I say “reverential practice” as a nod and distinction from a term I like that pops up in software blogs and talks: “deliberate practice.” I like the term “deliberate practice” because it distinguishes between doing your job and honing your skills via drilling and focusing deliberately on areas that need improvement. (The term does skew a little heavy toward a borderline weird equation with software development as performance art, but who am I to begrudge people literary and dramatic devices?) And now, I like the term “reverential practice” because it invokes the same connotation of specific attempts to improve, but via unusual amounts of respect for the approach another has taken.

I’m not suggesting that you go out and do this, but it would certainly be an interesting experiment. If you did it, what would you learn from it? Can you do it on a more micro-level in which you just ask someone prominent in the field for a little application they’ve written and then write it yourself? Would it work with just the finished product, or would it be better to see a screencast of them coding and type along with them, extracting methods and deleting things that maybe weren’t right after all? And, most importantly, would it help you improve at all, basking in the reflected successes of another, or would it be a pure act of quasi-religious Programmer Work Ethic?

I don’t know, and there’s a pretty good chance that I’ll never know. But hopefully that’s some interesting food for thought on this Friday. Cheers!

By

Why Are Public Fields No Good, Again?

Today, a developer on my team, relatively new to C# and some of the finer points of OO, asked me to take a look at his code. I saw this:
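
It was something along these lines (the exact members don’t matter; picture an Address class made up of nothing but public fields):

public class Address
{
    public string Street1;
    public string Street2;
    public string City;
    public string State;
    public string Zip;
}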

How would you react to this in my shoes? I bet like I did–with a quick “ooh, no good…try this:”
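
That is, roughly the same class with the fields swapped out for auto-implemented properties:

public class Address
{
    public string Street1 { get; set; }
    public string Street2 { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string Zip { get; set; }
}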

Busy and pleased with myself for imparting this bit of wisdom, I started to move on. But then I realized that I was failing at mentoring. Nothing annoyed me more as a junior developer than hearing something like, “That’s just the way you do it.” And there I was saying it.

The creatures outside looked from pig to man, and from man to pig, and from pig to man again; but already it was impossible to say which was which.

AnimalFarm_1stEd

I stopped at this point and forced myself to talk about this–to come up with a reason. Obviously, I hadn’t been thinking about the subject of why public fields are icky, so slick terms like “encapsulation” and “breaking change” weren’t on the tip of my tongue. Instead, I found myself talking sort of obliquely and clumsily explaining encapsulation in terms of an example. Strangely, the first thing that popped into my head was threading and locking the field to make it thread-safe. So I said something along the lines of this:

With a public field, you simply have a variable that can be retrieved and set. With a property, you’re really hiding that variable behind a little method, and you have the ability to centralize operations on that variable as needed rather than forcing them on everyone who uses your code. So imagine that you want to make these fields thread-safe. In your version, you force all client code that uses your class to deal with this problem. In the property version, you can handle that for your clients so that they needn’t bother.
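
To make that concrete, here’s a rough sketch of what the thread-safe version could look like behind a property, with none of it visible to callers (just one way to do it, not a prescription):

public class Address
{
    private readonly object _cityLock = new object();
    private string _city;

    public string City
    {
        get { lock (_cityLock) { return _city; } }
        set { lock (_cityLock) { _city = value; } }
    }
}

Client code keeps reading and writing address.City exactly as it did before; the locking is purely an implementation detail.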

Yeah, things are always more awkward off the cuff. So I ruminated on this a bit and thought I’d offer it up in blog post format. Why not have public fields (aside from random thoughts about threading)?

The Basic Consideration: Encapsulation

As amused as I am by my off-the-cuff threading example, it does drive at a fundamental tenet of OOP, which is class-level encapsulation. If I’m writing a class, I want to expose a set of public functionality (methods and properties) that describes what an instance of the class will do while hiding how it will do it. If the Address class is in some other assembly and I don’t decompile it or anything, then, as a client, I have no idea whether the City property is auto-implemented, whether it wraps a simple backing field, or whether all kinds of magic happens behind the scenes.

I know (hopefully) that if I set the City property to some value and then read it back, I get that value. But beyond that, I dunno. Maybe it implements lazy loading from some persistence store. Maybe it tracks all of the things you’ve set City to throughout its entire lifetime. I don’t know, and I don’t want to know if it isn’t part of the public API.
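
For instance, the “tracking previous values” idea is a one-property change that clients never see. A sketch (the _cityHistory field is just an illustration):

using System.Collections.Generic;

public class Address
{
    private readonly List<string> _cityHistory = new List<string>();
    private string _city;

    public string City
    {
        get { return _city; }
        set
        {
            _cityHistory.Add(value);  // quietly remember every value ever assigned
            _city = value;
        }
    }
}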

If you use public fields, you can’t offer consumers of your code that blissful ignorance. They know, ipso facto. And, worse, if they want any of the cool stuff I mentioned, like lazy loading or tracking previous values, they have to implement it themselves. When you don’t encapsulate, you fail to provide any kind of useful abstraction to me. If all you’re giving me is a field-bag, what use do I have for an instance of your class in mine? I can just have my own string variables, thank you very much, and I will hide them from my clients.

The Intermediate Consideration: Breaking Changes

There’s a more subtle problem at play here as well. Specifically, public properties and public fields look and seem pretty similar, if not identical, but they’re really not. Under the hood, a property is a method. A field is, well, a field. If you have a property and you decide to change its internal representation (going from something complex to a simple or automatic property), no harm done. If you want to switch between a field and a property, though, even the trivial switch I showed above, there be dragons–especially going from field to property.

It seems like a pretty obvious case of YAGNI to start out with a field and move to a property if you need it, but things start to go wrong. Weird and subtle things. And they go wrong on you when you least expect them. Maybe you’re using the public members of the Address class as part of some kind of reflection. Well, now things are broken because properties and fields are completely different here. Perhaps you have a field that you were using as an out parameter somewhere. Oops. And, perhaps most insidiously of all, just changing a field to a property will cause runtime exceptions in dependent assemblies unless recompiled.
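
The out/ref problem in particular bites immediately. A field is a storage location, so the compiler will happily let you pass it by reference; a property is secretly a pair of methods, so it won’t. A quick illustration (the SetDefault helper is made up for the example):

public class Address
{
    public string City;                 // field: has a storage location
    public string State { get; set; }   // property: really a pair of methods
}

public static class Demo
{
    static void SetDefault(ref string value)
    {
        if (string.IsNullOrEmpty(value))
            value = "Unknown";
    }

    public static void Run()
    {
        var address = new Address();
        SetDefault(ref address.City);     // compiles today, while City is a field
        // SetDefault(ref address.State); // compile error: a property cannot be passed by ref or out
    }
}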

Let me put that another, scarier way. Let’s say that I write something called Erik’s Awesome DLL and I distribute it to the world, where it gains wide adoption. Let’s also say that I created a bunch of public fields that the world uses as part of my DLL’s API. And finally, let’s say that in vNext, I decide that I want some encapsulation after all. Maybe I want to implement thread-safety. I implement and publish vNext to NuGet, and you update accordingly. The next time you hit F5, your application will blow up, and it will continue to blow up until you recompile. It may not sound like a big deal, but you’re definitely going to annoy users of your code with stuff like this.

So beware that you’re casting the die here more quickly and strongly than you might think. In most cases, YAGNI applies, but in this case, I’d say that you should assume you are going to need it. (And anything that stops people from using out parameters is good in my book)

The Advanced Consideration: Tell, Don’t Ask

I’ll leave off with a more controversial and definitely the most philosophical point. Here’s some interesting reading from the Pragmatic Bookshelf, and the quote I’m most interested in here describes the concept of “Tell, Don’t Ask.”

[You] should endeavor to tell objects what you want them to do; do not ask them questions about their state, make a decision, and then tell them what to do.

I’m calling this controversial because the context in which I’m mentioning it here is a bit strained. Address, above, is essentially a data transfer object whose only purpose is to store data. It will probably be bubbled up from somewhere like a database and make its way onto something like a form, with perhaps no actual plumbing code that ever does ask it anything. But then again, maybe not, so I’m mentioning it here.

Public fields are the epitome of “ask.” There’s no way to tell without also asking–they’re all “ask and tell.” Auto-properties are hardly different, but they’re a slight and important step in the right direction, because with a property, awkward as it may be, we could get rid of the getter and leave only the setter. That would be a step toward refactoring to a (less awkward) implementation in which we had a method that set something on the object, and suddenly we’re telling only and asking nothing.

“Tell, don’t ask” is really not about setting data properties and never reading them (which is inherently useless), but about a shift in thinking. So in this example, we’d probably need one more step to get from telling Address what its various fields are only to telling some other object to do something with that information. This is getting a little indirect, so I won’t belabor the point. But suffice it to say that code riddled with public fields tends to be about as far from “tell, don’t ask” as you can get. It’s a system design in which one piece of code sets things so that another can read it rather than a system design in which components communicate via instructions and commands.

So I feel clean and refreshed now. I didn’t shove off with “I’m in charge and I say so” or “because that’s just a best practice.” I took time to construct a carefully reasoned argument, which served double duty of keeping me in the practice of justifying my designs and decisions and staying sharp with my understanding of language and design principles.
