Stories about Software


Resumes and Alphabet Soup

I was reading a blog post by Scott Hanselman this morning. The main premise was a bemused lampooning of the ubiquitous tendency of techies to create a resume “alphabet soup” and a suggestion that Stack Overflow Careers is introducing a better alternative.

I was going to comment on this post inline, but I discovered that I had enough to say to create my own post about it. It seems as though this state of affairs for techies is, ironically, the product of living in a world created by techies. What I mean is that over the last several decades, we have made it our life’s goal to automate processes and improve efficiency. Those chickens have come home to roost in the form of automated resume searching, processing, and recommending.

Alphabet Soup Belongs in a Bowl

Having an item at the bottom of a resume that reads “XML, HTML, JSON, PHP, XSLT, HTTP, WPF, WCF, CSS… etc.” is actually quite efficient toward its purpose, in the same way that sticking a bunch of keywords in a web page’s hidden meta-data is efficient. It violates the spirit of the activity by virtue of being so good at gaming it as to be “too good.” As a job seeker, if I want to create the largest opportunity pool, it would stand to reason that I should include every three- and four-character combination of letters in existence somewhere in the text of my resume (“Strongly proficient in AAA, AAB, AAC, …ZZZZ”). And while most of these “technologies” don’t exist, and I probably haven’t used most of the ones that do, this casts a wider net for my search than omitting the alphabet soup. I can always sort out the details later, once my foot is in the door. Or so the thinking probably goes (I’m not actually endorsing resume exaggeration or spamming the resume machine in any way; I’m just using the first person to illustrate a point).

We, the techies of the world, have automated the process of matching employers with employees, and now we are forced to live in a world where people attempt to game the system to get an edge. So, what’s the answer? A declaration that companies should stop hiring this way? Maybe, but that seems unlikely. A declaration that people should stop creating their resumes this way because it’s silly? That seems even more unlikely, because the ‘silly’ thing is not doing something that works, and the alphabet soup spam works.

I think that this programmer-created problem ought to be solved with better programming. What we’ve got here is a simple text matching algorithm. As a hiring authority, I program my engine (or have the headhunters and sites program it for me) to do a procedure like “give me all the resumes that contain XML, HTML, CSS, and WPF”. I then get a stack of matching resumes from people whose experience in those technologies may range from “expert” to “I think I read about that once on some website” and it’s up to me to figure out which resume belongs to which experience level, generally via one or more interviews.
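The original post contains no code here, but the dumb text-matching procedure is easy to sketch. The following is purely a hypothetical illustration (the function name and the sample data are mine, not from any real resume engine); it shows why keyword stuffing beats honest resumes under naive matching:

```python
def matching_resumes(resumes, required_terms):
    """Naive screening: a resume 'matches' if every required term appears
    anywhere in its text. This is exactly the algorithm that keyword
    stuffing games so effectively."""
    return [text for text in resumes
            if all(term.lower() in text.lower() for term in required_terms)]
```

A resume that demonstrates deep WPF experience but never happens to type the letters “CSS” loses to one that simply lists every acronym in existence.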

So, maybe the solution here is to create a search algorithm that does better work. If I were gathering requirements for such a thing, I would consider that I have two stakeholders: company and employee. These two stakeholders share a common goal, “finding a good match”, and also have some mildly conflicting goals — company wants lots of applicants and few other companies and employee wants lots of companies and few other prospective employees. It is also worth considering that the stakeholders may attempt to deceive one another in pursuit of the goal (resume padding on one end or misrepresenting the position on the other end).

With that (oversimplified) goal enumeration in mind, I see the following design goals:

  1. Accuracy for employers. The search engine returns candidates that are better than average at meeting needs.
  2. Accuracy for employees. The engine does not pair them with employers looking for something different from what they offer, thus putting them in position to fail interviews and waste time.
  3. Easy to use, narrative inputs for employees. You type up a summary of your career, interests, experience, projects, etc, and that serves as input to the algorithm – you are not reduced to a series of acronyms.
  4. Easy to use, narrative inputs for employers. You type up a list of the job’s duties at present, anticipated duties in the future, career development path, success criteria, etc and that serves as input to the algorithm.
  5. Opacity/Double blind. Neither side understands transparently the results from its inputs. Putting the text “XML” on a resume has an indeterminate effect on the likelihood of getting a job with an employer that wants employees with XML knowledge. This mitigates ‘gaming’ of the system (similar in concept to how search engines guard their algorithms).

Now, in between the narrative input of both sides, the magic happens and pairings are made. That is where we as the techies come in. This is an ambitious project, and not an easy one, but I think it can and should happen. Prospective employees tell a story about their career, prospective employers tell a story about a hypothetical employee, and matches are created. Nowhere in this does a dumb straight-match of acronym to acronym occur, though the system would take into account needs and skills (e.g., a position primarily developing in C++ would yield candidates primarily with a good C++ background).

Anyway, that’s just what occurred to me in considering the subject, and it’s clearly far too long for a blog comment. I’m spitballing this here, so comments and opinions are welcome. Also, if this already exists somewhere, please point me at it, as I’d be curious to see it.




Builder

Quick Information/Overview

Pattern Type: Creational
Applicable Language/Framework: Agnostic OOP
Pattern Source: Gang of Four
Difficulty: Relatively easy

Up Front Definitions

Here are terms that I’ll be using in this explanation and what they mean:

  1. Composite: This is another design pattern, but I’m using it here to mean an object that is composed of other objects.

The Problem

For this design pattern, I’ll use the example of modeling computers. A computer is a classic example of a composite object. That is, a computer object isn’t just a collection of simple literals, but is actually composed of other objects in your domain (most likely anyway, and it will be in the example).

So, let’s say that you get an assignment to model computer composition and pricing, perhaps for a website for a store that builds custom rigs. For a proof of concept, let’s say that you very much oversimplify things and decide to create an implementation where a computer has memory, a CPU, and a motherboard.

Now, since we’re modeling price rather than actual operation, it’s important to keep in mind that the abstraction for actual computer construction and sales is limited to “what will work with what” rather than modeling the actual interaction. So, if CPU X doesn’t work with motherboard Y, you don’t want to sell that combination, but worrying about the details of how CPU X works with motherboard Y is beyond the scope of this application. Another thing to consider here is that all components have a price, and the assembled machine will have a price that equals the sum of its components, perhaps with a cost of assembly added as well.

Now, let’s also assume that there is a dependency relationship in ordering how machines are built. That is, you have an external requirement of some sort that the motherboard must be chosen first, followed by the CPU, and then by the memory. So, you create objects for your components and a computer object, and it looks something like the following (remember, this is an example, and an example of a POC at that, so let’s not get too immersed in the details):
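The code snippets from the original post aren’t reproduced in this copy, so here is a minimal sketch of what the component classes might have looked like. It’s Python rather than the original C#, and the class names and the socket-based compatibility check are assumptions for illustration:

```python
class Motherboard:
    """A component with a price and a CPU socket type."""
    def __init__(self, name, price, socket):
        self.name = name
        self.price = price
        self.socket = socket


class Cpu:
    """A component that only works with boards sharing its socket."""
    def __init__(self, name, price, socket):
        self.name = name
        self.price = price
        self.socket = socket

    def fits(self, motherboard):
        # "What will work with what" is the entire compatibility model here.
        return self.socket == motherboard.socket


class MemoryModule:
    """A priced component; compatible with anything in this POC."""
    def __init__(self, name, price):
        self.name = name
        self.price = price
```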

And, finally, the computer:
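The Computer class itself also isn’t visible in this copy of the post. A plausible sketch, with the component classes repeated in slimmed-down form so the example runs on its own (the flat assembly fee is an invented detail):

```python
class Motherboard:
    def __init__(self, price, socket):
        self.price, self.socket = price, socket


class Cpu:
    def __init__(self, price, socket):
        self.price, self.socket = price, socket


class MemoryModule:
    def __init__(self, price):
        self.price = price


class Computer:
    """Composite whose parts must be added in order: motherboard, then CPU,
    then memory. Each add returns False if a precondition is violated."""
    ASSEMBLY_FEE = 50  # invented detail: a flat cost of assembly

    def __init__(self):
        self._board = None
        self._cpu = None
        self._memory = []

    def add_motherboard(self, board):
        self._board = board
        return True

    def add_cpu(self, cpu):
        # Board first, and sockets must match (permissive, but it's a POC).
        if self._board is None or cpu.socket != self._board.socket:
            return False
        self._cpu = cpu
        return True

    def add_memory(self, module):
        if self._cpu is None:
            return False
        self._memory.append(module)
        return True

    @property
    def price(self):
        parts = [self._board, self._cpu, *self._memory]
        return sum(p.price for p in parts if p is not None) + self.ASSEMBLY_FEE
```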

And, life is good. You have a computer class that can have components added to it in the proper order, with each add returning false if you’ve made an oops. The compatibility standards are a little permissive at the moment, but, remember, POC.

So, the next thing that you’d do is actually set about building these computers, presumably based on user input. The user would first choose a MOBO, and you’d add it to an empty computer object. Then, the user would choose a CPU, and you’d flag an error message if trying to add that didn’t work, or add it if it did, and life would still be good.

But, let’s say that you want to add some functionality to let the user select some pre-assembled machines and then, perhaps customize them. So, you get the first spec in and, no problem, you add it, perhaps to the presentation layer class for backing the view of the computer building screen (remember, POC 🙂 ):
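The method in question isn’t shown in this copy either, but it would have looked something like the sketch below. The Part and Computer classes are compressed stand-ins for the earlier sketch, and the presenter class name and part prices are invented:

```python
class Part:
    """Compressed stand-in for the component classes."""
    def __init__(self, price):
        self.price = price


class Computer:
    """Compressed stand-in: parts must arrive in board, CPU, memory order."""
    def __init__(self):
        self.parts = []

    def add_motherboard(self, part):
        self.parts.append(part)
        return True

    def add_cpu(self, part):
        if len(self.parts) < 1:  # board must already be there
            return False
        self.parts.append(part)
        return True

    def add_memory(self, part):
        if len(self.parts) < 2:  # board and CPU must already be there
            return False
        self.parts.append(part)
        return True

    @property
    def price(self):
        return sum(p.price for p in self.parts)


class ComputerBuildingViewModel:
    """Backs the computer-building screen and offers a canned configuration."""
    def build_budget_computer(self):
        computer = Computer()
        computer.add_motherboard(Part(60))
        computer.add_cpu(Part(80))
        computer.add_memory(Part(25))
        return computer
```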

There you go – you’ve got a method in the class that builds a budget computer, pre-assembled for selection, allowing users to build their own or take your canned construction. Then, a requirement for another pre-configured one comes in, and you dutifully add it:
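The second, dutifully copy/pasted method would differ only in the parts it news up. Using the same compressed stand-ins as before (names and prices invented), the duplication is plain to see:

```python
class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    """Compressed stand-in: parts must arrive in board, CPU, memory order."""
    def __init__(self):
        self.parts = []

    def add_motherboard(self, part):
        self.parts.append(part)
        return True

    def add_cpu(self, part):
        if len(self.parts) < 1:
            return False
        self.parts.append(part)
        return True

    def add_memory(self, part):
        if len(self.parts) < 2:
            return False
        self.parts.append(part)
        return True

    @property
    def price(self):
        return sum(p.price for p in self.parts)


class ComputerBuildingViewModel:
    def build_budget_computer(self):
        computer = Computer()
        computer.add_motherboard(Part(60))
        computer.add_cpu(Part(80))
        computer.add_memory(Part(25))
        return computer

    def build_gamer_computer(self):
        # Copy/paste of the method above with different parts plugged in --
        # the ordering logic is now duplicated.
        computer = Computer()
        computer.add_motherboard(Part(200))
        computer.add_cpu(Part(350))
        computer.add_memory(Part(60))
        computer.add_memory(Part(60))
        return computer
```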

The two methods look pretty similar, but you’re doing a POC, so you choose not to sweat the small stuff, like “Don’t Repeat Yourself.” Now, each time you want to add a new preconfigured computer, you have a “design pattern.” All you have to do is copy one of those methods and put different objects in the Add() calls.

Now, let’s fast forward a few months. After explaining that this was just a POC to management, management decides to ship the POC since it’s working so well. In the interim, you’ve gotten requirements to add hard drives, USB hubs, monitors, keyboards, mice, etc. You also now have to offer hundreds of pre-built computers to the users, along with the coolness of building their own.

Now, along comes an interesting new requirement. Instead of picking the MOBO first, the users now have to pick the CPU first. So, you switch the method preconditions for the AddCpu() and AddMotherboard() methods in Computer, run your unit tests/application, and discover that nothing works. D’oh! You’re still building them in the same order. You just need to switch the order of the client calls to AddMotherboard() and AddCpu(), and…. oh. You need to go through hundreds of methods doing this. Ouch…

So, What to Do?

At this point, “what to do” requires an examination of “what went wrong?” Everything here was a little slapdash, but relatively reasonably constructed. There was no obvious step where things got out of hand, and yet you suddenly have a mess. As it turns out, you have a mess that was completely avoidable.

Think back to the step where we copy/pasted some code and how that didn’t feel right. Well, it turns out it wasn’t right (hint: it’s pretty much never right. Don’t. Repeat. Yourself.). That little concession early in the game led to big problems later. We duplicated the logic of ordering the construction of the computer over and over again.

What could we have done instead? Well, builder pattern to the rescue. Let’s rewind to when we just had our computer and components and our first method for a canned computer, and take another look at that method.

Looking at this a bit more abstractly, we have a method that accepts no parameters and returns a computer object. Hard-coded in there are various derived polymorphs that are specific to this particular construction. Also hard-coded in there is obeying the temporal coupling between the different add operations. So here, we’re doing two things — defining which components will be added, and defining the order in which they are added. Now, when we define our next implementation via copy/paste, we’re duplicating the ordering logic but giving different specific objects. What if, instead of coupling these two things, we introduced an abstraction:
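In sketch form, that abstraction might look like this (again, Python stand-ins for the missing C# snippet; only the idea of passing the parts in is from the post):

```python
class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    """Stand-in composite; parts must be added in board, CPU, memory order."""
    def __init__(self):
        self.parts = []

    def add_motherboard(self, part):
        self.parts.append(part)
        return True

    def add_cpu(self, part):
        if len(self.parts) < 1:
            return False
        self.parts.append(part)
        return True

    def add_memory(self, part):
        if len(self.parts) < 2:
            return False
        self.parts.append(part)
        return True

    @property
    def price(self):
        return sum(p.price for p in self.parts)


def build_computer(motherboard, cpu, memory_modules):
    """Encapsulates the temporal coupling exactly once. Callers say which
    parts they want; the ordering lives here and only here."""
    computer = Computer()
    computer.add_motherboard(motherboard)
    computer.add_cpu(cpu)
    for module in memory_modules:
        computer.add_memory(module)
    return computer
```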

Now, BuildComputer() just encapsulates the logic for navigating the temporal coupling. It’s up to somebody else which components it should be given. This is already a vast improvement and a solution to the problem that we created. Now, client code is making calls like:
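Client code along these lines, perhaps. The dictionary of named, pre-assembled machines is inferred from the discussion that follows; the names and prices are invented, and the stand-in classes are compressed as before:

```python
class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    def __init__(self):
        self.parts = []

    def add_motherboard(self, part):
        self.parts.append(part)

    def add_cpu(self, part):
        self.parts.append(part)

    def add_memory(self, part):
        self.parts.append(part)

    @property
    def price(self):
        return sum(p.price for p in self.parts)


def build_computer(motherboard, cpu, memory_modules):
    """The one place that knows the assembly order."""
    computer = Computer()
    computer.add_motherboard(motherboard)
    computer.add_cpu(cpu)
    for module in memory_modules:
        computer.add_memory(module)
    return computer


def build_preassembled_computers():
    # Each entry delegates the ordering to build_computer(); only the
    # parts vary from line to line.
    return {
        "Budget": build_computer(Part(60), Part(80), [Part(25)]),
        "Gamer": build_computer(Part(200), Part(350), [Part(60), Part(60)]),
    }
```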

Now, we’ve solved our main initial problem: the logic for the ordering of computer construction is no longer strewn all over the place. With this code, we have one method to change, BuildComputer(), rather than hundreds.

But, we still have a subtle problem. Recall that we’d talked about now having additional components like monitors and hard drives, and we have hundreds of pre-canned combinations. Think about what BuildPreassembledComputers() is going to look like. It’s going to contain hundreds of permutations of parts, named only by the key in that dictionary. Every time you want to add a new pre-assembled computer, you’re going to have to open that class and add a line to that gargantuan method. If some of the computers become obsolete, you’re going to have to remove them from that method. That is, the composition of a “budget” computer is going to change over the course of time as new parts come out and old ones become obsolete. Why should your presenter/controller/ViewModel have to change when that happens? At some point, you’re going to have to tie the particular computer to some hard-coded name, but it needn’t be there.

So, let’s extrapolate to something a little closer to the builder pattern.
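A sketch of that step: a base class that owns the assembly order, and derived builders that only pick parts. The AlmostBuilder name is taken from the post’s own discussion; everything else here is an invented Python stand-in:

```python
from abc import ABC, abstractmethod


class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    def __init__(self):
        self.parts = []

    def add(self, part):
        self.parts.append(part)

    @property
    def price(self):
        return sum(p.price for p in self.parts)


class AlmostBuilder(ABC):
    """Defines the temporal coupling exactly once, in assemble()."""

    def assemble(self):
        computer = Computer()
        computer.add(self.motherboard())  # board first
        computer.add(self.cpu())          # then CPU
        for module in self.memory():      # then memory
            computer.add(module)
        return computer

    @abstractmethod
    def motherboard(self): ...

    @abstractmethod
    def cpu(self): ...

    @abstractmethod
    def memory(self): ...


class BudgetComputerBuilder(AlmostBuilder):
    """Derived builders only decide which parts go in."""

    def motherboard(self):
        return Part(60)

    def cpu(self):
        return Part(80)

    def memory(self):
        return [Part(25)]
```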

Now, we’re onto something. The builder base class encapsulates a method where the temporal coupling is defined exactly once. The derived builders have the responsibility for defining exactly which components are part of the computer that they’ll return. And, the presentation layer class has the responsibility only for presenting things to the user. It knows what kind of computer it wants, and it delegates the construction totally – both the temporal coupling of the construction and the components used in assembly.

The derived classes have only the responsibility of determining which parts will be added to the composite, and the base class’s Assemble() method has the sole responsibility for navigating the assembly particulars. This solves the issue of modifying a method in a presentation class over and over again, with its implied violation of the Open/Closed Principle. Now, if we want to define a different pre-built configuration, we can define a new class that derives from the AlmostBuilder base class. Of course, somewhere or another we will need to add a line of creation code for that, but it won’t be here in the presentation layer, and we probably already have a creational pattern for that somewhere (like a factory).

Now, to get to the bona fide pattern, we define a “Director” class, and the different Builders expose methods like BuildMemory(), BuildCpu(), and BuildMotherboard(). The Director takes a builder polymorph as an argument and uses its different members to assemble the composite.
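In sketch form, with the builders exposing the build_* steps named above and the Director running them in the required order (all other names and prices are invented stand-ins):

```python
class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    def __init__(self):
        self.parts = []

    def add(self, part):
        self.parts.append(part)

    @property
    def price(self):
        return sum(p.price for p in self.parts)


class BudgetComputerBuilder:
    """One concrete builder; other builders would supply different parts."""

    def build_motherboard(self):
        return Part(60)

    def build_cpu(self):
        return Part(80)

    def build_memory(self):
        return [Part(25)]


class Director:
    """Takes any builder polymorph and drives the assembly order."""

    def __init__(self, builder):
        self._builder = builder

    def construct(self):
        computer = Computer()
        computer.add(self._builder.build_motherboard())
        computer.add(self._builder.build_cpu())
        for module in self._builder.build_memory():
            computer.add(module)
        return computer
```

Because the Director only sees the builder interface, swapping in a different builder changes which computer comes out without touching the assembly logic.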

I didn’t show this until just now because I wanted to go through the series of mental leaps and abstractions that would lead us to this point. It may not be necessary at first to launch into an all-out implementation. There are steps along the way, and each introduces a new degree of abstraction that’s great and helpful as long as it’s not overkill.

This full-blown builder pattern has all of the previous advantages, and the addition of the Director adds both complexity and customizability. You can now inject the director and/or the builders into your presentation class for maximum flexibility.

I’d also like to note that you can use this pattern to do some cool stuff with fluent interfaces. Imagine if the builders, instead of returning their components, all returned new Computer objects (assuming Computer had some kind of copy constructor or clone method). You would have an interface like:
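Something like this sketch, where each step hands back a fresh Computer built from a copy of the previous one (the with_* method names are invented for illustration):

```python
class Part:
    def __init__(self, price):
        self.price = price


class Computer:
    """Every with_* call returns a new Computer; the original is untouched."""

    def __init__(self, parts=()):
        self.parts = tuple(parts)

    def _copy_with(self, part):
        # Copy-constructor style: build a new instance from the old parts.
        return Computer(self.parts + (part,))

    def with_motherboard(self, part):
        return self._copy_with(part)

    def with_cpu(self, part):
        return self._copy_with(part)

    def with_memory(self, part):
        return self._copy_with(part)

    @property
    def price(self):
        return sum(p.price for p in self.parts)


# The resulting fluent usage:
computer = (Computer()
            .with_motherboard(Part(60))
            .with_cpu(Part(80))
            .with_memory(Part(25)))
```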

A More Official Explanation

So pulling back for a moment, let’s consider what we’ve really done here. What got us into trouble was having the methodology of the construction of our composite object, computer, mixed together with the definition of the composite components themselves. To speak plainly, we mixed the how with the what.

According to Wikipedia, the main purpose of the Builder pattern is “to abstract steps of construction of objects so that different implementations of these steps can construct different representations of objects.” Again, this describes separating the how (“construction of objects”) from the what (“different representations of the objects”).

Other Quick Examples

Other examples of using the builder pattern might include:

* You have some object that provides all of the services for your application and you want to supply different services for different runtime configurations, and you want to separate selection of the services from building and instantiating them.
* You’re modeling a pizza delivery service and you want to offer customizable pizzas but also some pre-defined ones that change perhaps by season or specials of the week or something.
* You’re assembling a multi-level composite.

A Good Fit – When to Use

More generally, a good heuristic for when to use the builder pattern is when construction of an object is complicated. In my experience, builder is generally appropriate if and only if you’re building a composite object, though YMMV. So, you might want to consider a builder pattern usage if your composite has temporal coupling as to how it needs to be built. You might also want to do it if your composite can have different configurations.

But, the most important and subtle point here is that you want to do this when there’s some notion of wanting to define discrete assembly characteristics – some assemblies work together but not others according to your business rules. In this sense, you’re separating business rules for valid configuration from the actual nuts and bolts construction of the object.

Square Peg, Round Hole – When Not to Use

Conversely to the idea of discrete assemblies, it’s not a good idea to use this pattern if you have effectively infinite possibilities. For instance, let’s say you defined a “DateRange” struct that consisted of two DateTime objects. You don’t want a builder for this. All other things being equal, the only real restriction on creation of this object is that the start date should not come after the end date. You don’t need a builder to navigate the particulars of construction here; a simple check in the struct’s constructor will do. You’d get more into builder territory as the rules for date creation tightened (say, if you had a specific application where you could only choose from one of four start dates and one of four end dates).
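For concreteness, here is a sketch of that constructor check (in Python, with the struct rendered as a small class; the exact error behavior is an assumption):

```python
from datetime import date


class DateRange:
    """Creation is simple enough that a constructor check suffices;
    there is no assembly ordering to manage, so no builder is needed."""

    def __init__(self, start, end):
        if start > end:
            raise ValueError("start date must not come after end date")
        self.start = start
        self.end = end
```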

Another bad reason to use this is if you don’t actually need it. We sort of pushed it with the example above — the abstraction of the BuildComputer() method might have sufficed for the requirements as stated up to that point. The pattern is powerful, but it introduces complexity, and it’s important to make sure the complexity isn’t of the needless variety.

And finally, I’d like to emphasize that there is a specific construction scenario for which there is a far better design pattern: building polymorphs. Builder is for composites. Don’t use it for a situation where you have a “Food” base class and you read in strings like “IceCream” and “Steak” and create inheritors IceCream and Steak depending on what you read in. That calls for one variation or another of the factory pattern, a pattern specifically for figuring out how to create inheritors of a common base or implementers of a common interface.

So What? Why is this Better?

Object construction is one of those things that tends to creep up on you in terms of complexity. In the book “The Pragmatic Programmer: From Journeyman to Master”, Dave Thomas and Andy Hunt describe the boiling frog phenomenon. The idea is (and the authors swear never to have tried it) that if you put a frog in a boiling pot of water, it will, predictably, leap out. But, if you put it in cold water and increase the temperature gradually over time, the frog will remain there even when the temperature is eventually raised to boiling and cooks the frog.

That is how object creation tends to work in code bases. There is nothing simpler than a “var myClass = new MyClass()” declaration, so why spend a lot of time thinking about where to put it? Eventually, myClass’s constructor gets more complicated, and it picks up some initialization logic and other things, and then you need to create lists and dictionaries of it, and at some point it gets inheritors that you also need to create. Before you know it, your ad-hoc creation of these things is a nightmare.

As shown above, the Builder Pattern alleviates this problem. It separates the logic for providing components for a complicated object from the logic that governs the order of assembly of those components, and it separates both of those things from the logic that actually uses the components. When used in this fashion, changes to any one of those concerns do not affect the other two. And, decoupling and dependency management is the key to avoiding maintenance and feature-addition headaches.


In Search of the Perfect Code Review

Code Reviews

One thing that I tend to contemplate from time to time and have yet to post about is the idea of code reviews and what constitutes a good one. I’ve worked on projects where there was no code review process, a half-hearted review process, an annoying or counter-productive code review process, and a well organized and efficient code review process. It’s really run the gamut. So, I’d like to put pen to paper (finger to keyboard) and flesh out my ideas for what an optimal code review would entail. Before I can do that, however, I think I need to define some terms up front, and then identify some things about code reviews that I view as pitfalls, counter-productive activities or non-useful activities.

What is a Code Review?

According to the Wikipedia entry for code review, there are three basic types: the formal review, pair programming, and the lightweight review. I’ll add a fourth to this for the sake of pure definition and call it the “automated review.” The automated review would be the use of one or more static analysis tools like FxCop or StyleCop (the wiki article includes someone using tools like this on the code of another developer, but I’m talking strictly about automated steps like “you fail the build if you generate StyleCop warnings”). Pair programming is self-explanatory for anyone familiar with the concept. Lightweight reviews are more relaxed and informal in the sense that they tend to be asynchronous and probably more of a courtesy. An example of this is where you email someone a source code file and ask for a sanity check.

The formal review is the one with which people are probably most familiar if code review is officially part of the SDLC on the project. This is a review where the developer sits in a room with one or more other people and presents written code. The reviewers go through in detail, looking for potential bugs, mistakes, failures to comply with convention, opportunities for improvement, design issues, etc. In a nutshell, copy-editing of code while the author is present.

What I’ll be discussing here mainly pertains to the formal review.

What I Don’t Like

Here are some things that I think ought to be avoided in a code review process.

1. Failing to have code reviews

This is important in the same way that QC is important. We’re all human and we could all use fresh eyes and sanity checks.

2. Procedural Code Review: The Nitpick

“You made that camel case instead of Pascal case.” “You could be dereferencing null there.” In general, anything that a static analysis tool could tell someone a lot faster and more accurately than a room full of people inspecting code manually. Suresh addresses this idea in a blog post:

I happen to see only “superficial” reviews happening. By superficial, I mean the types where you get review comments like, “You know the documentation for that method doesn’t have the version number”, or “this variable is unused”, etc.

“Superficial” is an excellent description, as these things are generally trivial to identify and correct. It is not an effective use of the time of the developer or reviewers, any more than turning off spell check and doing it manually is an effective use of authors’ and editors’ time. There are tools for this; use them!

3. Paranoid Code Review

This is what happens when reviewer(s) go into the review with the notion that the only thing standing between a functioning application and disaster is their keen eye for the mistakes of others. This leads to a drawn out activity in which reviewers try to identify any possible, conceivable mistake that might exist in the code. It’s often easily identified by reviewers scrunching their noses, concentrating heavily, pointing, or tracing code through control flow statements with their fingers on the big screen while the developer twiddles his thumbs.

Again, there’s a tool for this. It’s called the unit test. It’s not flawless and it assumes a decent amount of coverage and competence from the unit tests, but if executed properly, the unit tests will express and prove corner case behavior far better than people on too little sleep staring at a screen and trying to step through the call stack mentally. This mental execution is probably the least reliable possible way of examining code.

4. Pure Gatekeeper Code Review

This is less of an individual property of code review, but more a property of the review process. It’s where you have a person or committee in the department that acts as the Caesar of code, giving thumbs up or thumbs down to anything that anyone submits. Don’t get me wrong — the aim of this makes sense. You want somebody that’s doing a sanity check on the code and someone who more or less has his or her finger on the pulse of everything that’s happening.

The issue that occurs here is a subtle one that results from having the same person or people reviewing all code. Specifically, those people become the gatekeepers and the submitters become less concerned about writing good code and innovating and more principally concerned with figuring out how to please the reviewer(s). Now, if the reviewers are sharp and savvy, this may not be a big deal, though discouraging new ideas and personal growth is always going to have some negative impact. However, if the reviewers are not as sophisticated, this is downright problematic.

This is relatively easily addressed by rotating who performs code reviews or else distinguishing between “suggestion” and “order”. I’ve participated in review processes where both of these mitigating actions were applied, and it’s a help.

5. Tyrant Gatekeeper

In the previous example, I mentioned a hypothetical gatekeeper reviewer or committee with ultimate authority, and this is a subset of that. In this case, the reviewer(s) have ultimate yes/no authority and are very heavy (and possibly even combative or derisive) with the “no” option. Where the previous example might stifle innovation or developer growth, this creates bottlenecks. Not only is it hard to get things approved, but developers will naturally start asking the reviewer(s) what to do at every turn rather than attempting to think for themselves.

In essence, this creates a state of learned helplessness. Developers are more concerned with avoiding negative feedback at the code reviews than with learning, doing a good job, or becoming able to make good decisions based on their own experience. As a result, the developers don’t really make any decisions and ask the reviewer(s) what to do at every step, waiting until the reviewer(s) have time, if necessary. The reviewer(s) become a bottleneck in the process.

I have not personally witnessed or been subject to this form of code reviews, but I have heard of such a thing and it isn’t difficult to imagine this happening.

6. Discussion Drift

This occurs during a code review when a discussion of the specific implementation gets sidetracked by a more general discussion of the way things ought to be. Perhaps the code is instantiating an object in the constructor, and a reviewer recommends using dependency injection instead. From here, the participants in the review begin to discuss how nice it would be if the architecture relied on an IoC framework instead of whatever it is at the moment.

That’s both a valid and an interesting discussion, but it has nothing to do with some developer checking in implementation code within the framework of the existing architecture. Discussions can have merit and still not be germane to the task at hand.

7. Religious Wars

This occurs during a code review in the following context:

Reviewer 1: You didn’t put curly brackets around that one liner following your if statement. Please add them.
Reviewer 2: No, don’t do that. I hate that.
Reviewer 1: No, it’s better. That way if someone changes the code and adds another statement…. etc.
Reviewer 2: Studies have shown….
Review-ee: Uh, guys….

And so on and so forth. Code reviews can easily devolve into this sort of thing over purely or nearly-purely subjective matters. People get very entrenched about their own subjective preferences and feel the need to defend them. We see this in everything from political affiliation to rooting for sports teams. Subjective matters of preference in code are no different. Neither side is likely to convince the other during the scope of the review, but what is quite likely, and probably certain, is that time will be wasted.

If matters like that are part of the coding standards policy on the project, then it’s an open-and-shut case. If they’re not, they’re better left alone.

So, How Does a Good Code Review Go?

Having explained what I don’t like in a code review, I’ve provided some context for what I do find helpful. I’m going to outline a procedure that is simply an idea. This idea is subject to suggestions for improvement by others, and ongoing refinement from me. Also, this idea is geared toward the gatekeeper scenario. I may make another post on what I perceive to be the most effective method for voluntary(ish), peer-conducted code reviews where the reviewer(s) are not approval gate-keepers.

  1. Developer finishes the pre-defined unit of code and is ready to have it reviewed for promote/commit.
  2. Developer runs static analysis tools (e.g., StyleCop, FxCop, Code Contracts, NDepend) with configured rule-sets, correcting any errors they uncover.
  3. Once no static-check violations are present, developer notifies reviewer(s) of desire for code review.
  4. Reviewers run static analysis tools asynchronously and reject the request if any rules are violated.
  5. Reviewers examine the code for obvious mistakes not caught by static analysis and/or developer unit tests, and write unit tests that expose the deficiency. (Alternatively, they can run something like Pex)
  6. Developer makes reviewer unit tests pass or else convinces reviewer(s) why they don’t need to.
  7. With all unit tests passing and all parties familiar with the code, a review meeting is set up (the meeting can be skipped for smaller/less crucial code deliveries).
  8. Meeting proceeds as follows:
    1. A mediator who is neither developer nor reviewer is asked to attend to keep the meeting focused and on track.
    2. Reviewers point out something praiseworthy about the submitted code (cheesy, perhaps, but important for starting off in the spirit of cooperation)
    3. Reviewers examine code for redundancy (is anything copy/pasted, defined in many places, etc)
    4. Reviewers examine the code for usable API, perhaps by implementing classes in a sandbox, to highlight shortcomings, unintuitive interactions, weird couplings, etc
    5. Reviewers check for architectural consistency — does the class implement a base class that it should, duplicate the function of some other class in the suite, seem to be in the wrong location, etc.
    6. Reviewers perform a dependency analysis — what new dependencies does this introduce? Cyclical, global, temporal, etc. Are these dependencies acceptable?
    7. Reviewers analyze for code smells and anti-patterns.
    8. Reviewers compile a list of suggested changes for the developer.
    9. Meeting adjourned.
  9. Developer makes suggested changes that he agrees with.
  10. Developer appeals changes with which he doesn’t agree. This could involve the reviewer(s) providing “proof”, for example if they say that the developer shouldn’t do something because of X negative consequence, they should demonstrate that consequence somehow. This appeal could be resolved via proof/demonstration or it could go to a third party where the developers and reviewers each state their cases.
  11. The agreed-upon changes, along with the results of any appeals, are made, and the code is then promoted/committed.
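Steps 2 through 4 above lend themselves to automation, for instance as a pre-review gate. A minimal sketch of the idea follows; note that the tool commands here are placeholders and not real StyleCop/FxCop invocations, and the sketch is in Python purely for brevity.

```python
import subprocess

# Hypothetical gate for steps 2-4: no review request goes out until every
# configured static-analysis tool passes. The commands below are placeholders,
# NOT real StyleCop/FxCop command lines.
STATIC_CHECKS = [
    ["run-stylecop", "--ruleset", "team.ruleset"],   # placeholder command
    ["run-fxcop", "--project", "Analysis.fxcop"],    # placeholder command
]

def ready_for_review(checks=STATIC_CHECKS, run=subprocess.run):
    """Return True only if every configured static-analysis tool exits cleanly."""
    return all(run(cmd).returncode == 0 for cmd in checks)
```

The point of a gate like this is that the reviewers in step 4 can reject mechanically, without spending meeting time on issues a tool already flags.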

This process naturally addresses (1) and (2) of the things that I don’t like in that you’re having a code review and getting the procedural, easy stuff out of the way offline, prior to meeting. (3) is made more difficult by the fact that the reviewer(s) are given the opportunity to write unit tests that expose the badness about which they might be paranoid. (4) and (5) are addressed by the appeal process and the general concept that changes are suggestions rather than decrees. (6) and (7) are addressed by the mediator who has no skin in the game and will probably have a natural tendency to want to keep things short and sweet.

One drawback I can see to what I’m proposing here is that you could potentially undercut the authority of the reviewer if the person doing the reviews is, say, the most senior or high ranking person. Perhaps people would want that role a bit less if they could be officially second guessed by anyone. However, I think that creates a nice environment where good ideas are valued above all else. If I were in that role myself, I’d welcome the challenge of having to demonstrate/prove ideas that I think are good ones before telling others to adopt them. In the long run, being steered toward that kind of rigor makes you better at your craft. I also think that it might be something of a non-issue, given that people who wind up in these types of leadership and senior roles are often there based on the merit of their work and on successfully “proving” things over and over throughout the course of a career.


A Better Metric than Code Coverage

My Chase of Code Coverage

Perhaps because fall is upon us and this is the first year in a while that I haven’t been enrolled in a Master’s of CS program (I graduated in May), I’m feeling a little academic. As I mentioned in my last post, I’ve been plowing ahead, following TDD to the letter, and if nothing else, I’m pleased that my code coverage sits effortlessly at 100%. I try to keep my code coverage around 100% whether or not I do TDD, so the main difference I’ve noticed is that TDD, versus retrofitted tests, seems to hit my use cases a lot harder, instead of just going through the code at least once.

Now, it’s important to me to get close to or hit that 100% mark, because I know that I’m at least touching everything going into production, meaning that I don’t have anything that would blow up if the instruction pointer ever got to it, saved only by another bug preventing it from executing. But there is a difference between covering code and exercising it.

More than 100% Code Coverage?

As I was contemplating this last night, I realized that some lines of my TDD code, especially control flow statements, were really getting pounded. There are lines in there that are covered by dozens of tests. So, the first flicker of an idea popped into my head — what if there were two factors at play when contemplating coverage: LOC Covered/Total LOC (i.e. our current code coverage metric) and Covering tests/LOC (I’ll call this coverage density).

High coverage is a breadth-oriented thing, while high density is depth — casting a wide net versus a narrow one deeply. And so, the ultimate solution would be to cast a wide net, deeply (assuming unlimited development time and lack of design constraints).
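To make the two metrics concrete, here is a toy sketch of computing them side by side. The input format (a map of line number to the count of tests hitting that line) is invented for the example, and the sketch is in Python rather than .NET purely for brevity.

```python
# Toy computation of classic coverage and the proposed "coverage density",
# given per-line hit counts from some hypothetical coverage tool.
def coverage_and_density(hits_per_line):
    total = len(hits_per_line)
    covered = sum(1 for hits in hits_per_line.values() if hits > 0)
    coverage = covered / total                     # LOC covered / total LOC
    density = sum(hits_per_line.values()) / total  # covering tests / LOC
    return coverage, density

# Four lines, one never executed: 75% coverage, density of 2.25 hits per line.
coverage_and_density({1: 3, 2: 5, 3: 1, 4: 0})
```

Breadth and depth then show up as two separate numbers: the first can be 100% while the second stays shallow, and vice versa.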

Are We Just Shifting the Goalposts?

So, coverage density sounded like sort of a heady concept, and I thought I might be onto something until I realized that it suffers the same potential for false-positive feedback as code coverage. Specifically, I could achieve extremely high density by making 50 copies of all of my unit tests. All of my LOC would get hit a lot more, but my test suite would be no better (in fact, it’d be worse, since it’s now clearly less efficient). So code coverage is weaker as a metric when you cheat by having weak asserts, and density is weaker when you cheat by hitting the same code with identical (or near-identical) asserts.

Is there a way to use these two metrics in combination without the potential for cheating? It’s an interesting question, and it’s easy enough to see that “higher is better” for both is generally, though not always, true, and can be perverted by developers working under some kind of management edict demanding X coverage or, now, Y density.

Stepping Back a Bit

Well, it seems that coverage density is really no better than code coverage, and it’s arguably more opaque, or at least it has the potential to be, so maybe that’s not the route to go. After all, what we’re really after here is how many times a line of code is hit in a different scenario. For instance, hitting the line “double result = x / y;” is only interesting when y is zero. If I hit it 45,000 times and achieve high density, I might as well have hit it just once unless I try y at zero.

Now we have something interesting. This isn’t a control flow statement, so code coverage doesn’t tell the whole story; you can cover that line easily without ever generating the problematic condition. Density is a slightly (but not much) better metric. We’re really driving at program correctness here, but since that’s a difficult problem, what we’ll generally settle for is notable or interesting scenarios.
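Sketched as tests (in Python for brevity, though the point is language-agnostic), the first test below covers the division line but learns nothing; only the second exercises the scenario that actually matters:

```python
# Stand-in for the "double result = x / y" example from above.
def divide(x, y):
    return x / y

def test_happy_path():
    assert divide(10, 2) == 5  # line covered, nothing interesting learned

def test_divide_by_zero():
    try:
        divide(1, 0)
        assert False, "expected an exception for y == 0"
    except ZeroDivisionError:
        pass  # the interesting case is actually exercised
```

Both tests raise coverage and density identically; only one of them tells you anything about correctness.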

A Look at Pex

Microsoft Research made a utility called Pex (which I’ve blogged about here). Pex is an automated test generation utility that “finds interesting input-output values of your methods”. What this means, in practice, is that Pex pokes through your code looking for edge cases and anything that might be considered ‘interesting’. Often, this means conditions that cause control flow branching, but it also means things like finding our divide-by-zero exception on y from earlier.

What Pex does when it finds these interesting paths is auto-generate unit tests that you can add to your suite. Since it finds hard-to-find edge cases and specializes in branching through your code, it boasts a high degree of coverage. But what I’d really be interested in seeing is stats on how many interesting paths your test suite covers versus how many there are or may be (we’d likely need a good approximation, as this problem quickly becomes computationally infeasible to solve exactly).

I’m thinking that this has the makings of an excellent metric. Forget code coverage or my erstwhile “Density” metric. At this point, you’re no longer hoping that your metric reflects something good — you’re relatively confident that it must. While this isn’t as good as some kind of formal method that proves your code, you can at least be confident that critical things are being exercised by your test suite – manual, automated or both. And, while you can achieve this to some degree by regularly using Pex, I don’t know that you can quantify it other than to say, “well, I ran Pex a whole bunch of times and it stopped finding new issues, so I think we’re good.” I’d like a real, numerical metric.
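This is emphatically not how Pex works internally (Pex uses symbolic execution), but as a toy sketch of the underlying idea of hunting for “interesting” inputs, you could probe a method with candidate inputs and keep one representative per distinct outcome. Everything here is invented for illustration:

```python
# Toy "interesting input" hunt: group candidate inputs by the distinct outcome
# they produce (return value or exception type), keeping one representative each.
def interesting_inputs(func, candidates):
    outcomes = {}
    for args in candidates:
        try:
            key = ("returns", func(*args))
        except Exception as exc:
            key = ("raises", type(exc).__name__)
        outcomes.setdefault(key, args)
    return outcomes

# The first two inputs collapse into one outcome; (1, 0) surfaces the edge case.
interesting_inputs(lambda x, y: x / y, [(10, 2), (20, 4), (1, 0)])
```

A metric built on this idea would compare the distinct outcomes your suite already triggers against the distinct outcomes discoverable at all.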

Anyway, perhaps that’s in the offing at some point. It’d certainly be nice to see, and I think it would be an advancement in the field of static analysis.


Adventures in Pure Test-Driven Development

In a previous post some time back, I had expressed some skepticism about TDD as a design practice. I talked about test-driven development and its relationship with prototyping and the “make one to throw away” concept.

Since I’m not one ever to believe that I’ve arrived at the optimal solution, I’m doing another round of pure TDD to see if I can ‘refactor’ my process a little and find improvements in efficiency and/or code quality. As I’m doing this, I’m finding that I’m a little more proficient in the TDD concept than I was in previous attempts, but I’m still not convinced that TDD is a golden hammer appropriate for all solutions, all the time.

I’m going to make a list of pros and cons here as I work, both for myself and for posterity. Once I’ve compiled the list, I’ll wait a little bit, reflect, and write a conclusion on it (so yes, the part at the bottom will be written a couple of days removed from what you’re reading right now).

Pros:
  • Using mocking frameworks (Moles and Moq), maintaining 100% code coverage is effortless
  • Tests tend to be more meaningful for your use cases than retrofitted ones.
  • Better guarding against regressions
  • TDD is better even than I remember at forcing me to think through various scenarios of method control flow and object state. I’m having far fewer cases of, “oh, I didn’t consider scenario X!”
  • Cuts down on potential for dead code. I’m not doing things like declaring a setter because I’m already declaring a getter and I might as well. The rules of the process don’t let you code that until you have a test for it.
  • Tends to help generate useful and well-conceived abstract utility classes (such as a no-exception wrapper for string splitting that I wrote) that can be reused elsewhere.
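
The string-splitting wrapper I mention was C#; as a rough illustration of the shape (names and exact behavior invented here), a Python analog might look like:

```python
# Rough analog of a "no-exception wrapper for string splitting": bad input
# yields an empty list rather than a thrown exception.
def safe_split(text, delimiter=","):
    """Split text on delimiter, returning [] instead of throwing on bad input."""
    if not isinstance(text, str) or not isinstance(delimiter, str) or not delimiter:
        return []
    return [piece for piece in text.split(delimiter) if piece]
```

The TDD process tends to produce utilities like this because each edge case (null input, empty delimiter, adjacent delimiters) arrives as its own failing test first.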

Cons:
  • Larger design concepts do not seem to be emergent the way they might be in a more shoot-from-the-hip approach. Figuring out when you need a broader pattern seems to require stepping out of the TDD frame of mind.
  • There is no specific element of the process that naturally pushes me away from the happy path. You wind up creating tests for exceptional/edge cases the way you would without TDD (I guess that’s not strictly a con, just a lack of pro).
  • It becomes easier to keep plodding away at an ultimately unworkable design. It’s already hard to throw away work you’ve put in, and now you’re generating twice as much code.
  • The work of correcting tests when requirements change is magnified, since your first code often turns out not to be correct. I find myself refactoring tests as often as code in the early going.
  • It has the potential to slow development when you’re new to the process. I suppose one might look at this as a time investment to be paid off when you get more fluid and used to doing it.


I like aspects of this process, and I like it enough that I’m going to keep making stabs at it to find exactly how it works best for me. In my earlier post, I mentioned that I’d do prototyping without tests and then do TDD in earnest when I started writing my “for-real” code. I think I’m going to try to phase it increasingly into my prototyping as well.

My biggest issue is my perception that TDD crowds out more emergent, macroscopic design principles. There is no straightforward refactoring I’m aware of that leads to an MVVM style of development or to using something like a flyweight. So it’s going to be up to me and the refinement of my own process to reconcile Uncle Bob’s three laws of TDD with sophisticated design concepts.

Of course, I’d be interested to hear, in comments, anyone’s take on how to reconcile these two concepts. Perhaps I just haven’t been at it long enough to see a test that I can write that demands a refactoring to a more complex pattern. Or, perhaps I’ve become so used to recognizing good fits for patterns ahead of time that it’s no longer part of a refactoring for me, but the original ‘factoring’. In that case, I suppose I’d have to retool my methodology for arriving at design patterns.

Regardless, though, it should be an interesting endeavor. I like to shake up my way of doing things as often as possible to look for opportunities for improvement.