DaedTech

Stories about Software

By

Developer Process Gerrymandering

Gaming the System

As projects get a little behind schedule or perhaps a little contentious in terms of scope, I’d say it’s common for people to start taking steps to indemnify themselves against blame. I’d also say that this is fairly natural. There’s nothing more likely to attract wagging management fingers than a project behind schedule and/or over budget, so it makes sense to do a little rehearsing as to what one is going to tell them. Furthermore, it’s somewhat natural to try to showcase that one’s own efforts were a general positive even against a negative backdrop.

This is reminiscent of the “plus/minus” tracking in sports such as basketball, where it’s possible for a team to outscore its opponent when player X is on the floor, but still lose. The most common reason for this is that player X is a star, but the rest of the team is so bad that even his or her stellar play cannot overcome the shortcomings of the rest of the team. Long story short is that every developer wants to be that player X on a winning team if possible. But, absent that, being player X on a losing team is fine, so long as management and peers notice.

In such a situation (potentially “losing effort”), I’ve recently observed the following developer behaviors:

  1. Obvious (to me, if not management) estimate sandbagging.
  2. Wheedling, begging, berating, manipulating and generally badgering defect reporters into retracting reported defects.
  3. Refusal to fix obvious and embarrassing problems with the software because said solutions weren’t “in the requirements statement”.

These behaviors aren’t particularly difficult to understand in the context of seeking to create the impression of a better “plus/minus” score for a developer. Consider the following explanations:

  1. Estimate sandbagging allows a developer to finish way ahead of schedule, creating a heroic sort of aura for a time
  2. Improving code quality is one way to reduce the number of defects reported against one’s code. Browbeating people into not reporting the defects is another way to do this and more expedient in the short term, in a sense.
  3. Refusing to address something on the grounds that “I never had a requirement for this” is a nice two-fer; it creates the impression that not addressing the shortcoming was a conscious decision rather than an oversight and it also identifies a different blame target (whoever is responsible for requirements).

This frank categorization may induce a chuckle or two, but I doubt I’m breaking any drastically new ground for people who work in programming shops where this kind of thing probably occurs with frequency somewhere between “here and there” and “constant”. I do think there is value in coining some terms around it though so that we can examine the issues it causes, its own root cause, and some potential solutions.

Elbridge Gerry, An Inspiration

In the early 1800’s Massachusetts had a governor by the name of Elbridge Gerry. A Democratic-Republican by party affiliation, Gerry saw trouble in the tea leaves for his own party in the state senate and decided to do something about it. You might think that he decided to take his case to the people, touting the benefits of his party and convincing them to vote for Democratic-Republicans. You’d be wrong though.

Convincing the public to like you is hard and time consuming and there’s no guarantee of success. Gerry had a more expedient idea. He simply signed a law that radically altered the voting districts in his state in such a way that a majority would result in the state senate without actually having the voters to support it. One of the newly created districts was so contorted and weirdly shaped that political opponents said it looked like salamander. Gerry’s Salamander became the term “Gerrymander” and the phrase stuck around. Today it retains its definition as a sleazy but legal practice in which politicians stack the deck in favor of their own party as they’re dealing out political cards to the voters.

Getting back to the topic of software development, the behaviors I mentioned in the first section and their attendant explanations are all examples of a behavior similar in intentions to gerrymandering. In our world, the ideal for a software group is obviously to write good code, provide good estimates, and deliver quality software on time for a good value. As an individual the ideal is to have a good “plus/minus” within the group toward that same end — to write better than average code, delivered more quickly than average for less money than average. That’s all noble, but it’s also hard. And so some developers prefer to emulate Gerry and change the rules of the game (distort time, change definitions, employ technicalities) rather than playing it well. In the “plus/minus” world, this is like lobbying to play against the third string of the other team to pad one’s statistics.

Eliminate the “Plus/Minus”, Eliminate the Gerrymandering

So, if you’re managing a development team, how do you eliminate the counter-productive gerrymandering behavior? Eliminate the individual “plus/minus” ratio mentality. Imagine that you have a software project and you tell everyone working on it that their performance and pay/bonus were going to be dictated by the ranking on a scale of 1-10 that users gave the software in a survey (encompassing usability, quantity/quality of features, etc). Let’s think about what happens to the three behaviors at the start of the post:

  1. Estimate sandbagging is pointless because coming in ahead of a schedule that you’ve defined has no bearing on your bonus and performance review.
  2. Badgering defect reporters makes no sense because the goal is fewer defects in the user’s hands rather than fewer defects reported against the individual.
  3. Refusal to fix problems (finger pointing) is also silly because whining that someone else is at fault is cold comfort in the face of a bad review and no bonus

Now, I don’t necessarily believe that there will be entirely smooth sailing in this or any other construct. But, eliminating the focus on individual incentives and performance has some nice side effects including the one on which we’re focused: removing the tendency to try to game the system without adding value. (Another nice side effect is that you tend to downplay the unfortunate “rockstar” designation that panders to megalomaniacal personality types who tend drastically to overestimate their own irreplaceability and scare others into the same).

But How to Measure Individual Performance

Still, the question of individual performance evaluation remains. It’s all well and good to reward or punish the whole team as a unit, but there are HR org charts, career maps and other individual concerns to think of. So how do you address this while still avoiding the gerrymandering?

It’s all about the code and source control. There are a lot of metrics out there for evaluating developer productivity from the obtuse (counting lines of code generated or altered) to the anecdotal (does this guy seem to do a good job), but I’m not really aware of one that captures the truth as well as looking at someone’s changesets/commits in the context of a code base. So, what if you had some impartial and extremely knowledgeable party read through a code base, looking at its architecture and tendencies, and then read through the change sets to give each developer some kind of rating or at least partial rating.

It seems almost as if you could create a cottage industry out of this and offer it as a consulting service. This way, the consultant is truly removed from any office politics and can focus entirely on the code. If one did this as a craft, I’d imagine that the addition of data points to the general corpus would cause the process to become quite refined over the course of time and probably generate some relatively objective, interesting and meaningful metrics.

Of course, this wouldn’t be perfect. It’s only as good as the evaluator is sharp, and it has the shortcoming of not capturing “intangibles” on a project — maybe someone wrote little code but spent a lot of time helping other people with source control and other technical issues, for instance. Code and changesets are mute witnesses to the story of the project, but they don’t witness everything.

Still, I think the notion is an interesting enough one that it bears exploring. Developer (or any team member) posturing and gerrymandering are counter-productive activities that put the developer’s interests above the projects but, more problematically, misalign those two sets of interests. The combination of team effort and metric individual evaluation could help prevent that sort of thing.

By

Abstractions Are Important 5 – Type Consistency

For the fifth post in this series, I’m going to start with a mini rant.

A Digression (Rant) About Enum

Over the last couple of years, enumerations/enums have been dying a slow death in the world of code that I write. I wasn’t every really an avid user of them, but they’ve definitely been declining even for me to the point of virtual-non existence. I’m not sure exactly what it is about them or about me that’s spurring this, but I’m not sorry about it at all. I don’t miss them.

I think perhaps the motivation has been an increasing desire to use polymorphism at all levels, even when the object I’m making doesn’t seem “classworthy”. That is, why would I create a “CardSuit” enum when I can just create a first class type for Suit? I think perhaps another factor in this approach of mine is that enums tend to go hand-in-hand with switches and I consider this pairing to be an anti-pattern. Even with an enum like “Suit” that is really just representing mutual exclusion, this is inevitable somewhere:

switch(mySuit)
{
    case Suit.Spades:
        return "Spades";
    case Suit.Diamonds: 
        return "Diamonds";
//etc
}

And, why do that when I could just have something like this:

public abstract class Suit
{
    public abstract string LongSuitName { get; set; }
//etc, as Suit needs additional behaviors that would otherwise go in switches
}

This way, I can later plop suit specific designations into inheritors to my heart’s content without hunting for switch statements littered throughout the code. (Also, for language agnostic readers, C# enums are a different beast than their Java counterparts — in C#, they’re really just glorified collections of constants).

Apparently, I’m not alone in this sentiment. Just to stir things up, I googled “C# enums are evil” and came up with this interesting link from stackoverflow guru Jon Skeet:

Since working on a Java project last year, I’ve been increasingly fed up with C#’s enums. They’re really not very object oriented: they’re not type-safe (you can cast from one enum to another via a cast to their common underlying type), they don’t allow any behaviour to be specified, etc. They’re just named constant integral values. Until I played with Java 1.5’s enum support, that wouldn’t have struck me as being a problem, but (at least in some cases) enums can give you so much more.

It’s good to see that the man who defines “10” when a recruiter or someone asks you to rate yourself 1-10 on strength in a language feels the same way as me. And, I think he really nails it with the non object-oriented comment. Enums make me feel like I’m writing kernel code in C or something when I use them.

If You’re Going to Use Them…

All that said, it’s not as if they’re going to be excised from the language tomorrow — we might as well make sure they’re used in a way that makes sense if and when they are used. In a code base that I’m in from time to time (and I’m obfuscating the problem domain a bit, but leaving the intent and meaning intact), the concept of “side” exists in the sense that anything in the domain must be left or right. Think of it as though we’re shopping for a pair of shoes. Here is how “side” is represented:

public enum Side
{
    None,
    Left,
    Right,
    Both
}

At first blush, this makes sense. We might have no shoes on, the left shoe, the right shoe, or both shoes. But, ask yourself this: “what is ‘side’ and what is it made of?” Left and right are somewhat mundane, but what about none and both? Is “none” a side? Is “both”? Do I put my “none” shoe on before the left shoe and right shoe, at which time I put on the “both” shoe? Can you infer the usage of this thing from looking at it? No, you really can’t…

And the reason you can’t infer the usage is that the enum consists of two sides and two expressions of quantity of sides. This enum is a chameleon — depending on where you’re standing and what part of it you’re looking at, it can be two different things. And, I submit that this is bad — if you’re going to use enums at all, use them to represent simple, mutually exclusive concepts (like the aforementioned “Suit” in the problem domain of playing cards).

What’s the Harm?

Let’s take a look at an example (stripped down for brevity) client of this enumeration:

public class Shoe
{

}

public class EnumClient
{
    private readonly Dictionary _shoes = new Dictionary();

    public void PutOnShoes(Side side)
    {
        if (side == Side.None)
            return;
        if (side == Side.Both)
        {
            PutOnShoes(Side.Left);
            PutOnShoes(Side.Right);
        }

        _shoes[side] = new Shoe();
    }
}

What we’ve got here is kind of cute and clever. If I want the client to put a shoe on, I pass in the side. But, if I want to tell it to put both on, then I can just pass in Side.Both, and the method knows how to handle this. Sweet! The only immediate drawback is the fact that we have to handle Side.None. Clearly that should be a no-op that clients should avoid, but it’s just as valid as any of the other things from the compiler’s perspective, so we manually no-op.

But, is this cute bit of recursion actually as sweet as it seems? What if we write another method like this called “TakeShoesOff”? What if we write a class called ShoeSalesman that retrieves pairs of shoes? We almost always want him retrieving both shoes for clients, so we’re probably going to want to perpetuate this pattern into his methods as well, probably by copy and paste to save time. How about a ShoeStoreCashier ringing up pairs of shoes? We can take care of that with our good buddy copy and paste too. There’s no method about shoes that we can’t handle that way!

But, wait a second? Isn’t this starting to be kind of a code smell? If this implementation is so sweet, why does it seem all wet in the face of DRY? If you have actually done this in hundreds or thousands of places, you should be getting a sinking feeling in your stomach right about now. That feeling is the feeling that your code is suddenly and subtly incredibly brittle. This cute little encoding over the enum is actually an algorithm that is everywhere in your code.

Let’s say that I want to get in on some cuteness myself. What I’d like to do is write a method and say, “I don’t care which side you pass me, so long as a side exists — I’ll just perform an operation on the first side that I have available, if any.”

public enum Side
{
    None,
    Left,
    Right,
    Both,
    Either
}

public class NewEnumClient
{
    private readonly Dictionary _shoes = new Dictionary();

    public void RemoveShoes(Side side)
    {
        if (side == Side.None)
            return;
        if (side == Side.Both)
        {
            RemoveShoes(Side.Left);
            RemoveShoes(Side.Right);
        }
        if (side == Side.Either)
            RemoveShoes(_shoes.First().Key);

        _shoes.Remove(side);
    }
}

I’ve now extended the cuteness to be more “flexible”, and I’m pretty pleased with myself, so I go ahead and deliver this code. No unit tests break, no problems emerge that I can immediately see, so life is good. But then, weird problems start to crop up after a while. People file bug reports saying that they add shoes and see nothing on the screen or that they remove shoes but nothing is deleted from the database. It’s kind of a mystery at first, but I’m forced to get to the bottom of this as the trickle of bug reports starts becoming an avalanche.

What’s going on?!? Well, what’s going on is that all of the Cute 1.0 code can’t handle the new enum I’ve defined for Cute 2.0. So, what does it do? Well, sometimes it skips it altogether and no ops. Sometimes it adds a new key to the dictionary — a key for which it never checks. Sometimes it throws some kind of exception where it falls into a “default” state in a switch statement someone has defined. Sometimes it throws up an error message box informing the user, “This should never happen — email Erik!” for the same reason, which is doubly bad since I’m probably going to be featured on the Daily WTF.

Wow, bummer. But, no big deal. I’ll just roll up my sleeves and upgrade Cute 1.0 to Cute 2.0 everywhere. How many can there be, right? Uh oh. Hope I’m not doing anything this weekend. For everything copy and paste programming lacks in being a good idea, it makes up for in its ability to spread through a code base like kudzu. It’s too late to revert Cute 2.0 since I now have clients that depend on it, so I’d better roll up my sleeves and get pastin’ because it’s going to be a long Saturday with some Mountain Dew and 7,400 methods that I need to change.

The Real Problem

It should be fairly obvious that any “pattern” requiring you to add 5 or 10 lines to the beginning of a bunch of methods is actually an anti-pattern, particularly if those lines are similar or identical. And some might stop here and say that getting cute in the first place here was the problem, but I contend that this is just a symptom. To tie this back in with the series, the real problem is one of abstractions.

As I asked earlier, what is “side”? Really, can you tell me? I mean if I have a “Customer” object in my domain, you’d probably say something like “that models a customer of the enterprise for which this application was written” or else maybe you’d just smack me upside the head for asking such an obtuse question. But for “Side” in Cute 1.0? Without using the word “side” in your definition? You might say “well, it represents where we can put a shoe” or “it represents the directions left and right”, and you’d be correct for two of the enums values. You might say “well, it represents a number of places that you can put a shoe” or “its a flag that you need to use to tell your methods how to behave” and you’d be right for the other two values. Hmmm….

I know! It’s a doo-dad you can use to index a hash! That’s probably the most accurate way to describe it because that is unfortunately true of all 4 values. Unfortunate because for two of the values, it makes no absolutely sense to do that — but at least it’s true, so we’re getting somewhere. At the end of the day, though, I think all you can really say of Side is “it’s an enum and it’s up to clients to figure out what to do with it.” And that, my friends, makes it a bad abstraction.

Here are some other enums that would be poor abstractions

public enum Stuff
{
    Grape,
    Twelve,
    Microwave,
    Restart,
}

public enum CardSuit
{
    Spades,
    Hearts,
    Diamonds,
    Clubs,
    Black,
    Red,
    NoCard
}

public enum Dance
{
    Foxtrot,
    Tango,
    Waltz,
    DoesntLikeToDance,
    Miscellaneous
}

If you’re going to use enums, they ought to be values that are mutually exclusive, constitute a complete set, and form a clear abstraction. The first one is obviously nonsense, but the second two are enums where the person writing/extending them is about to get cute. They’re about to take an enum that has a set of mutually exclusive values and reappropriate it to use as a flag to tell their client code how to behave. Think of the code that’s about to be written — things like “if the suit is black, then do it for spades and clubs” and “if the person passed in has favorite dance property set to doesn’t like to dance…”

If you find yourself doing these kinds of cute things ask yourself what you’re really trying to accomplish. In the suit case, wouldn’t it make more sense to parameterize a method so that you could pass in multiple suits for the client code to operate on, instead of hard-coding the iteration? In the dance case, maybe it’s time for dance to be a first class object so that a “FavoriteDance” property can be set to null or a null object.

In C#, enums are already pretty much screaming “hey, there’s something called polymorphism — use that instead of me!” Once you start adding cute, one-off flag encodings to them, you’re really dropping all pretense of enumeration being a suitable abstraction and going for the gusto with fake, procedural polymorphism forced on your clients. Please, for your sake and mine if we later work together, don’t do that. The world doesn’t have a “class reserve” ala oil that may be exhausted someday — make a class. You won’t regret it because you won’t be spending your weekends upgrading The Cute.

By

TDD Prevents Copy and Paste Programming

Copy, Paste, Fail, Repeat

Today, I was reviewing some code, and I saw a “before” that looked like this:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

The first thing that stuck out to me was that I really wanted to stick “var” in there a couple of times because I read the word “classifier” so many time it lost all meaning. The second thing that struck me was the after:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

private void RefreshClassifierEnabledStatus(bool value)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Enabled = value;
    }
}

If you’d like to play a game of “find the compiler warning”, by all means, go ahead, but I warn you — the next sentence is a spoiler. In the new code, “i” is never read anywhere, which will generate a warning about unused variables. Annoying, but not the end of the world, and a fairly quick fix.

But, how did the code get like this? I’ve isolated the change in a way that should make this fairly obvious if you think about it for a minute or two. Basically, someone took the first method, copy and pasted it, and modified the payload of the foreach to suit his or her needs. Of course, the new need didn’t involve reading the local index variable, so oops.

I talked with the developer whose code change it was, doing my due diligence in pointing out that this is a classic example of one of the many problems with copy/paste programming. I tried to think back to a time when I’d been burned by this sort of thing recently, when it dawned on me that this simply wouldn’t happen to me. I say that not because I’m some kind of purist that disables ctrl-V and never pastes any code anywhere (though I do keep this to a pretty strict minimum), but rather because this is simply a non-starter for someone who practices TDD.

TDD Is Full of Win

If I were going to write the method RefreshClassifierStatus, the first test that I would write is that Classifier[0].Enabled was false when I passed in false. That would go red, and then I’d write some obtuse code that set Classifier[0].Enabled to false. Next, I’d write a test that said it was true when I passed in true and, after failing, I’d set that property to the passed in value. Finally, I’d write a test that Classifier[Classifier.Count – 1].Enabled was true when true was passed in and suddenly I’d have a foreach loop executing properly.

As an aside, at this point, I’d set the list to null and write a test for how the method should behave when that happens. This seems not to have been considered in this code, which is another problem with copy/paste programming — it’s a multiplier for existing mistakes.

So where in this, exactly, do I introduce an unused variable? That’s never going to help me get any test to go green. Where in this do I ever copy and paste the method wholesale? That’s not the simplest thing that I can do. So, with these two guiding heuristics, followed rather strictly, it’s simply impossible for me to introduce needless noise into the code. TDD forces an impressive level of efficiency most of the time.

TDD is Full of Fast

Had I written this method, I would have done it in probably 5 minutes or so (and that’s a fairly pessimistic estimate, I think), generating 4 unit tests for my trouble. The person that wrote this, I’m guessing, spent 30 seconds or less doing it. So, I’m behind by 4.5 minutes. But, let’s add to that the time that this developer will spend after seeing the code review firing back up the development environment, navigating to the code, removing the line of code, re-compiling, and re-running to make sure that it’s still okay. I’m now only maybe 2 minutes behind.

Now, let’s also factor in that I just realized that problem about the null check, which I’m going to email over and suggest be fixed in the new and old methods. Now there are two test cases instead of 1, and I’m suddenly ahead in terms of development time.

Okay, so the time here is a bit contrived and it’s not as though mistakes can’t be made with TDD, but I can say that a lot fewer of them are. The point here is that copy/paste programming is supposed to be super fast. It’s what you do when there’s no time for good software development practice because you need to hurry. And yet, it’s really not that fast in the grand scheme of things. It just makes you busier and wrong more efficiently than being wrong by hand.

So, save yourself time, effort, and the headache of getting a reputation for sloppy work. Slow down and do it right. TDD isn’t required for this, but it sure helps.