Stories about Software


Methods Are Little Stories – Abstractions Are Important 6

If Then, If Then, If Then

Yesterday’s post where I included Grady Booch’s comment that clean code “reads like well written prose” made me think of something I’ve been contemplating. The other day I was looking at some code and I saw the following (obfuscated):

I automatically started refactoring this to the following:

and then:

To me, this keeps in line with Grady’s statement and is easy to reason about at every level. “If I need an umbrella, get it” and “I need an umbrella if it’s raining, I need to leave, and I’m parked in the street” are pieces of code so simple that one need not be a programmer to understand the logic. I think it’s hard to argue that this is less conversational than “If it’s raining then if I need to leave then if I am parked in the street then grab an umbrella.”

But does this matter? Am I just being fussy and shuffling around the code to no real benefit? Are there advantages to the “ifception” approach (thanks to Dan Martin for this term)? Why would someone prefer this style? These were the things that I found myself contemplating.

The Case For Ifception?

In order to understand possible advantages or reasons for this preference, I sought to figure out the motivation. My first thought was that someone would write code this way if they missed the week in discrete math/logic where DeMorgan’s laws and the rules of inference in Boolean Algebra were covered. However, I don’t like to assume incompetence or ignorance when the only evidence present is evidence only of a different preference than mine, so let’s dismiss that as a motivation.

The second thing that occurred to me was a lack of awareness or mistrust of the compiler short-circuiting and operations. To put it another way, they believe that all three conditions will be checked even if the first one fails, so the ifception is more efficient. But, again, this requires an assumption of ignorance, so let’s assume that the author understands conditional short-circuiting.

After that a slightly more valid motivation dawned on me (and one that doesn’t assume ignorance/incompetence) – the author loves debugger! That is, perhaps the code author likes it this way because he or she prefers to be able to step through the method and see the short circuiting or success in action.

As I poked around a little more, I found code in the same class of this form as well:

Two data points now seem to point to my conclusion. I believe the motivation here is Debugger Driven Development (DDD) — a term that I’ll use to describe the approach where you write production code specifically designed to be stepped through in the debugger. This is a rather pessimistic approach since it seems to say “when you’re dealing with my code you’re going to be in the debugger… a lot… seriously, I have no idea how or even if my code works.”

I will also allow for the possibility that someone might view these approaches as inherently more readable, but I can only imagine that’s the case as a result of familiarity. In other words, this style is only more readable if it’s what you expect to see — I doubt anyone not versed in programming would gravitate toward nested conditionals and/or return statements as approachable.

If anyone can think of an additional benefit that I’m missing, please let me know. Or, in other words:

Does It Really Matter?

So having tentatively identified a motivation for ifceptions (DDD), is this style preferable? Harmless? To be avoided? I actually wrestled with this for a while before forming my opinion. The style is very different from what I prefer and am used to, but I try very hard not to conflate “I’m not used to that” with “That’s bad”. Doing so is the height of arrogance and will greatly hinder one’s ability to learn.

That said, the conclusion I came to was that this should be avoided if possible. And the reason it should be avoided boils down to method level abstractions. A method should tell a story. The method “GrabUmbrellaIfNecessary” tells a story — it tells you that it’s going to figure out whether an umbrella is needed and grab it if so. As a client of that method, you’re going to take it at face value that it does what it advertises, but if you do decide to drill into the method, you’re expecting to see a concise implementation of what’s advertised.

In the factored example, that’s exactly what you see. What better captures “GrabUmbrellaIfNecessary” than a single if condition for “DoINeedAnUmbrella” followed by a “GrabUmbrella” for a true evaluation? But what about the ifception example? Well, I see that there’s a condition to see whether it’s raining or not and then a scoped block of code with another conditional. Oh, okay, if it’s raining, we’ll get in there and then we’ll see if I need to leave in which case we’ll get… another scoped block of code. Okay, okay, so now, we need to know where I’m parked and, what were we doing again? Oh, right, we’re seeing whether we need to get into another scoped block of code. Ah, okay, if we’re parked in the street, here’s the meat of the method – grab the umbrella!

Notice that in the ifception reading, you see words like “scope” and “block”. I’m having to think of scoping rules, brackets, nested conditionals, control flow and other language constructs. Each of these things has exactly nothing to do with whether I should bring my umbrella or not and yet I’m thinking of them. If you look at the flattened early return method, a similar thing is happening:

If it’s not raining, then return. Okay, assuming we’re still in the method, if it’s not true that I need to leave, then return. Okay, now if I’m still in the method, if I’m not parked in the street then return. Okay, if I’m still in the method, then get the umbrella.

“Return”? “In the method”? What does that have to do with whether I need an umbrella and getting that umbrella? And why do I care about “not is raining” if I’m trying to figure out whether to use an umbrella? If it’s not not raining, do I need an umbrella? I think… ugh.

An easy argument to make at this point is “Erik, you’re a programmer and you should understand things like scope and early returns — don’t be lazy.” While this is true, it’s also true that I’m capable of squinting and making out tiny, bright yellow font against a white background. In neither case is that enjoyable or a good use of my time and effort when it’s possible simply to have more clarity and ease of reading.

Programmers are paid to solve problems by handling and abstracting away complexity. This applies to end-users and fellow programmers as well. It’s easy to lose sight of this and believe that knowledge of particular languages, frameworks, etc is the end goal, but that knowledge is simply a tool used in pursuit of the actual end goal: problem solving. If methods are written in such a way as to make them read like “well written prose”, I don’t have to focus on language syntax and details. I don’t have to be acutely aware that I’m in the middle of a method, figuring out scope and/or when to return. Instead, I can focus on the business logic of my problem domain and meeting user requirements.

Method writing techniques that make language syntax an afterthought are good abstractions. They hide the bare-metal, nitty-gritty details of writing C# (or any language/framework) as much as possible, exposing only enough to facilitate understanding of the problems being solved. And while it’s never going to be possible to avoid all scoping, returning and other such method housekeeping, you can certainly arrange your methods in such a way as to minimize and hide it, rather than to distract readers by calling attention to it.


Learning to Love The Linq .Any() Method

There is a Linq extension method that I managed to miss for a long time: Any(). It’s fairly natural to miss this, I believe, since the paradigm of collections tends to be fully ingrained in our way of doing things. That is, if I want to see if any members of a collection match a certain criteria, I do something like this:

I cycle through my collection of features looking for “Direct” and, if I find one, the boolean flag is flipped to true. Otherwise, it remains false. If I wanted to get a little more efficient, I could break if the flag got flipped or do this in a two-condition while loop.

This hearkens back to the days of storing values in simple arrays in older languages than C# and before the existence of IEnumerable, foreach() and other such constructs. For those of us who learned this way, iterate and check is quite natural.

But then, Linq came along and gave us the handy-dandy Where(Func<T, bool>) extension method:

This accomplishes the same thing, but as a one-liner. That in and of itself is nice, but it gets doubly nice when you consider that the short-circuit optimization is built in to the implementation of Where(). Where() returns an enumeration, which is more or less a collection (yes, yes, I understand that it’s not really, but that’s how we tend to regard it), so the natural seeming thing to do is query it for a count. I remained content with this for a long time, until I discovered Any().

I can achieve the same thing with:

This is a lot more semantically clear. I’m not really interested in an integer comparison — I want to know whether there are any features matching my criteria. It may not seem important, but cutting down on noise in the code is very important for promoting readability since, in the excellent words of Grady Booch, “clean code reads like well-written prose.” In other words, this code says “do any features exist where the id is equal to ‘direct’?” as opposed to “is the count of features with id equal to ‘direct’ greater than zero?” The former reads a lot more conversationally than the latter.

But, we have one more improvement to make here:

In terms of prose-like, conversational reading, this is pretty much still “do any features exist where the id is equal to ‘direct’?” The “where” is implied here, I would say. But, now the code is more succinct, and there is one fewer method call. Brevity is generally good.

I’ve been using this for a while now, but I still encounter old code of mine where I wasn’t aware of this little gem and I encounter code written by others who aren’t aware of it. So, if you’re one of those people, be aware of it now and use in good health!



Quick Information/Overview

Pattern Type Structural
Applicable Language/Framework Agnostic OOP
Pattern Source Gang of Four
Difficulty Easy

Up Front Definitions

The Problem

Let’s say that you work on a web based piece of software that processes orders for customers in some sort of retail capacity and that there is a framework for a lot of existing operations with which you have to deal. You’re tasked with creating the UI for taking orders from customers. At your disposal, you have three handy functionalities: an order DAO for storing order information to the database, a receipt printer for printing receipts, and an email sender for sending confirmation emails. These things exist in different assemblies and your presentation layer assembly, fortunately, has references to all of them.

So, you write some code that looks like this:

Life is good. Your customer can now issue an order through your page and the view model that supports it. Making note of your success, your manager hands you a requirement to display recommendations to the screen following an order. Luckily, there is a module called RecommendationEngine that you can use for just such an occasion:

Excellent. Well, mostly, anyway. This class’s constructor is starting to get a little busy and it depends on sort of a hodgepodge of other assemblies. But, it’s probably nothing — adding that additional assembly reference for the recommendation engine was probably just an anomaly.

Besides, your manager is so impressed that it’s time to assign you to a different screen. As it turns out, there is the traditional order placement screen and a new express one for frequent customers. So, you need to write a new screen and ViewModel, which you do. Since you’ve already implemented the order execution logic, you just copy and paste it and the field declarations into the new ViewModel, initializing the fields in the constructor. You’re not thrilled about the copy and paste, but whatcha gonna do?

At this point, your manager points out a requirement that nobody had considered: users might want to access a virtual copy of their receipts later. You notice an assembly containing a class called ReceiptDao that sounds promising, and you add that:

Well, problem solved, though that dependency list is growing. The order logic now stores the receipt. You promote it and get a defect back from QA that the express ordering system doesn’t store virtual receipts the way the normal one does. Oops. Such is life when you copy and paste program — you forgot to change the other implementation. Ah, well, happens to the best of us.

But now, you get a series of assignments from the manager to add more functionality from other modules while simultaneously adding more places from which an order can be placed. That means adding a lot of code to the two implementations that already exist, bloating them, and then copying and pasting the resulting monstrosity to various other locations.

So, What to Do?

As always in this section, it’s important to identify where things went wrong. It’s debatable as to whether adding that fourth dependency was a problem or not — that constructor is getting pretty busy at this point and the dependency fan out is climbing. However, in and of itself, that may not be a problem.

It stops being debatable (as it always does) when they copying and pasting starts. That’s bad. Any time you find yourself doing that you’re failing at design. In this case, the most obvious fix is the one that will lead us to a better path. Specifically, we need to factor the duplicated code into a class. So, let’s introduce a class called OrderService:

Now, this service class can be used by any ViewModel client without duplicating logic. Here’s what the client now looks like:

The only duplication that might occur here is injecting and invoking the service (though that can be fixed among the ViewModels, but that’s beyond the scope of this post). We’ve eliminated the duplication that makes things like adding the receipt storage a pain point. Also, observe that if we put the service in a separate assembly from the ViewModels, we can now remove the assembly references to the various lower level services such as EmailSender, ReceiptPrinter, etc.

While this separation may seem like pushing our problems out a layer, there is a real benefit to doing this. Specifically, presentation layer classes should not be concerned with locating lower level components. They should be allowed to focus on marshaling data for presentation. Following the single responsibility principle suggests that presenting order confirmation to the user and navigating the mechanics of the subsystems required for an order are two separate responsibilities.

The five (and growing) dependencies that have been added to the constructor of the service to spare the ViewModel are still problematic in terms of good design. This could be fixed with further facades, if the application is large enough to support conceptual sub-layers. That is, there could be a general DAO facade and a general externalities facade that took care of the dao functions and email/receipts/recommendations respectively. This scheme would have the service knowing only about two collaborators instead of five while deepening the call stack hierarchy.

Another approach might be to consolidate some of these assemblies or classes at lower layers, if possible. It might also be reasonable, depending on the situation, simply to leave this service class alone, having abstracted this sub-system out of the presentation layer.

A More Official Explanation

DoFactory defines Facade as a pattern that aims to:

Provide a unified interface to a set of interfaces in a subsystem. Façade defines a higher-level interface that makes the subsystem easier to use.

This is pretty straightforward to understand and it constitutes one of the most basic and fundamental patterns of programming. Facades provide a layer of abstraction, hiding low level details from client code in favor of broader, easier to understand interactions. I have blogged extensively about the subject of abstraction, if you’d like a more in-depth treatment of the concept.

In general, facades form the back bone of solid architecture in code bases by virtue of hiding conceptually difficult details behind conceptually simple points of access. The crux of this pattern is hiding and managing complexity.

Other Quick Examples

Other incarnations of the Facade Pattern include:

  1. The iconic example of mortgage applications where lenders, banks, inspectors, etc are all necessary to complete the single process.
  2. A unified or single class that you write to hide an ugly third-party API or series of APIs.
  3. In a very real sense, the structure of a class is a bit of a facade since implementation details (private fields and members) are hidden behind the facade of the public API.

A Good Fit – When to Use

Facade is good any time you want to hide details of a process from client code. This can apply, as I mentioned, at the assembly/module level and it can even apply at the class and method level. If you practice good software design this is a pattern that you will find yourself using constantly. It is the essence of providing good abstractions.

Square Peg, Round Hole – When Not to Use

The main usage anti-pattern of a facade is providing a layer or “wrapper” that really does nothing or fails to simplify things for clients. There can be a fine line between organizing code and hiding details and introducing pointless layering, so it is important to consider whether the facade that you’re adding is either (1) making things easier or (2) hiding details that should be hidden. If it’s doing neither of these things, it probably shouldn’t exist.

So What? Why is this Better?

Generally speaking, facades can help eliminate redundant logic and they can simplify interaction with a system. These are two of the most important considerations in all of software. Redundancy is the bane of maintainability and abstraction is the backbone of understandable systems. Well placed facades (such as layered architecture, for instance) provide both in one shot. The end result is that your code will be well factored and decoupled, easy to maintain, and easy to reason about.


Faking Tomcat Support in IntelliJ IDEA Community Version

A while back, I blogged about my switch from Eclipse to the community (free) version of IntelliJ IDEA. After that, I went on a number of trips out of town, felt out the job market a bit, and generally engaged in a series of activities that have kept me too busy to work on my home automation. In the last week, however, that’s changed and I’ve revisited it.

The first thing that I decided to do was play with Ant a bit to create an environment where I could build and run the application locally. I’m in a bit of a catch-22 with IntelliJ at the moment. I might be interested in purchasing the Ultimate Edition, but not before I take it out for a few months and kick the tires. But, in order to do that without pain, I kind of need the server support for what I’m doing, which is only available in Ultimate. So, I’m faking it. I created the following Ant build.xml:

The other build targets are somewhat standard, if memory serves (I was never particularly adept with Ant, and now I’m several years removed from using it). But, the interesting one is the “war” target, which I’m probably going to rename here shortly. This first compiles the project into a deployable WAR file and plops it into the local Tomcat webapps directory, which is where the it needs to be for me to run it locally in a browser.

With that in place, I’ve added the next line, which starts the web server. In the way I invoke it, a command window pops up, and I can stop the server by closing it. Once the server is started, I launch chrome to the URL where the application will be. This works, but I’d like to refine it a bit. Here are some things that I think should happen:

  1. First of all, DRY isn’t just for code. I’m defining “Daedalus.war” and “http://localhost:8080/Daedalus” and if I decide to change the deployable name, there are two failure points. I’m going to correct that.
  2. And, as long as I’m changing the “destfile” property of the war file, I might as well define a descriptive constant for the Tomcat webapps directory.
  3. Secondly, “browser” isn’t a particularly good property name for Chrome, so I’m going to rename it.
  4. Along those same lines, I’m going to create a property for Internet Explorer as well
  5. I’m also going to create build targets for chrome and internet explorer so that I can choose which one to run without modifying the build.xml file.
  6. Finally, I’m going to decouple the building of the War from the starting of the server and decouple both of those from opening a browser. That way I won’t have to repeat myself in the internet explorer and chrome versions.

Here’s what it looks like now:

So, I’m relatively pleased with this for the time being. Two small caveats with this approach, though. The first is that I don’t really like duplicating the “http://localhost:8080″ as the server’s localhost, but I’m unlikely ever to change this, and if I abstract that out to a constant, it fails for some reason that I haven’t seriously looked into (and realistically, probably won’t for now). The second is that I start the server, but you have to manually stop it. That’s kind of ick, but I’ll live with it for now. If/when I put together a better scheme, I’ll post it.

But, with this you get support for Tomcat with IntelliJ IDEA community edition. It’s obviously not nearly as snazzy as having it integrated to the IDE, but if you’re in my boat, this is nice. I’ve automated most of the stuff that I’d rely on the IDE to do, and I’ve done it without spending a few hundred dollars I’m not yet sure I can justify. Cheers if this helps you.


You Need CodeRush


The other day, I was chatting with some developers and one of them pulled me over to show me the following code:

I did a double take and then smirked a bit, pointing out that this was probably dead code. I asked the person who was showing me this to do a “find all references” and there actually was one:

So, we then backed it up a step and discovered that this was an event handler for an event declared in XAML (this whole thing was in a code-behind file). So, every time some event occurred (some sort of navigation thing), the code would cast the sender as a page, call a method with that page, and then return if the page was null, and also return if the page wasn’t null.

How Did It Come To This?

Looking at the SetLayout method, I quickly developed a hypothesis as to how the code wound up this way. The “SetLayout” method probably checked the page for null as a precondition prior to performing some sort of action on Page. At some point, that action became undesirable and was removed from the code, but whoever removed it didn’t realize that the precondition check was now meaningless, as was the whole chain of code above it for the event handler. This was confirmed correct by a quick glance through source control.

Here’s what this looks like in my IDE:

Do you see the dashes on the right, next to the scroll bar? Those are the issues in the CodeRush “issues list”. Do you see the squiggly underline of “SetLayout” (hard to see at this resolution, but you can zoom in or open the image in a new tab if you want)? That means there’s an issue here. In the interests of full disclosure, this is actually to tell me that the method could be static (it operates on no instance member of the class) rather than to tell me that the method has no effect on anything, but the important thing is that it’s underlined. It draws my attention to the method instead of encouraging me to skip over it while looking for something else here. And, as soon as this method catches anyone’s attention, it will obviously be deleted.

The Value of Productivity Tools

In general, the Code Rush issues list is a great companion. It alerts me to dead code, undisposed disposables, redundant statements, and plenty of other things. If you pause any time it tells you about an issue, you will become better at programming and you will start to recognize more problems. It isn’t necessarily superior to using, say, Style Cop or any other particular tool, but it is very convenient in that it paints the problems right inline with the IDE. They’re quite noticeable.

Code Rush’s competitor, Resharper does this as well. I plug for Code Rush because I use it and I love it, but if you get Resharper, that’s fine — just get something. These productivity tools are game changers. When looking at this code in someone else’s IDE, there was no squiggly and no immediately obvious problem if you’re casually scrolling through the class. But in my IDE, my eyes are drawn right to it, allowing and in fact demanding that I fix it. This sort of boy-scouting makes you better at writing your own code and better at fixing other people’s.

I forget what the license costs, but your employer can afford it. Heck, if they’re too cheap, you can afford it. Because what you can’t afford is to have code checked in under your name that looks like the code in this post. That’s no good, and it’s avoidable.


Developer Process Gerrymandering

Gaming the System

As projects get a little behind schedule or perhaps a little contentious in terms of scope, I’d say it’s common for people to start taking steps to indemnify themselves against blame. I’d also say that this is fairly natural. There’s nothing more likely to attract wagging management fingers than a project behind schedule and/or over budget, so it makes sense to do a little rehearsing as to what one is going to tell them. Furthermore, it’s somewhat natural to try to showcase that one’s own efforts were a general positive even against a negative backdrop.

This is reminiscent of the “plus/minus” tracking in sports such as basketball, where it’s possible for a team to outscore its opponent when player X is on the floor, but still lose. The most common reason for this is that player X is a star, but the rest of the team is so bad that even his or her stellar play cannot overcome the shortcomings of the rest of the team. Long story short is that every developer wants to be that player X on a winning team if possible. But, absent that, being player X on a losing team is fine, so long as management and peers notice.

In such a situation (potentially “losing effort”), I’ve recently observed the following developer behaviors:

  1. Obvious (to me, if not management) estimate sandbagging.
  2. Wheedling, begging, berating, manipulating and generally badgering defect reporters into retracting reported defects.
  3. Refusal to fix obvious and embarrassing problems with the software because said solutions weren’t “in the requirements statement”.

These behaviors aren’t particularly difficult to understand in the context of seeking to create the impression of a better “plus/minus” score for a developer. Consider the following explanations:

  1. Estimate sandbagging allows a developer to finish way ahead of schedule, creating a heroic sort of aura for a time
  2. Improving code quality is one way to reduce the number of defects reported against one’s code. Browbeating people into not reporting the defects is another way to do this and more expedient in the short term, in a sense.
  3. Refusing to address something on the grounds that “I never had a requirement for this” is a nice two-fer; it creates the impression that not addressing the shortcoming was a conscious decision rather than an oversight and it also identifies a different blame target (whoever is responsible for requirements).

This frank categorization may induce a chuckle or two, but I doubt I’m breaking any drastically new ground for people who work in programming shops where this kind of thing probably occurs with frequency somewhere between “here and there” and “constant”. I do think there is value in coining some terms around it though so that we can examine the issues it causes, its own root cause, and some potential solutions.

Elbridge Gerry, An Inspiration

In the early 1800’s Massachusetts had a governor by the name of Elbridge Gerry. A Democratic-Republican by party affiliation, Gerry saw trouble in the tea leaves for his own party in the state senate and decided to do something about it. You might think that he decided to take his case to the people, touting the benefits of his party and convincing them to vote for Democratic-Republicans. You’d be wrong though.

Convincing the public to like you is hard and time consuming and there’s no guarantee of success. Gerry had a more expedient idea. He simply signed a law that radically altered the voting districts in his state in such a way that a majority would result in the state senate without actually having the voters to support it. One of the newly created districts was so contorted and weirdly shaped that political opponents said it looked like salamander. Gerry’s Salamander became the term “Gerrymander” and the phrase stuck around. Today it retains its definition as a sleazy but legal practice in which politicians stack the deck in favor of their own party as they’re dealing out political cards to the voters.

Getting back to the topic of software development, the behaviors I mentioned in the first section and their attendant explanations are all examples of a behavior similar in intentions to gerrymandering. In our world, the ideal for a software group is obviously to write good code, provide good estimates, and deliver quality software on time for a good value. As an individual the ideal is to have a good “plus/minus” within the group toward that same end — to write better than average code, delivered more quickly than average for less money than average. That’s all noble, but it’s also hard. And so some developers prefer to emulate Gerry and change the rules of the game (distort time, change definitions, employ technicalities) rather than playing it well. In the “plus/minus” world, this is like lobbying to play against the third string of the other team to pad one’s statistics.

Eliminate the “Plus/Minus”, Eliminate the Gerrymandering

So, if you’re managing a development team, how do you eliminate the counter-productive gerrymandering behavior? Eliminate the individual “plus/minus” ratio mentality. Imagine that you have a software project and you tell everyone working on it that their performance and pay/bonus were going to be dictated by the ranking on a scale of 1-10 that users gave the software in a survey (encompassing usability, quantity/quality of features, etc). Let’s think about what happens to the three behaviors at the start of the post:

  1. Estimate sandbagging is pointless because coming in ahead of a schedule that you’ve defined has no bearing on your bonus and performance review.
  2. Badgering defect reporters makes no sense because the goal is fewer defects in the user’s hands rather than fewer defects reported against the individual.
  3. Refusal to fix problems (finger pointing) is also silly because whining that someone else is at fault is cold comfort in the face of a bad review and no bonus

Now, I don’t necessarily believe that there will be entirely smooth sailing in this or any other construct. But, eliminating the focus on individual incentives and performance has some nice side effects including the one on which we’re focused: removing the tendency to try to game the system without adding value. (Another nice side effect is that you tend to downplay the unfortunate “rockstar” designation that panders to megalomaniacal personality types who tend drastically to overestimate their own irreplaceability and scare others into the same).

But How to Measure Individual Performance

Still, the question of individual performance evaluation remains. It’s all well and good to reward or punish the whole team as a unit, but there are HR org charts, career maps and other individual concerns to think of. So how do you address this while still avoiding the gerrymandering?

It’s all about the code and source control. There are a lot of metrics out there for evaluating developer productivity from the obtuse (counting lines of code generated or altered) to the anecdotal (does this guy seem to do a good job), but I’m not really aware of one that captures the truth as well as looking at someone’s changesets/commits in the context of a code base. So, what if you had some impartial and extremely knowledgeable party read through a code base, looking at its architecture and tendencies, and then read through the change sets to give each developer some kind of rating or at least partial rating.

It seems almost as if you could create a cottage industry out of this and offer it as a consulting service. This way, the consultant is truly removed from any office politics and can focus entirely on the code. If one did this as a craft, I’d imagine that the addition of data points to the general corpus would cause the process to become quite refined over the course of time and probably generate some relatively objective, interesting and meaningful metrics.

Of course, this wouldn’t be perfect. It’s only as good as the evaluator is sharp, and it has the shortcoming of not capturing “intangibles” on a project — maybe someone wrote little code but spent a lot of time helping other people with source control and other technical issues, for instance. Code and changesets are mute witnesses to the story of the project, but they don’t witness everything.

Still, I think the notion is an interesting enough one that it bears exploring. Developer (or any team member) posturing and gerrymandering are counter-productive activities that put the developer’s interests above the projects but, more problematically, misalign those two sets of interests. The combination of team effort and metric individual evaluation could help prevent that sort of thing.


Abstractions Are Important 5 – Type Consistency

For the fifth post in this series, I’m going to start with a mini rant.

A Digression (Rant) About Enum

Over the last couple of years, enumerations/enums have been dying a slow death in the world of code that I write. I wasn’t every really an avid user of them, but they’ve definitely been declining even for me to the point of virtual-non existence. I’m not sure exactly what it is about them or about me that’s spurring this, but I’m not sorry about it at all. I don’t miss them.

I think perhaps the motivation has been an increasing desire to use polymorphism at all levels, even when the object I’m making doesn’t seem “classworthy”. That is, why would I create a “CardSuit” enum when I can just create a first class type for Suit? I think perhaps another factor in this approach of mine is that enums tend to go hand-in-hand with switches and I consider this pairing to be an anti-pattern. Even with an enum like “Suit” that is really just representing mutual exclusion, this is inevitable somewhere:

And, why do that when I could just have something like this:

This way, I can later plop suit specific designations into inheritors to my heart’s content without hunting for switch statements littered throughout the code. (Also, for language agnostic readers, C# enums are a different beast than their Java counterparts — in C#, they’re really just glorified collections of constants).

Apparently, I’m not alone in this sentiment. Just to stir things up, I googled “C# enums are evil” and came up with this interesting link from stackoverflow guru Jon Skeet:

Since working on a Java project last year, I’ve been increasingly fed up with C#’s enums. They’re really not very object oriented: they’re not type-safe (you can cast from one enum to another via a cast to their common underlying type), they don’t allow any behaviour to be specified, etc. They’re just named constant integral values. Until I played with Java 1.5’s enum support, that wouldn’t have struck me as being a problem, but (at least in some cases) enums can give you so much more.

It’s good to see that the man who defines “10” when a recruiter or someone asks you to rate yourself 1-10 on strength in a language feels the same way as me. And, I think he really nails it with the non object-oriented comment. Enums make me feel like I’m writing kernel code in C or something when I use them.

If You’re Going to Use Them…

All that said, it’s not as if they’re going to be excised from the language tomorrow — we might as well make sure they’re used in a way that makes sense if and when they are used. In a code base that I’m in from time to time (and I’m obfuscating the problem domain a bit, but leaving the intent and meaning intact), the concept of “side” exists in the sense that anything in the domain must be left or right. Think of it as though we’re shopping for a pair of shoes. Here is how “side” is represented:

At first blush, this makes sense. We might have no shoes on, the left shoe, the right shoe, or both shoes. But, ask yourself this: “what is ‘side’ and what is it made of?” Left and right are somewhat mundane, but what about none and both? Is “none” a side? Is “both”? Do I put my “none” shoe on before the left shoe and right shoe, at which time I put on the “both” shoe? Can you infer the usage of this thing from looking at it? No, you really can’t…

And the reason you can’t infer the usage is that the enum consists of two sides and two expressions of quantity of sides. This enum is a chameleon — depending on where you’re standing and what part of it you’re looking at, it can be two different things. And, I submit that this is bad — if you’re going to use enums at all, use them to represent simple, mutually exclusive concepts (like the aforementioned “Suit” in the problem domain of playing cards).

What’s the Harm?

Let’s take a look at an example (stripped down for brevity) client of this enumeration:

What we’ve got here is kind of cute and clever. If I want the client to put a shoe on, I pass in the side. But, if I want to tell it to put both on, then I can just pass in Side.Both, and the method knows how to handle this. Sweet! The only immediate drawback is the fact that we have to handle Side.None. Clearly that should be a no-op that clients should avoid, but it’s just as valid as any of the other things from the compiler’s perspective, so we manually no-op.

But, is this cute bit of recursion actually as sweet as it seems? What if we write another method like this called “TakeShoesOff”? What if we write a class called ShoeSalesman that retrieves pairs of shoes? We almost always want him retrieving both shoes for clients, so we’re probably going to want to perpetuate this pattern into his methods as well, probably by copy and paste to save time. How about a ShoeStoreCashier ringing up pairs of shoes? We can take care of that with our good buddy copy and paste too. There’s no method about shoes that we can’t handle that way!

But, wait a second? Isn’t this starting to be kind of a code smell? If this implementation is so sweet, why does it seem all wet in the face of DRY? If you have actually done this in hundreds or thousands of places, you should be getting a sinking feeling in your stomach right about now. That feeling is the feeling that your code is suddenly and subtly incredibly brittle. This cute little encoding over the enum is actually an algorithm that is everywhere in your code.

Let’s say that I want to get in on some cuteness myself. What I’d like to do is write a method and say, “I don’t care which side you pass me, so long as a side exists — I’ll just perform an operation on the first side that I have available, if any.”

I’ve now extended the cuteness to be more “flexible”, and I’m pretty pleased with myself, so I go ahead and deliver this code. No unit tests break, no problems emerge that I can immediately see, so life is good. But then, weird problems start to crop up after a while. People file bug reports saying that they add shoes and see nothing on the screen or that they remove shoes but nothing is deleted from the database. It’s kind of a mystery at first, but I’m forced to get to the bottom of this as the trickle of bug reports starts becoming an avalanche.

What’s going on?!? Well, what’s going on is that all of the Cute 1.0 code can’t handle the new enum I’ve defined for Cute 2.0. So, what does it do? Well, sometimes it skips it altogether and no ops. Sometimes it adds a new key to the dictionary — a key for which it never checks. Sometimes it throws some kind of exception where it falls into a “default” state in a switch statement someone has defined. Sometimes it throws up an error message box informing the user, “This should never happen — email Erik!” for the same reason, which is doubly bad since I’m probably going to be featured on the Daily WTF.

Wow, bummer. But, no big deal. I’ll just roll up my sleeves and upgrade Cute 1.0 to Cute 2.0 everywhere. How many can there be, right? Uh oh. Hope I’m not doing anything this weekend. For everything copy and paste programming lacks in being a good idea, it makes up for in its ability to spread through a code base like kudzu. It’s too late to revert Cute 2.0 since I now have clients that depend on it, so I’d better roll up my sleeves and get pastin’ because it’s going to be a long Saturday with some Mountain Dew and 7,400 methods that I need to change.

The Real Problem

It should be fairly obvious that any “pattern” requiring you to add 5 or 10 lines to the beginning of a bunch of methods is actually an anti-pattern, particularly if those lines are similar or identical. And some might stop here and say that getting cute in the first place here was the problem, but I contend that this is just a symptom. To tie this back in with the series, the real problem is one of abstractions.

As I asked earlier, what is “side”? Really, can you tell me? I mean if I have a “Customer” object in my domain, you’d probably say something like “that models a customer of the enterprise for which this application was written” or else maybe you’d just smack me upside the head for asking such an obtuse question. But for “Side” in Cute 1.0? Without using the word “side” in your definition? You might say “well, it represents where we can put a shoe” or “it represents the directions left and right”, and you’d be correct for two of the enums values. You might say “well, it represents a number of places that you can put a shoe” or “its a flag that you need to use to tell your methods how to behave” and you’d be right for the other two values. Hmmm….

I know! It’s a doo-dad you can use to index a hash! That’s probably the most accurate way to describe it because that is unfortunately true of all 4 values. Unfortunate because for two of the values, it makes no absolutely sense to do that — but at least it’s true, so we’re getting somewhere. At the end of the day, though, I think all you can really say of Side is “it’s an enum and it’s up to clients to figure out what to do with it.” And that, my friends, makes it a bad abstraction.

Here are some other enums that would be poor abstractions

If you’re going to use enums, they ought to be values that are mutually exclusive, constitute a complete set, and form a clear abstraction. The first one is obviously nonsense, but the second two are enums where the person writing/extending them is about to get cute. They’re about to take an enum that has a set of mutually exclusive values and reappropriate it to use as a flag to tell their client code how to behave. Think of the code that’s about to be written — things like “if the suit is black, then do it for spades and clubs” and “if the person passed in has favorite dance property set to doesn’t like to dance…”

If you find yourself doing these kinds of cute things ask yourself what you’re really trying to accomplish. In the suit case, wouldn’t it make more sense to parameterize a method so that you could pass in multiple suits for the client code to operate on, instead of hard-coding the iteration? In the dance case, maybe it’s time for dance to be a first class object so that a “FavoriteDance” property can be set to null or a null object.

In C#, enums are already pretty much screaming “hey, there’s something called polymorphism — use that instead of me!” Once you start adding cute, one-off flag encodings to them, you’re really dropping all pretense of enumeration being a suitable abstraction and going for the gusto with fake, procedural polymorphism forced on your clients. Please, for your sake and mine if we later work together, don’t do that. The world doesn’t have a “class reserve” ala oil that may be exhausted someday — make a class. You won’t regret it because you won’t be spending your weekends upgrading The Cute.


TDD Prevents Copy and Paste Programming

Copy, Paste, Fail, Repeat

Today, I was reviewing some code, and I saw a “before” that looked like this:

The first thing that stuck out to me was that I really wanted to stick “var” in there a couple of times because I read the word “classifier” so many time it lost all meaning. The second thing that struck me was the after:

If you’d like to play a game of “find the compiler warning”, by all means, go ahead, but I warn you — the next sentence is a spoiler. In the new code, “i” is never read anywhere, which will generate a warning about unused variables. Annoying, but not the end of the world, and a fairly quick fix.

But, how did the code get like this? I’ve isolated the change in a way that should make this fairly obvious if you think about it for a minute or two. Basically, someone took the first method, copy and pasted it, and modified the payload of the foreach to suit his or her needs. Of course, the new need didn’t involve reading the local index variable, so oops.

I talked with the developer whose code change it was, doing my due diligence in pointing out that this is a classic example of one of the many problems with copy/paste programming. I tried to think back to a time when I’d been burned by this sort of thing recently, when it dawned on me that this simply wouldn’t happen to me. I say that not because I’m some kind of purist that disables ctrl-V and never pastes any code anywhere (though I do keep this to a pretty strict minimum), but rather because this is simply a non-starter for someone who practices TDD.

TDD Is Full of Win

If I were going to write the method RefreshClassifierStatus, the first test that I would write is that Classifier[0].Enabled was false when I passed in false. That would go red, and then I’d write some obtuse code that set Classifier[0].Enabled to false. Next, I’d write a test that said it was true when I passed in true and, after failing, I’d set that property to the passed in value. Finally, I’d write a test that Classifier[Classifier.Count – 1].Enabled was true when true was passed in and suddenly I’d have a foreach loop executing properly.

As an aside, at this point, I’d set the list to null and write a test for how the method should behave when that happens. This seems not to have been considered in this code, which is another problem with copy/paste programming — it’s a multiplier for existing mistakes.

So where in this, exactly, do I introduce an unused variable? That’s never going to help me get any test to go green. Where in this do I ever copy and paste the method wholesale? That’s not the simplest thing that I can do. So, with these two guiding heuristics, followed rather strictly, it’s simply impossible for me to introduce needless noise into the code. TDD forces an impressive level of efficiency most of the time.

TDD is Full of Fast

Had I written this method, I would have done it in probably 5 minutes or so (and that’s a fairly pessimistic estimate, I think), generating 4 unit tests for my trouble. The person that wrote this, I’m guessing, spent 30 seconds or less doing it. So, I’m behind by 4.5 minutes. But, let’s add to that the time that this developer will spend after seeing the code review firing back up the development environment, navigating to the code, removing the line of code, re-compiling, and re-running to make sure that it’s still okay. I’m now only maybe 2 minutes behind.

Now, let’s also factor in that I just realized that problem about the null check, which I’m going to email over and suggest be fixed in the new and old methods. Now there are two test cases instead of 1, and I’m suddenly ahead in terms of development time.

Okay, so the time here is a bit contrived and it’s not as though mistakes can’t be made with TDD, but I can say that a lot fewer of them are. The point here is that copy/paste programming is supposed to be super fast. It’s what you do when there’s no time for good software development practice because you need to hurry. And yet, it’s really not that fast in the grand scheme of things. It just makes you busier and wrong more efficiently than being wrong by hand.

So, save yourself time, effort, and the headache of getting a reputation for sloppy work. Slow down and do it right. TDD isn’t required for this, but it sure helps.

Acknowledgements | Contact | About | Social Media