Stories about Software


Are You Changing the Rules of the Road?

Happy Friday, all.  A while back, I announced some changes to the blog, including a partnership with Infragistics, who sponsors me.  Part of my arrangement with them and with a few other outfits (stay tuned for those announcements) is that I now write blog posts for them.  Between writing posts for this blog, writing posts for those blogs, and now writing a book, I’m doing a lot of writing.  So instead of writing Friday’s post late Thursday evening, I’m going to do some work on my book instead and link you to one of my Infragistics posts.

The title is, “Are You Changing the Rules of the Road?”  Please go check it out.  Because they didn’t initially have my headshot and bio, it’s posted under the account “DevToolsGuy,” but it’s clearly me, right down to one of Amanda’s signature drawings there in the post.  I may do this here and there going forward to free up a bit of my time to work on the book.  But wherever the posts reside, they’re still me, and they’re still me writing for the same audience that I always do.


What Story Does Your Code Tell?

I’ve found that as the timeline of my life becomes longer, my capacity for surprise at my situation diminishes. And so my recent combination of types of work and engagements, rather than being strange in any way to me, is simply ammo for genuineness when I offer up the cliche, “variety is the spice of life.” Of late, I’ve been reviewing a lot of code in a coaching capacity as well as creating and giving workshops on storytelling and creative writing. And given how much practice I’ve had over the last several years at multi-purposing my work, I’m quite vigilant for opportunities to merge storytelling and software advice. This post is one such opportunity, if a small one.

A little under a year ago, I offered up a post in which I suggested some visualization mnemonics to help make important software design principles more memorable. It was a relatively popular post, so I assume that people found it helpful. And the reason, I believe, that people found it helpful is that stories engage your brain far more than the simple conveyance of information does. When you read a white paper explaining the Law of Demeter, the part of your brain that processes natural language activates and decodes the words. But when I tell you a story about a customer in a convenience store removing his pants to pay for a soda, your brain processes the text as if it were experiencing the event. Stories really engage the brain.

One of the most difficult aspects of writing code is to find ways to build abstraction and make your code readable so that others (or you, months later) can read the code as easily as prose. The idea is that code is read far more often than written or modified, so readability is important. But it isn’t just that the code should be readable — it should be understandable and, in some way, even memorable. Usually, understandability is achieved through simplicity and crisp, clear abstractions. Memorability, if achieved at all, is usually created via the Principle of Least Surprise. It’s a cheat — your code is memorable not because it captivates the reader, but because the reader knows that mapping from what she’s used to will probably work. (Of course, I recognize that atrocious code will be memorable in the vivid, conversational sense, but I’m talking about it being memorable in terms of its function and exact behavior.)

It’s therefore worth asking what story your code is telling. Look at this code. What story is it telling?

Read More


What I Learned from Learning about SpecFlow

In my ChessTDD series, I was confronted with the need to create some actual acceptance tests.  Historically, I’d generally done this by writing something like a console application that would exercise the system under test.  But I figured this series was about readers/viewers and me learning alongside one another on a TDD journey through a complicated domain, so why not add just another piece of learning to the mix?  I started watching a Pluralsight course about SpecFlow and flubbing my way through it in episodes of my series.

But as it turns out, I picked up SpecFlow quickly.  Like, really quickly.  As much as I’d like to think that this is because I’m some kind of genius, that’s not the explanation by a long shot.  What’s really going on is a lot more in line with the “Talent is Overrated” philosophy that the deck was stacked in my favor via tons and tons of deliberate practice.

SpecFlow is somewhat intuitive, but not remarkably so.  You create these text files, following a certain kind of format, and they’re easy to read.  And then somehow, through behind-the-scenes magic, they get tied to actual code files.  Not the generated “code behind” for the feature file, which is hard to read, but code files that you tie to the features yourself in one of a few different ways.  SpecFlow in general relies a good bit on this magic, and anytime there’s magic involved, relatively inexperienced developers can easily be thrown for loops.  To remind myself of this fact, all I need to do is go back in time 8 years or so to when I was struggling to wrap my head around how Spring and an XML file in the Java world made it so that I never invoked constructors anywhere.  IoC containers were utter black magic to me; how does this thing get instantiated, anyway?!
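
To make that binding between plain text and code a little more concrete, here’s a minimal sketch of what a SpecFlow step class can look like. The feature wording, the class names, and the toy Board type are all invented for illustration; the real ChessTDD code is richer than this.

```csharp
// A feature file (plain text, Gherkin format) might read something like:
//
//   Feature: Pawn movement
//     Scenario: Pawn advances one square
//       Given a pawn on square "a2"
//       When I move the pawn to "a3"
//       Then the move is legal
//
// SpecFlow's "magic" is matching each step's text against the regular
// expressions in the attributes below and invoking the matching method.
using System;
using TechTalk.SpecFlow;

[Binding]
public class PawnMovementSteps
{
    private Board _board;
    private bool _moveWasLegal;

    [Given(@"a pawn on square ""(.*)""")]
    public void GivenAPawnOnSquare(string square)
    {
        _board = new Board();
        _board.PlacePawn(square);
    }

    [When(@"I move the pawn to ""(.*)""")]
    public void WhenIMoveThePawnTo(string square)
    {
        _moveWasLegal = _board.TryMovePawn(square);
    }

    [Then(@"the move is legal")]
    public void ThenTheMoveIsLegal()
    {
        if (!_moveWasLegal)
            throw new Exception("Expected a legal move.");
    }
}

// Toy stand-in so the sketch is self-contained; not real chess logic.
public class Board
{
    private string _pawnSquare;
    public void PlacePawn(string square) { _pawnSquare = square; }
    public bool TryMovePawn(string target) { _pawnSquare = target; return true; }
}
```

The step text, not the method names, is what drives the matching; rename the methods and the binding still works as long as the regular expressions line up with the feature file.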


Read More


Flexibility vs Simplicity? Why Not Both?

Don’t hard code file paths in your application. If you have some log file that it’s writing to or some XML file that it’s reading, there’s a well-established pattern for how to keep track of the paths of those files: an external configuration scheme. This might be a .config file or a settings.xml file or even a yourapp.ini file if you’re gray enough in the beard. Or, perhaps it’s something more exotic, like a database table or web service that stores key-value configuration pairs. Maybe it’s something as simple as command line parameters that specify the path. Whatever the case may be, everyone knows that you don’t hard code — you don’t store the file path right in the source code. That’s amateur hour.
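
For the sake of concreteness, here’s a minimal sketch of the .config flavor of this pattern. The “LogFilePath” key, the fallback path, and the class name are illustrative choices, not a prescription.

```csharp
// App.config (XML) would contain something like:
//
//   <configuration>
//     <appSettings>
//       <add key="LogFilePath" value="C:\Logs\myapp.log" />
//     </appSettings>
//   </configuration>
using System.Configuration;   // requires a reference to System.Configuration.dll
using System.IO;

public static class LogFileLocator
{
    public static string GetLogFilePath()
    {
        // Pull the path from configuration rather than baking it into source.
        string configured = ConfigurationManager.AppSettings["LogFilePath"];

        // Fall back to the user's temp directory if the setting is absent,
        // so a missing config entry doesn't blow up spectacularly.
        return string.IsNullOrEmpty(configured)
            ? Path.Combine(Path.GetTempPath(), "myapp.log")
            : configured;
    }
}
```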

You can imagine how this began. Maybe a long time ago someone said, “hey, let’s log critical application events to a file so that we can review and troubleshoot if things go wrong.” They shipped this for some machine running Windows 3.1 or something, and they were logging to C:\temp, which was fine unless users didn’t have a C:\temp directory. In that case, it blew up spectacularly, and they were flooded with support calls, at which point they could tell their users to create the directory or they could ship a new set of floppy disks with the new source code, amended to log to a directory guaranteed to exist. Or something like that, anyway.

The lesson couldn’t be more obvious. If they had just thought ahead, they would have realized their choice for the path of the log file, which isn’t even critical anyway, was a poor one. It would have been good if they had chosen better, but it would have been almost as good if they’d just made this configurable somehow so that it needn’t be a disaster. They could have made the path configurable or they could have just made it a configurable option to create C:\temp if it didn’t exist. Next time, they’d do better by building flexibility into the application. They’d create a scheme where the application was flexible and thus the cost of not getting configuration settings right up-front was dramatically reduced.

This approach made sense and it became the norm. User settings and preferences would be treated as data, which would make it easy to create a default experience but to allow users to customize it if they were sufficiently sophisticated. And the predecessor to the “Advanced” menu tab was born. But the other thing that was born was subtle complexity, both for the users and for the programmers. Application configurability is second nature to us now (e.g. the .NET .config file), but make no mistake — it is a source of complexity even if you’re completely used to it. Think of paying $300 per month for all of your different telco concerns — the fact that you’ve been doing this for years doesn’t mean you’re not shelling out a ton of money.

What’s even more insidious is how this mentality has crept into application development in other ways. Notice that I called this practice “preferences as data” rather than “future-proofing,” but “future-proofing” is the lesson that many took away. If you design your application to be flexible enough, you can insulate yourself against bad initial guesses about user preferences or usage scenarios and you can ensure that the right set of tweaks, configuration alterations, and hacks will allow users to achieve what they want without you needing to re-deploy.

So, what’s the problem, apart from a huge growth in the number of available settings in a config file? I’d argue that the problem is the subtle one of striving for configurability as a first class goal. Rather than express this in general, definition-oriented terms, consider an example that may be the logical conclusion of taking this thinking as far as it will go. You have some method that you expose publicly, called ProcessOrder, and it takes a parameter. Contrary to what you might think, the parameter isn’t an order ID and it isn’t an order: it’s an object. Why? Because this API is endlessly flexible. The method signature will suffice even if the entire order processing mechanism changes and even if the structure of the order itself changes. Heck, the signature won’t even need to change if you decide to make ProcessOrder(object order) send emails. Just pass in an “Email” object and add a check for typeof(Email) to ProcessOrder. Awesome, right?
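
Sketched out in code (with Order and Email as invented placeholder types), the “endlessly flexible” version looks something like this, and the type-sniffing is exactly where the trouble hides:

```csharp
// Invented placeholder types for the sketch.
public class Order { /* order details elided */ }
public class Email { /* email details elided */ }

public class OrderProcessor
{
    // "Endlessly flexible": the signature promises nothing, so the method
    // plays guess-the-type and every caller has to know the unwritten rules.
    public void ProcessOrder(object input)
    {
        if (input is Order)
        {
            // ... actually process the order ...
        }
        else if (input is Email)
        {
            // ... send an email (why is order processing doing this?) ...
        }
        // Anything else falls through silently, or throws, depending on
        // which maintainer touched this last.
    }
}

// The boring alternative: say what you mean in the signature.
public class SimplerOrderProcessor
{
    public void ProcessOrder(Order order)
    {
        // ... process the order; email lives somewhere else entirely ...
    }
}
```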


Yeah, ugh. Flexibility run amok. It’d be easy to interpret my point thus far as “you need to find the balance between inflexibility/simplicity on one end and flexibility/complexity on the other.” But that’s a consultant answer at best, and a non-point at worst. It’s no great revelation that these tradeoffs exist or that it’d be ideal if you could understand which trait was more valuable in a given moment.

The interesting thing here is to consider the original problem — the one we’ve long since filed away as settled. We shipped a piece of software with a setting that turned out to be a mistake, so what lesson do we take away from that? The lesson we did take away was that we should make mistakes less costly by adding configurability as an out. But what if we made the mistake less costly by making rollouts of the software trivial and inexpensive? Imagine a hypothetical world where rollout didn’t mean shipping a bunch of shrink-wrapped boxes with floppy disks in them but rather a single mouse click and high confidence that everything would go well. If this were a reality, hard-coding a log file path wouldn’t really be a big deal, because if that path turned out to be a problem, you could just alter that source code file, click a button, and correct the mistake. By introducing and adjusting a previously unconsidered variable, you’ve managed to achieve both simplicity and flexibility without having to trade one for the other.

The danger for software decision makers comes from creating designs with the goal of satisfying principles or interim goals rather than the goal of solving immediate business problems. For instance, the problem of hard-coding tends to arise from (generally inexperienced) software developers optimizing for their own understanding and making code as procedurally simple as possible — “hardcoding is good because I can see right where the file is going when I look at my code.” That’s not a reasonable business goal for the software. But the same problem occurs with developers automatically creating a config file for application settings — they’re following the principle of “flexibility” rather than thinking of what might make the most sense for their customers or their situation. And, of course, this also applies to the designer of the aforementioned “ProcessOrder(object)” API. Here the goal is “flexibility” rather than something tangible like “our users have expressed an interest in changing the structure of the Order object and we think this is a good idea and want to support them.”

Getting caught up in making your code conform to principles will not only result in potentially suboptimal design decisions — it will also stop you from considering previously unconsidered variables or situations. If you abide by the principle “hard-coding is bad” without ever revisiting it, you’re not likely to consider “what if we just made it not matter by making deployments trivial?” There is nothing wrong with principles; they make it easy to communicate concepts and lay the groundwork for making good decisions. But use them as tools to help you achieve your goals and not as your actual goals. Your goals should always be expressible as humans interacting with your software — not as characteristics of the software.


What To Return: IEnumerable or IList?

I’ve received a couple of requests in various media to talk about this subject, with the general theme being “I want to return a bunch of things, so what type of bunch should I use?” I’m using the term “bunch” in sort of a folksy, tongue-in-cheek way, but also for a reason relating to precision — I can’t call it a list, collection or group without evoking specific connotations of what I’d be returning in the C# world (as those things are all type names or closely describe type names). So, I’m using “bunch” to indicate that you want to return a “possibly-more-than-one.”

I suspect that the impetus for this question arises from something like a curt code review or offhand comment from some developer along the lines of “you should never return a list when you could return an IEnumerable.” The advice lacks nuance for whatever reason and, really, life is full of nuance. So when and where should you use what? Well, the stock consultant answer of “it depends” makes a good bit of sense. You’ll also probably get all kinds of different advice from different people, but I’ll describe how I decide and explain my reasoning.

First Of All, What Are These Things?

Before we go any further, it probably makes sense to describe quickly what each of these possible return values is. IList is probably the simpler of the two to describe. It’s a collection (I can use this term because it inherits from ICollection) of objects that can be accessed via indexers, iterated over, and (usually) rearranged. Some implementations of IList are read-only, others are fixed size, and others are variable size. The most common implementation, List, is, for the sake of quick, easy understanding, basically a dynamic array.

I’ve blogged about IEnumerable in the past and talked about how this is really a unique concept. Tl;dr version is that IEnumerable is not actually a collection at all (and it does not inherit from ICollection), but rather a combination of an algorithm and a promise. If I return an IEnumerable to you, what I’m really saying is “here’s something that, when you ask it for the next element, will figure out how to get it and then give you the element, until you stop asking or there are none left.” In a lot of cases, something with return type IEnumerable will just be a list under the hood, in which case the “algorithm” is just to give you the next thing in the list. But in some cases, the IEnumerable will be some kind of lazy loading scheme where each iteration calls a web service, hits a database, or for some reason invokes a 45 second Thread.Sleep. IList is (probably) a data structure; IEnumerable is an algorithm.
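
As a minimal sketch of that “algorithm plus promise” idea (the customer names and the Thread.Sleep standing in for a slow lookup are invented for illustration), an iterator method hands back exactly that kind of promise:

```csharp
using System.Collections.Generic;
using System.Threading;

public class CustomerNameProvider
{
    // To the caller this looks like "a bunch of strings," but nothing
    // happens until iteration starts, and each element is produced on
    // demand. The Sleep stands in for a web service call or database hit;
    // the point is that the cost is paid per element, not up front.
    public IEnumerable<string> GetNames()
    {
        for (int id = 1; id <= 3; id++)
        {
            Thread.Sleep(1000);           // simulate an expensive lookup
            yield return "Customer " + id;
        }
    }
}
```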

Since they’re different, there are cases when one or the other clearly makes sense.

When You’d Clearly Use IEnumerable

Given what I’ve said, IEnumerable (or perhaps IQueryable) is going to be your choice when you want deferred execution (you could theoretically implement IList in a way that provided deferred execution, but in my experience this would violate the “principle of least surprise” for people working with your code, and it would be ill-suited anyway since you have to implement the “Count” property). If you’re using Entity Framework or some other database loading scheme, and you want to leave it up to the code calling yours to decide when the query gets executed, return IEnumerable. In this fashion, when a client calls the method you’re writing, you can return IEnumerable, build them a query (say, with Linq), and say “here, you can have this immediately with incredible performance, and it’s up to you when you actually want to execute this thing and start hammering away at the database with retrieval tasks that may take milliseconds or seconds.”
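
Here’s a hedged sketch of what that hand-off can look like using plain LINQ to Objects. The repository and the customer names are invented; with Entity Framework the shape is the same, except that enumeration is what actually hits the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomerRepository
{
    // Stand-in data source; imagine an Entity Framework DbSet here instead.
    private readonly IEnumerable<string> _allCustomers =
        new List<string> { "Alice", "Alicia", "Bob" };

    // Returns immediately: this only describes the query.
    public IEnumerable<string> GetCustomersStartingWith(string prefix)
    {
        return _allCustomers.Where(c => c.StartsWith(prefix));
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new CustomerRepository();
        var query = repository.GetCustomersStartingWith("Ali"); // cheap; nothing has run yet

        // The actual work happens here, when and only when the caller
        // decides to enumerate.
        foreach (var name in query)
            Console.WriteLine(name);
    }
}
```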

Another time that you would clearly want IEnumerable is when you want to tell clients of your method, “hey, this is not a data structure you can modify — you can only peek at what’s there. If you want your own thing to modify, make your own by slapping what we give you in a list.” To be less colloquial, you can return IEnumerable when you want to make it clear to consumers of your method that they cannot modify the original source of information. It’s important to understand that if you’re going to advertise this, you should probably exercise care in how the thing you’re returning will behave. What I mean is, don’t return IEnumerable and then give your clients something where they can modify the internal aggregation of the data (meaning, if you return IEnumerable don’t let them reorder their copy of it and have that action also reorder it in the place you’re storing it).
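
A quick sketch of the trap that parenthetical describes (the Roster type and its contents are invented): the return type can say “read only” while the object underneath is still your private list.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Roster
{
    private readonly List<string> _members = new List<string> { "Alice", "Bob" };

    // Risky: the signature advertises "peek only," but the object handed
    // back IS the private list, and a determined caller can cast it back
    // to List<string> and mutate the roster's internal state.
    public IEnumerable<string> GetMembersDirectly()
    {
        return _members;
    }

    // Safer: hand back something that is not the internal list, so the
    // advertised "you can only peek at what's there" promise actually holds.
    public IEnumerable<string> GetMembers()
    {
        return _members.ToList();   // a copy; _members.AsReadOnly() also works
    }
}
```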

When You’d Clearly Use IList

By contrast, there are times when IList makes sense, and those are probably easier to understand. If, for instance, your clients want a concrete, tangible, and (generally) modifiable list of items, IList makes sense. If you want to return something with an ordering that matters and give them the ability to change that ordering, then give them a list. If they want to be able to walk the items from front to back and back to front, give them a list. If they want to be able to look up items by their position, give them a list. If they want to be able to add or remove items, give them a list. If random access is involved at all, you want to provide a list. Clearly, it’s a data structure you can wrap your head around easily — certainly more so than IEnumerable.

Good Polymorphic Practice

With the low hanging fruit out of the way, let’s dive into grayer areas. A rule of thumb that has served me well in OOP is “accept as generic as possible, return as specific as possible.” This is being as cooperative with client code as possible. Imagine if I write a method called “ScareBurglar()” that takes an Animal as argument and invokes the Animal’s “MakeNoise()” method. Now, imagine that instead of taking Animal as the parameter, ScareBurglar took Dog and invoked Dog.MakeNoise(). That works, I suppose, but what if I had a guard-bear? I think the bear could make some pretty scary noises, but I’ve pigeon-holed my clients by being too specific in what I accept. If MakeNoise() is a method on the base class, accept the base class so you can serve as many clients as possible.

On the flip side, it’s good to return very specific types for similar kinds of reasoning. If I have a “GetDog()” method that instantiates and returns a Dog, why pretend that it’s a general Animal? I mean, it’s always going to be a Dog anyway, so why force my clients that are interested in Dog to take an Animal and cast it? I’ve blogged previously about what I think of casting. Be specific. If your clients want it to be an animal, they can just declare the variable to which they’re assigning the return value as Animal.
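
Put into code (with the Bear added just to make the point about clients you didn’t anticipate), the rule of thumb looks something like this:

```csharp
public abstract class Animal
{
    public abstract void MakeNoise();
}

public class Dog : Animal
{
    public override void MakeNoise() { /* bark */ }
    public void Fetch() { /* dog-specific ability */ }
}

public class Bear : Animal
{
    public override void MakeNoise() { /* something considerably scarier */ }
}

public static class Security
{
    // Accept as generic as possible: a Dog, a Bear, or any future Animal
    // can scare the burglar.
    public static void ScareBurglar(Animal animal)
    {
        animal.MakeNoise();
    }
}

public static class Kennel
{
    // Return as specific as possible: callers who care about Dog can call
    // Fetch() without casting, and callers who only want an Animal can
    // simply assign the result to an Animal variable.
    public static Dog GetDog()
    {
        return new Dog();
    }
}
```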

So, this rule of thumb suggests that returning lists is a good idea when you’re definitely going to return a list. If your implementation instantiates a list and returns that list, with no possibility of it being anything else, then you might want to return a list. Well, unless…

Understanding the Significance of Interfaces

A counter-consideration here is “am I programming to an interface or to a simple concrete type?” Why does this matter? Well, it can push back on what I mentioned in the last section. If I’m programming a class called “RandomNumberProvider” with a method “GetMeABunchOfNumbers()” that creates a list, adds a bunch of random numbers to it, and returns that list, then I should probably return List<int>. But what if I’m designing an interface called IProvideNumbers? Now there is no concrete implementation — no knowledge that what I’m returning is going to be implemented as List everywhere. I’m defining an abstraction, so perhaps I want to leave my options open. Sure, the RandomNumberProvider that implements the interface uses only a list. But how do I know I won’t later want a second implementation called “DeferredExecutionNumberProvider” that only pops numbers as they’re iterated by clients?
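
Here’s a sketch of that hypothetical. Both implementations are invented to illustrate why the interface might want to promise only IEnumerable.

```csharp
using System;
using System.Collections.Generic;

public interface IProvideNumbers
{
    IEnumerable<int> GetMeABunchOfNumbers();
}

// Eager implementation: builds and returns a concrete list.
public class RandomNumberProvider : IProvideNumbers
{
    private readonly Random _random = new Random();

    public IEnumerable<int> GetMeABunchOfNumbers()
    {
        var numbers = new List<int>();
        for (int i = 0; i < 10; i++)
            numbers.Add(_random.Next());
        return numbers;
    }
}

// Lazy implementation: each number is produced only as clients iterate.
// Had the interface promised IList<int>, this strategy would be off the table.
public class DeferredExecutionNumberProvider : IProvideNumbers
{
    private readonly Random _random = new Random();

    public IEnumerable<int> GetMeABunchOfNumbers()
    {
        for (int i = 0; i < 10; i++)
            yield return _random.Next();
    }
}
```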

As a TDD practitioner, I find myself programming to interfaces. A lot. And so, I often find myself thinking, what are the postconditions and abilities I want to guarantee to clients across the board? This isn’t necessarily, itself, a by-product of TDD, but of programming to interfaces. And, with programming to interfaces, specifics can bite you at times. Interfaces are meant to allow flexibility and future-proofing, so getting really detailed in what you supply can tie your hands. If I promise only an IEnumerable, I can later define implementers that do all sorts of interesting things, but if I promise an IList, a lot of that flexibility (such as deferred execution schemes) goes out the window.

The Client’s Burden

An interesting way to evaluate some of these tradeoffs is to contemplate what your client’s pain points might be if you guess wrong. Let’s say we go with IEnumerable as a return type but the client really just wants an IList (or even just List). How bad is the client’s burden? Well, if the client only wants to access the objects, it can just awkwardly append .ToList() to the end of each call to the method and have exactly what it wants. If the client wants to modify the state of the grouping (e.g. put the items in a different order and have you cooperate), it’s pretty hosed and can’t really use your services. However, that latter case is addressed by the “when you’d clearly use IList” section — if your clients want to do that, you shouldn’t be giving them an IEnumerable in the first place.

What about the flip side? If the client really wants an IEnumerable and you give them a list? Most likely they want IEnumerable for deferred execution purposes, and you will fail at that. There may be other reasons I’m not thinking of off the top of my head, but it seems that erring when the client wants an enumerable is kind of a deal-breaker for your code being useful.

Ugh, so what should I do?!?

Clear as mud? Well, problem is, it’s a complicated subject and I can only offer you my opinion by way of heuristics (unless you want to send me code or gists, and then I can offer concrete opinions and I’m actually happy to do that). At the broadest level, you should ask yourself what your client is going to be doing with the thing that you return and try to accommodate that. At the next broadest level, you should think to yourself, “do I want to provide the client a feature-rich experience at the cost of later flexibility or do I want to provide the client a more sparse set of behavior guarantees so that I can control more implementation details?”

It also pays to think of the things you’re returning in terms of what they should do (or have done to them), rather than what they are. This is the line of thinking that gets you to ask questions like “will clients need to perform random accesses or sorts,” but it lets you go beyond simple heuristics when engaged in design and really get to the heart of things. Think of what needs to be done, and then go looking for the data type that represents the smallest superset of those things (or, write your own, if nothing seems to fit).

I’ll leave off with what I’ve noticed myself doing in my own code. More often than not, when I’m communicating between application layers, I tend to use a lot of interfaces and deal a lot in IEnumerable. When I’m implementing code within a layer, particularly the GUI/presentation layer in which ordering is often important, I favor collections and lists. This is especially true if there is no interface seam between the collaborating components. In these scenarios I’m more inclined to follow the “return the most specific thing possible” heuristic rather than the “be flexible in an interface” heuristic.

Another thing that I do is try to minimize the number of collections that I pass around an application. The most common use case for passing around bunches of things is collections of data transfer objects, such as some method like “GetCustomersWithFirstName(string firstName).” Clearly that’s going to return a bunch of things. But in other places, I try to make aggregation an internal implementation detail of a class. Command-Query Separation helps with this. If I can, I don’t ask you for a collection, do things to it, and hand it back. Instead, I say “do this to your collection.”
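
As a small sketch of that habit (Customer and CustomerDirectory are invented for illustration): the query hands a bunch of things across a boundary, while the command keeps the collection private and just gets told what to do to it.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public bool IsActive { get; set; }
}

public class CustomerDirectory
{
    private readonly List<Customer> _customers = new List<Customer>();

    // Query: returning a bunch of DTO-ish things is the legitimate
    // "pass a collection around" case.
    public IEnumerable<Customer> GetCustomersWithFirstName(string firstName)
    {
        return _customers.Where(c => c.FirstName == firstName).ToList();
    }

    // Command: rather than handing out the internal collection so callers
    // can prune it and hand it back, callers say "do this to your collection."
    public void DeactivateCustomersNamed(string firstName)
    {
        foreach (var customer in _customers.Where(c => c.FirstName == firstName))
            customer.IsActive = false;
    }
}
```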

And finally, when in doubt and all else seems to be a toss-up, I tend to favor promising the least (thus favoring future flexibility). So if I really can’t make a compelling case one way or the other for any reason, I’ll just say “you’re getting an IEnumerable because that makes maintenance programming likely to be less painful later.”