DaedTech

Stories about Software

By

Don’t Write Code You Don’t Need

I was reviewing some code the other day, and I saw a quick-and-dirty logger implementation that looked something like this (I’m re-creating from memory and modifying a bit for illustrative purposes in this post):

public class Logger
{
    public string Path { get; set; }

    public Logger()
    {

    }

    public Logger(string path)
    {
        Path = path;
    }

    public void WriteEntry(string entry)
    {
        using (var writer = File.AppendText(Path))
        {
            writer.WriteLine(entry);
        }
    }
}

A few things probably jump out at you, depending on your C#, OOP, and code reviewing chops. I’d imagine that, at the very least, you think, “you could just delete the default constructor.” You might also wonder why the other constructor exists since you could just set Path whenever you wanted. There’s the lack of null/valid check on path before creating a stream writer with it, and the unnecessary, potentially problematic, and definitely not threadsafe creation of the stream writer over and over. These things are all valid to point out, but I’d say that they’re also all symptoms of two larger root causes that I want to talk about.

Overeager to Please

TooMuchCode

You can offer too much code. By this I mean something subtly but critically different from “you can write too much code,” as when you are needlessly complicated or verbose. Offering too much code means that you’re giving users of the public interface of your classes too many options. Before my TDD days, this was something with which I constantly struggled. To understand what I mean, consider the thinking that probably went into the creation of this class:

Well, let’s see. A file needs a path, so a file logger should probably also take a path as input. From there, it should handle the details of streams and all that other stuff so that all we have to do is give it stuff to put in the file.

So far, so good. This is pretty clever as far as abstractions goes, and whoever wrote this class has a good grasp on how to create abstractions that provide value. But here’s what probably came next.

So, for path, let’s make that a public property in case the user of the class wants to change the path later. For convenience, let’s add a constructor that lets the user specify the path, but let’s also let him know that he doesn’t have to.

That thinking is considerate and helpful, but it’s offering too much code. You need to take ownership of your abstractions and be a little forceful and unyielding with them. You provide what you provide, and if users don’t like your class, they can create an inheritor or write their own or something: “this is my class, take it or leave it.”

In this case, you have to make concrete decisions about how and when the path of the logger is set:

In this class, the path is set at instantiation time and only instantiation time. As long as you live with my logger, you will obey my rules!

public class Logger
{
    public string Path { get; private set; }

    public Logger(string path)
    {
        Path = path;
    }

    public void WriteEntry(string entry)
    {
        using (var writer = File.AppendText(Path))
        {
            writer.WriteLine(entry);
        }
    }
}

Notice that we’ve already eliminated the problem of the pointless constructor by making this decision. We’re also well poised to handle the null/valid checking of the path since we can just do it in the constructor instead of doing it in the constructor and in the setter for path and worrying about duplication, when exactly to validate and how, what to do on invalid set when the last path was valid, etc. It’s also going to be very easy to take care of the issue with the needless creation of StreamWriter instances. Now that you can’t change the path once the instance is created, you can simply create the writer in the constructor and reuse it for the lifetime of the object by storing it as a private field. Now making this threadsafe becomes a pretty manageable task. In general, eliminating one conceptual option eliminates a lot of internal implementation complexity.

But what about the users who want to set the path at some point later in the object instance’s lifetime? Psst…that’s not a real common use case. And you know what? If they really want to do that, they can just instantiate a second logger. Don’t make your life really complicated because you think users of your code might want flexibility. Because you know what they want more than extra flexibility? Code that works. And it’s a lot harder to offer them code that works if you’re bending over backwards to handle every conceivable thing they might want to do.

Pointless or Speculative Mutability

The last section touches on this obliquely, but I’d like to bring this to the fore. Mutable, state-based code is a lot more prone to problems than immutable or functional code that has fixed or no state, respectively. Consider the refactoring path I’ve proposed here. In the last section, I chose to eliminate the public setter for Path instead of the constructor that took Path as an argument. Did that seem like a coin flip and pick one to you? Well, it wasn’t.

With the constructor injection of path, we have to do validity checking and initialization only once, to establish preconditions for the object’s existence. With the public setter for path, we have to maintain object invariants, which tends to be more complex. A good example of this is how we handle the transition from a valid path to the user setting an invalid or null path. Do we throw an exception? Revert to the previous path? You don’t have a discussion when there was no valid previous path as is the case with constructor injection.

This is just an example of a broader idea, which is that mutability creates state transition management tasks and that these tasks are harder to get right. To understand what I mean, imagine that you’re implementing the API for an array. You tell your users, “alright, when you create the array, you specify how many elements it will have, and then you can set and access the elements as you please by index.” The users come back and say, “well, we want to change the size of the array on the fly,” to which you respond, “oh, crap,” as you start to wonder what you’ll do if they want to resize it smaller than what they’ve used or if they try to make it negative or too big or something. You can feel the number of edge cases multiplying.

If you come from a background where you write a lot of procedural, script-based, or hacked-together utility code, this may seem like a weird thing to think about. You’re probably used to doing things like declaring boolean ‘flags’ somewhere in a method and setting them much later in that same method in order to keep track of something further down the line. And if you do this, you’re probably used to a lot of tweaking, guessing, and hoping for the best. But as it turns out, you’re doing things the hard way.

The easy way is to program without any state at all. That means no assignment and no variables, the way you might do if asked to write a method that took x and y and returned 4x + 2y. There’s no need to do any of that: just have “return 4x + 2y” and be done with it. This way of doing things is called functional programming, and its calling card is that output is always a pure function of the input and is entirely predictable.

If functional paradigm isn’t an option (and it often isn’t in the entirety of an application), the next easiest thing to deal with is immutable objects. These are objects that have state, but that state is not modifiable after creation time. An example of an object that can easily be immutable is something like an address. An address is just a handful of strings bound together with a class definition, so why have it be mutable? If you want a different one, just let the one you have go out of scope and create a new one. Then you don’t have to worry about questions like “what if I set a new address1, but not a new address2–that would never happen, right?” In immutable land, that’s simply not a consideration that you bother with.

The hardest way of doing things is with state transition management or mutable state. That’s the nature of the original logger at the top. Anything goes. The object has only transitive state, so it has either to constantly manage all permutations of state changes to check for accuracy or it has to risk being wrong in an invalid state. This is not a great choice and should be avoided if at all possible.

As you develop, you’ll find it’s possible a lot more often than you think. Objects that perform a service, such as a logger, generally have little reason to store mutable state, particularly in a nicely decoupled application. Even a lot of data objects, which represent the very core of what we’d think of as mutable, can be immutable a lot more often than you’d think (e.g. address). Mutability in your code is often best left in places where it’s unavoidable. If you don’t have the luxury of pass-through interaction with a database, you will generally maintain an in-memory domain model that needs to have mutable state because it’s representing the database which is, almost by definition, a whole gigantic system of files full of mutable state. A lot of objects that help you manage user interface will have mutable state (e.g. some class that stores things like “is checkbox X currently checked”). But unavoidable as it may be, you can certainly minimize and isolate it. Whatever you do, don’t introduce it where you don’t need it because you’re creating needless complexity, which means extra things that can go wrong.

Take-Away

The upshot of this exhaustive code-review that I’ve conducted is just advice to consider carefully where your classes fall on the functional-immutable-mutable spectrum and to keep them as far toward the functional side as possible. I’m not suggesting that you run out and rewrite existing code this minute or even that you vow to change how you do things. I’m just suggesting you be aware that mutable objects come with a heavy cost in terms of complexity and difficulty. This will help you in your programming travels in general, and especially if you’re looking to delve into things like architecture, test-driven development, or concurrent (multi-threaded) programming.

31 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
trackback

[…] NSA-proof encryption exists. Why doesn’t anyone use it?, Erik Dietrich […]

poiati
poiati
10 years ago

Well done,

Narrow and simple public interfaces is better in most of the cases as well immutability.

Erik Dietrich
10 years ago
Reply to  poiati

I definitely agree with that. I’m a strong proponent of small, focused methods/classes/modules as a complexity management tool.

dotnetchris
10 years ago

YAGNI – You Ain’t Gonna Need It.

Gene Hughson
10 years ago
Reply to  dotnetchris

No offense, but I find “YAGNI” to be incredibly naive – sometimes you _are_ gonna need it. What Erik’s done above goes way beyond that, in that he’s actually looked at the costs and benefits and made a considered design decision rather than a knee jerk.

dotnetchris
10 years ago
Reply to  Gene Hughson

I’d call it 9 times out of 10 that YAGNI always holds true.

Gene Hughson
10 years ago
Reply to  dotnetchris

Fair enough, but in my experience, I’d say 9 out of 10 times it leads to unnecessary rework to address design needs that should have been foreseen. That’s the problem with bumper-sticker philosophy, it fails to take into account differences in context.

SimpleWins
SimpleWins
10 years ago
Reply to  Gene Hughson

I tend to agree with dotnetchris. When “you are gonna need it” if it is hard to add or refactor then that is a smell of the existing code.

Ricardo Rodrigues
Ricardo Rodrigues
10 years ago
Reply to  SimpleWins

Exactly, else you’re falling into the common pitfall of premature optimization.

Gene Hughson
10 years ago
Reply to  SimpleWins

Difficulty in adding or refactoring isn’t the only “gotcha”, something can be relatively easy, but still be costly if it’s pervasive enough. On another forum a commenter gave an example of a team that “doing the simplest thing that could work” chose xml files as their data store. This worked right up into one of the latter sprints where they had to “refactor” their entire persistence layer to accommodate the reporting needs of their customer. Easy enough, but very expensive and avoidable with even a minimum of foresight. I’m certainly not defending gold-plating, but by the same token, you have… Read more »

Erik Dietrich
10 years ago
Reply to  Gene Hughson

I started to weigh in on this discussion of the concept of YAGNI, gold-plating and planning and I keep getting carried away to the point where I don’t think I can do it as a comment. I’m going to see if I can’t organize my thoughts into an upcoming post.

Chris_K
Chris_K
10 years ago

Unfortunately you will get a compile time error by removing the parameterless constructor of the Logger object if you try activating the object generically or inheriting from the Logger class.

Justin
Justin
10 years ago
Reply to  Chris_K

Inheriting will just force you to call the base constructor that takes a string. And your inherited version can provide the parameterless constructor that sets up context-appropriate defaults for default construction in generics. Don’t see any problems here.

Erik Dietrich
10 years ago
Reply to  Chris_K

I s’pose that’d be possible in a mature code base, but this is a class that had just been written and wasn’t in use anywhere. When I removed the constructor, no compiler errors occurred 🙂

Wes68
Wes68
10 years ago

It makes sense. Thanks!

Erik Dietrich
10 years ago
Reply to  Wes68

Glad if it was helpful. 🙂

Dan Sutton
10 years ago

What you’re saying is definitely true in context of the example given, but there are times when I’ve found writing stuff you don’t (yet) need to be very helpful. For example, last year I had to wrap an XMLRPC interface to talk to an online service: the requirements were fairly simple: we were only going to use a couple of the classes and a couple of the methods. However, I wrote a wrapper for all of it — all classes and all methods. Sure enough, six months later, the requirements expanded and we started needing to use more of the… Read more »

Erik Dietrich
10 years ago
Reply to  Dan Sutton

I think that when sizing up my example and formulating this blog post, the subtlety for me was that what we were talking about was essentially programmer A speculating as to what programmers B, C, and D would use later, and doing his best to try to make sure they had everything they might ever want. In other words, this wasn’t future-proofing or really even gold plating, but almost kind of an inter-developer instance of feature creep.

In retrospect, I think I might have been better off changing the title of the post a bit to reflect that.

Dan Sutton
10 years ago
Reply to  Erik Dietrich

Yeah – that makes sense… I think a lot of that sort of thing depends on the level of experience of the programmer: a really experienced one doing something like that usually implies that future programmers really will need whatever it is; take a less experienced one and your point is well received… I’ve seen that sort of thing many times, and it’s never pretty!!

Tomek Kaczanowski
Tomek Kaczanowski
10 years ago

I guess that is why we have TDD which helps to avoid useless speculations but focusing on task at hand.

Erik Dietrich
10 years ago

My adoption of TDD was pretty much the downfall of adding extra stuff to classes. Interestingly, I don’t think it was the YAGNI concept or even the added effort of writing tests for functionality I might not use, but just that TDD alters the way you think about development. Why would I add something that doesn’t satisfy a requirement, since I’m always either satisfying requirements or cleaning up the code?

itoctopus
10 years ago

I can’t think of an open source software that we’re using at the moment that doesn’t have code that is not needed and that is not used.

PS: What’s even worse than adding code that you don’t need is selecting fields that you don’t need. For those who don’t know, a “Select * FROM tablename” will create a huge overhead on MySQL especially if the table has textfields.

Erik Dietrich
10 years ago
Reply to  itoctopus

I think that there’s probably an interesting interpersonal dynamic that goes on with open source projects and the functionality on them. People’s motivations for contributing are going to vary widely (labor of love, tweaking for utility, padding your resume, social coding, etc) and some of those motivations will drive people to write code for code’s sake.

trackback

[…] he said — I really like his post. It reminded me of some discussion that had followed in my post about trying too hard to please with your code where Gene took a nuanced stand against the canned wisdom of “YAGNI.” I […]

trackback

[…] > mutable. I’ve blogged about an idea I called “pointless mutability,” but in general I’ve come to favor immutable constructs over mutable ones whenever […]

trackback

[…] defects and more difficult maintenance. Even when perfectly implemented, this complexity poses the risk of making your code harder to use by its consumers due to the proliferation of options. Where the work meets a real need, as opposed to just a potential one, the costs and benefits can […]

trackback

[…] while back, I wrote a post where I talked about offering too much code and too many options to other potential users of your […]

trackback

[…] them sparingly. It’s nice to offer your clients options, but it’s easy to offer them too many options. Applying the “golden rule” concept from the previous section, imagine if your code […]

ardave
ardave
9 years ago

Just add an .Initialize() method and then you can use the same object over and over without having to create new ones. 🙂

trackback

[…] Well, simply put, the relationship is that inheriting from a collection type just never seems to be the simplest way to get some test passing, and it’s never a compelling refactoring (if I were defining some kind of API library for developers and a custom collection type were a reasonable end-product, I would write one, but not for my own development sake). Inheriting from a collection type is, in fact, a rather speculative piece of coding. You’re defining some type that you want to use and saying “you know, it’d probably be handy if I could also do X,… Read more »

trackback

[…] defects and more difficult maintenance. Even when perfectly implemented, this complexity poses the risk of making your code harder to use by its consumers due to the proliferation of options. Where the work meets a real need, as opposed to just a potential one, the costs and benefits can […]