Please Don’t Recycle Local Variables

I think there’s a lot of value to the conservation angle of the green movement. In general, it’s a matter of efficiency–if you can heat/light/whatever your house with the same quality of life, using less energy and fewer resources, that’s a win for everyone. This applies to a whole lot of things beyond just eco-concerns, however. Conserving heat when you’re cold, conserving energy when you’re running a marathon, conserving your dollars when making a budget–all good ideas. Cut down, conserve, reuse when you can.

Recycle

Except please don’t do it with your local variables. For example:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Here, we initialize a local variable, count, and use it to keep track of the results of some processing of customers. When we’re done, we reset count and use it to keep track of the apparently unrelated concept of machines. What I’m saying is that there shouldn’t be just one count, but rather customerCount and machineCount.

Does this seem like nitpicking? You could certainly make that argument, but this code is not going to age well. First of all, this method should clearly be two methods, so we’re starting right off the bat with a bit of technical debt. It would be cleaner if each loop had its own method.

But an interesting thing happens if we use the refactoring tools to try to do that–the refactoring tool wants return values or input parameters. Yikes, that was unexpected, so we just move on. Later, when the time comes to iterate over movies, we see that there’s a ‘design pattern’ in place, so we modify the code to look like this:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var movie in Movies)
    {
        DoSomeStuffToMovie(movie);
        if (movie.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Now this thing should really be split up, so we start selecting parts of it to see what we can refactor. Ew, now we’re getting ref parameters to boot. This thing is getting even more painful to try to refactor, and we’re in a hurry, so no time for that. And to make matters worse, if you add in a few other aggregator variables this way, you’ll start to have all kinds of barriers in place when you want to pull this thing apart, such as crazy sets of out parameters. I’ve posted before about how I feel about ref and out.

All of this mounting technical debt could easily be avoided by giving each loop its own count variable. Having them recycle the same one creates a compile-time dependency of what’s going on in each loop with what happened in the loop before, even though there are no other similar dependencies in evidence. In other words, recycling this local variable is the only thing that’s creating a coupling in your code–there’s no logical reason to do it.

This is the height of procedural programming and baking in temporal dependencies that I cautioned you to avoid here. It’s a completely useless dependency that will inhibit refactoring and dirty up your code in a hurry. It may not seem like much yet, but this will be a huge pain point later as the lines of code in this method balloon from the dozens to the hundreds, and you rely heavily on automated tools to help with cleanup. Flag variables used over and over in sequence throughout a method are like pebbles in your shoe when you’re trying to refactor.

So my advice is to avoid this practice completely. There’s really no advantage to coding this way and the potential downside is enormous.

  • Steve

    >Ew, now we’re getting ref parameters to boot.

    And that’s the point where the result of the robo-refactor gets the manual clean up to decouple the effectively independent counters, isn’t it?

  • http://www.daedtech.com/blog Erik Dietrich

    For me the opposite in that situation — I clean up and then use the automated refactor, especially if I’m in legacy code that’s not under test. I feel like I’m walking on a tight rope without a safety net in code like that, so I find it critical to automate as much as humanly possible.

  • PeteA

    Nothing wrong with using the same variable – it’s a count. Everything wrong with relying on automated refactoring tools. Definitely bad programming to waste cycles & memory ‘just in case I need it later’. Of course, the real problem is the underlying DRY violation :).

  • Mauro

    I agree with PeteA , I can’t understand whats the point of this article. :S

  • Andrew Forrest

    You’re not wasting memory or cycles. The compiler should reuse the same memory slot for both variables. It’s also not, ‘just because I need it later’; it’s ‘because they’re two separate things’. In fact, reusing a local variable is definitely a (mistaken) case of ‘just because I need it later’.

    In my experience, the ideal is that each variable is assigned once—it makes the code SO much easier to read and to reason about. (Not always possible, of course, but it’s the goal I aim for.)

  • http://www.daedtech.com/blog Erik Dietrich

    I think this will just have to be an “agree to disagree” situation. At a time when memory is measured in *giga* bytes, trying to avoid sacrificing 4 of them for code maintainability seems like a billionaire clipping coupons for rice. (And, as Andrew points out, the performance benefit here is questionable. Eric Lippert, who wrote C# agrees with this take on the lack of benefit: http://stackoverflow.com/questions/7570911/reuse-of-variables ). As for not reusing automated refactoring tools, I can’t get there either. It’s been a decade since one made a mistake on me, and I’ll take automation over manual work every time for efficiency and accuracy.

    I do agree about the DRY violation, though. :)

  • http://www.daedtech.com/blog Erik Dietrich

    Same as any of them that I write. To make people think about and hopefully discuss issues related to programming.

  • Larry Louisiana

    the code is inelegant to begin with.. you should be using linq.. then you can simple use

    Where(x=>x.DidTheRightStuffHappen).Count()

    you should always avoid useless variables…

    using Linq your 30 lines of code can be reduced to 3.

    now that’s an article worth writing..
    “Don’t write 10 lines when one will do”

  • http://www.daedtech.com/blog Erik Dietrich

    I’m aware of the existence of Linq and its extension methods, but three lines of functional code with no local variables doesn’t go a long way toward illustrating my point about reusing local variables. There are plenty of issues with this code — as Pete pointed out earlier the code isn’t DRY, it could probably stand some further abstraction and perhaps polymorphism, since all of the elements share a common property, etc. (As an aside your call to the Where extension method is ignoring the DoStuff methods, that appear to mutate the state of the things, so you’d need an anonymous function that executed those and then returned the boolean property’s value: something like x=> { DoStuff(x); return x.DidTheRightStuffHappen; }).

    But anyway, the mistakes in the code are kind of the point. I encountered some legacy code that had a variety of problems, and I took the opportunity to write about *one* of those problems. This isn’t that code, which was extremely long, filled with domain-specifc jargon, making use of global variables and proprietary. This is a simplified, obfuscated version of it designed specifically to talk about the problematic reuse of the local variables as a barrier to breaking apart the method. You could certainly write posts about Linq and compacting code (and I have on both subjects), but this isn’t either of those posts.

    I’ve actually thought about doing a post on the strange problem of taking legacy code and turning it into an example. It’s sort of a weird art (and one I admittedly could improve in, apparently) because you have to intentionally leave mistakes in to make your point but try to minimize the number of other problems so they don’t detract from your point.

  • sorgfelt

    I think that if you fixed the other problems with the sample code, your problem with reusing variables simply wouldn’t be there. I do like reusing the letter i as an index in small loops, but it is totally initialized and consumed in each loop, and never appears outside of it.

  • http://www.daedtech.com/blog Erik Dietrich

    I definitely agree that improving this code would make the problem irrelevant, and truth be told, it was weird to come up with this code. I’ve gravitated more toward functional programming and I don’t use local variables much at all anymore.

  • unshaven

    If you want to get an idea of good reasons to not re-use local variables for different purposes, read John Carmack’s article about “functional programming in C”.
    She short answer is already in this article: “this code isn’t going to age well”, but the explanations why are IMO lacking.
    There’s everything wrong with re-using variables, believe me.
    Refactoring tools, meh, it’s a thing of preference and what particular tools are used (some suck more, some less)

  • unshaven

    Yes, agree, and if you’re using C++/C, make them CONST to have the compiler remind later maintainers who add something further to the end of a function (alas C# doesn’t allow this for locals, which is unfortunate)

  • Larry Louisiana

    .. i find the key is to make the code look like it needs the mistakes.. . (they aren’t ‘errors’ exactly), That said you should always make an ideal implementation to finish up your article, which will avoid this type of run on commenting.. =)

    — one other thing to keep in mind is that (as mentioned here sorta: http://visualstudiomagazine.com/articles/2013/06/01/roc-rocks.aspx) code is living and those variables may have had a purpose at some point, however over many years of many people “fixing” the code there are remnants of old ideas and thoughts in there… just a useful thing to keep in mind…

  • http://www.daedtech.com/blog Erik Dietrich

    And I think your advice about putting my idea of the finished product at the end of the post is an excellent one. At least that way I’d be getting critiques that would help me, instead of critiques of code I don’t like either. :)

    (The title of that article alone was worth it but I liked the whole thing, by the way — thanks for the link)

  • ChadF

    If a specific variable is used for each, then what will they be named? count1, count2, … ? Or customer_count, machine_count, … ? The second could be ideal, but more effort, so most would probably do the numbered variable. Also, since each section is so alike, it more than likely would be copy/pasted. But if using numbered (or heck, even named) count variables, one might not be changed after the paste. So now there is a logic bug that the compiler doesn’t complain about. If only one counter existed, it couldn’t get mixed up (so is an example of: simpler = less error prone).

    Of course the down side.. if at some point the value in the first counter need to also be used after the second count, it would be trashed by the second loop.

    Also, if it was split into multiple variables, which different names, and then later refactored [properly] into different methods.. now either each method has redundant names (i.e. machine_count when it is the only counter in the machine counting method) or some odd name (i.e. count2, but where is count1? That is confusing). Of course the developer could always rename everything when refactoring, but that makes even more work. I’ve had issues like this where I placed several things together in a single function early on (to get it working), then split them up later to make it cleaner, and had unideal/confusing variable names in the new methods.

    Now this is not to justify reusing a variable for a different purpose.. such as a counter and then a status code, if both happen to be int’s. I would suggest avoiding doing that if practical.

    In the end, if a programmer is going to mess things up due to reusing the same variable for the same general purpose, then they are probably just as likely to mess it up by making the code more complex.

  • http://www.daedtech.com/blog Erik Dietrich

    This attacks the code from a perspective that I wouldn’t. What I mean is, I’m not advocating that anyone write the method like the one in my example or grow it via copy and paste or any other way. It’s more that I’m pleading with people not to recycle local variables so that when I come along and refactor methods like the ones in my example, it’s easier. It’s a selfish post (but designed to make a point about inadvertent coupling in your code).

    The reason I point this out is that I can’t, in good faith, offer suggestions along the lines of your first two paragraphs above. I would never write code where variables differed only by a number (I’d write a loop). I would never copy and paste code the way the example is done and my methods wouldn’t grow to the point where re-using variables made any sense. All of these considerations would be moot if I were writing code or instructing in best practices as I saw them.

    I definitely get your point that the reuse of local variables isn’t the only thing plaguing someone who writes code this way. But if I can offer tips, however incremental, that are followed, I do my best to help a little at a time.

  • Pingback: Intro to Unit Testing 5: Invading Legacy Code in the Name of Testability | DaedTech()