Stories about Software


Subtle Naming and Declaration Violations of DRY

It’s pretty likely that, as a software developer that reads blogs about software, you’ve heard of the DRY Principle. In case you haven’t, DRY stands for “Don’t Repeat Yourself,” and the gist of it is that there should only be one source of a given piece of information in a system. You’re most likely to hear about this principle in the context of copy-and-paste programming or other, similar rookie programming mistakes, but it also crops up in more subtle ways. I’d like to address a few of those today, and, more specifically, I’m going to talk about violations of DRY in naming things while coding.


Type Name in the Instance Name

Since the toppling of the evil Hungarian regime from the world of code notation, people rarely do things like naming arrays of integers “intptr_myArray,” but more subdued forms of this practice still exist. You often see them appear in server-templated markup. For instance, how many codebases contain text box tags with names/ids like “CustomerTextBox?” In my experience, tons.

What’s wrong with this? Well, the same thing that’s wrong with declaring an integer by saying “int customerCountInteger = 6.” In static typing schemes, the compiler can do a fine job of keeping track of data types, and the IDE can do a fine job, at any point, of showing you what type something is. Neither of these things nor anyone maintaining your code needs any help in identifying what the type of the thing in question is. So, you’ve included redundant information to no benefit.

If it comes time to change the data type, things get even worse. The best case scenario is that maintainers do twice the work, diligently changing the type and name of the variable. The worst case scenario is that they change the type only and the name of the variable now actively lies about what it points to. Save your maintenance programmers headaches and try to avoid this sort of thing. If you’re having trouble telling at a glance what the datatype of something is, download a plugin or a productivity tool for your IDE or even write one yourself. There are plenty of options out there without baking redundant and eventually misleading information into your code.

Inheritance Structure Baked Into Type Names

In situations where object inheritance is used, it can be temping to name types according to where they appear in the hierarchy. For instance, you might define a base class named BaseFoo and then a child of that named SpecificFoo, and a child of that named EvenMoreSpecificFoo. So EvenMoreSpecificFoo : SpecificFoo : BaseFoo. But what happens if during a refactor cycle you decide to break the inheritance hierarchy or rework things a bit? Well, there’s a good chance you’re in the same awkward position as the variable renaming in the last section.

Generally you’ll want inheritance schemes to express “is a” relationships. For instance, you might have Sedan : Car : Vehicle as your grandchild : child : parent relationship. Notice that what you don’t have is SedanCarVehicle : CarVehicle : Vehicle. Why would you? Everyone understands these objects and how they relate to one another. If you find yourself needing to remind yourself and maintainers of that relationship, there’s a good chance that you’d be better off using interfaces and composition rather than inheritance.

Obviously there are some exceptions to this concept. A SixCylinderEngine class might reasonably inherit from Engine and you might have a LoggingFooRetrievalService that does nothing but wrap the FooRetrievalService methods with calls to a logger. But it’s definitely worth maintaining an awareness as to whether you’re giving these things the names that you are because those are the best names and/or the extra coupling is appropriate or whether you’re codifying the relationship into the names to demystify your design.

Explicit Typing in C#

This one may spark a bit of outrage, but there’s no denying that the availability of the var keyword creates a situation where having the code “Foo foo = new Foo()” isn’t DRY. If you practice TDD and find yourself doing a lot of directed or exploratory refactoring, explicit typing becomes a real drag. If I want to generalize some type reference to an interface reference, I have to do it and then track down the compiler errors for its declarations. With implicit typing, I can just generalize and keep going.

I do recognize that this is a matter of opinion when it comes to readability and that some developers are haunted by the variant in VB6 or are reminded of dynamic typing in Javascript, but there’s really no arguing that this is technically a needless redundancy. For the readability concern, my advice would be to focus on writing code where you don’t need the crutch of specific type reminders inline. For the bad memories of other languages concern, I’d suggest trying to be more idiomatic with languages that you use.

Including Namespace in Declarations

A thing I’ve seen done from time to time is fully qualifying types as return values, parameters, or locals. This usually seems to occur when some automating framework or IDE goody does it for speculative disambiguation in scoping. (In other words, it doesn’t know what namespaces you’ll have so it fully qualifies the type during code generation to minimize the chance of potential namespace collisions.) What’s wrong with that? You’re preemptively avoiding naming problems and making your dependencies very obvious (one might say beating readers over the head with them).

Well, one (such as me) might argue that you could avoid namespace collisions just as easily with good naming conventions and organization and without a DRY violation in your code. If you’re fully scoping all of your types every time you use them, you’re repeating that information everywhere in a file that you use the type when just once with an include/using/import statement at the top would suffice. What happens if you have some very oft-used type in your database and you decide to move it up a level of namespace? A lot more pain if you’ve needlessly repeated that information all over the place. Perhaps enough extra pain to make you live with a bad organization rather than trying to fix it.

Does It Matter?

These all probably seem fairly nit-picky, and I wouldn’t really dispute it for any given instance of one or even for the practices themselves across a codebase. But practices like these are death by a thousand cuts to the maintainability of a code base. The more you work on fast feedback loops, tight development cycles, and getting yourself in the flow of programming, the more that you notice little things like these serving as the record skip in the music of your programming.

When NCrunch has just gone green, I’m entering the refactor portion of red-green-refactor, and I decide to change the type of a variable or the relationship between two types, you know what I don’t want to do? Stop my thought process related to reasoning about code, start wondering if the names of things are in sync with their types, and then do awkward find-alls in the solution to check and make sure I’m keeping duplicate information consistent. I don’t want to do that because it’s an unwelcome and needless context shift. It wastes my time and makes me less efficient.

You don’t go fast by typing fast. You don’t go fast, as Uncle Bob often points out, by making a mess (i.e. deciding not to write tests and clean code). You really don’t go fast by duplicating things. You go fast by eliminating all of the noise in all forms that stands between you and managing the domain concepts, business logic, and dependencies in your application. Redundant variable name changes, type swapping, and namespace declaring are all voices that contribute to that noise that you want to eliminate.

newest oldest most voted
Notify of
Remon Koopmans

Although I see your point on the TextBox postfix (which I am guilty of using frequently), I was wondering how one should name a label-control pair without using the -Label and -TextBox postfixes. Given your Customer example, I would name them NameLabel and NameTextBox (assuming only a single customer is edited on the form). I have yet to find a better approach so any tips would be appreciated.

Michael Lang
I would guess his answer is that he wouldn’t nitpick every scenario. But it is another tiny cut repeated over and over again. One way to reduce that problem found in Microsoft webforms is to switch over to Microsoft MVC. You don’t name labels, you just output an html element. For conditional hiding you have a compiled statement in the view file. @if (!Model.IsCompanyVisible) { @Html.LabelFor(model=>model.Company): @Html.TextBoxFor(model => model.Company) }
Erik Dietrich
Michael has a pretty good read on my take on any given scenario. I think there’s going to be a lot of variance on a case by case basis. The intent of my post was really to say “these are violations of DRY you probably haven’t considered and might be worth bearing in mind.” One idea for Label-TextBox pairs, and YMMV for whether you like this approach, is to give them names like CustomerDescription and CustomerValue or something else that expresses the “identify field, capture value” relationship. Yet another approach and one that I’ve increasingly come to favor goes along… Read more »
Maybe I’ve spent too long in Javaland, but my initial reaction is of mild disagreement. Any given line of code will be read many more times than it will be modified. So, when in doubt, naming choices should favor ease of readability and not insist that people hover over every variable to figure out what it really is. IDE’s make it trivial to rename CustomerTextBox to something else in the infrequent case where its type is changed. In static languages, I tend to think that things should be declared and named according to the level of abstraction in which they… Read more »
Erik Dietrich
I like the point about reading more than modifying code, and I think it’s certainly valid to weigh the clarity provided by the redundancy against the impediment to refactoring and chance for error in any given situation, considering the tools at your disposal and developer habits. I think it’s also going to depend on the context in which the controls are being named and manipulated — if they’re named and manipulated in code predominantly than I really start to dislike the type in the name, but if they’re named and used exclusively in markup or a dynamic typing scenario, then… Read more »