I imagine that inversion of control is a relatively popular concept to talk or blog about, particularly in object-oriented circles, so rather than do a garden-variety explanation of the term followed by a pitch for using it, I thought I’d take a slightly different approach. I’m going to talk about the reason that there is often resistance to the concept of inversion of control, why that resistance is understandable, and why that understandable resistance is misguided. This involves a foray into the history of programming and general concepts of human reasoning.
In the Beginning
Nobody starts out writing object-oriented programs, nor do they start out understanding concepts like lambda expressions, function evaluation, etc. People start out, almost invariably, with the equivalent of batch scripts. That is, they learn how to automate small procedures and they build from there. This is a natural and understandable progression in terms of individual learning and also in terms of our journey as a programming collective. The earliest programs were sequences of instructions. They had abbreviated syntax, control structures, and goto statements that automated a task from beginning to end.
An example is something like the following (in pseudo code):
file = "numbers.txt"
if not exists(file)
    create file
else
    x = ""
    open file > x
The logic is simple and easy enough to follow. Look for a file called numbers.txt and create it if it doesn’t exist. Otherwise, read it. Now, as you want to add things to this program, it gets longer and probably harder to follow. There may be some notion of looping to process the contents of the file, handled with a loop construct or, if the code is old enough or low-level enough (i.e. if we’re at the chip level), with goto statements.
Procedural Code to the Rescue
As things evolved, the notion of subroutines was introduced to help alleviate complexity and make programs more readable, and the concept of procedural or structured programming was born. Dijkstra famously argued, in his 1968 letter “Go To Statement Considered Harmful,” that the goto statement should be abolished from higher-level languages. Structured/procedural programming involves creating abstractions of commonly used routines so that they can be reused and so that the program is more readable and less error prone.
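int main(int argc, char *argv[]) {
    const char *file = get_filename();
    /* remaining steps (existence check, then create or read) are likewise delegated to functions */
}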
First off, pardon my C syntax. I did not compile this and haven’t written actual C code in a while. But, you get the idea. Here we have an implementation where details are mercifully abstracted away into functions. We want the name of the file, so we call “get_filename()” and let someone else handle it. We want to know if the file exists, so we abstract that as well. Same goes for creating or reading the file. The main routine is much more legible, and, better yet, other people can also call these functions, so you don’t need to copy and paste code or fix errors in multiple places if there are problems.
Procedural programming is powerful, and it can be used to produce very clean and readable code. Many of those who use it do just that. (Though many also don’t and pride themselves instead on packing the most conceptual C functionality into a single line of hidden pointer arithmetic, ternary operators and assignments in control structures, but I digress.) And because of its power and long history of success, it imprinted itself very clearly on the minds of people who used it for years and got used to its idioms and nuances.
Let’s think about how a procedural programmer tends to reason about code. That is, there is some main function, and that main function calls a sub-routine/function to handle some of its workload. It delegates to a more abstract function to handle things. Unlike the first example, as procedural code grows, it doesn’t get longer and harder to read. Instead, it grows into more files and libraries. The functions called by main are given their own functions to refer to, and the structure of the program grows like a tree, rather than a beanstalk to the heavens. Main is the root, and it branches out to the eventual leaves which are library functions.
Another way to think of this is command and control. Main is like the White House. It makes all of the big decisions and it delegates to the Cabinet for smaller but still important things. Each Cabinet member has his or her own department and makes all of the big decisions, delegating to underlings the details of smaller picture operations. This continues all the way down the chain of government until someone somewhere is telling someone how much bleach to use when cleaning the DMV floor. The President doesn’t care about this. It’s an inconsequential detail that should be handled without his intervention. Such is the structure of the procedural program as well. It mirrors life in that it mirrors a particular metaphor for accomplishing tasks – the command and control method of delegation.
The reason I go into all of this detail is that I want you to get inside the mind of someone who may be resistant to the concept of inversion of control. If you’re used to procedural programming and the command and control metaphor, then you’re probably nodding along with me. If you’re a dyed-in-the-wool OO programmer who uses the Spring framework or some other IOC container, you’re probably getting ready to shout that your code isn’t the US government. That’s fine. We’ll get to that. But for now, think about your procedural-accustomed peer and realize that what you’re suggesting to him or her seems like the equivalent of asking the President of the US to run out to buy bleach for the guy at the DMV to clean the floor. It’s preposterous without the proper framing.
A New Way of Thinking
So, what is the proper framing? Well, after procedural code was well-established, the idea of object-oriented programming came into existence. On its face, this was a weird experiment, and there was no shortage of people who saw this as a passing fad. OOP flew completely in the face of established practice and, frankly, common sense. Instead of having procedures that delegated, we would now have objects that encapsulated properties and functionality. It sounds completely reasonable now, but this was a relatively revolutionary concept.
In the early stages of OOP, people did some things that were silly in retrospect. People got object-happy and in a rush to the latest, greatest thing, created objects for everything. No doubt there were for loop and while loop objects and someone had something like Conditional.If(x == 5).Then().Return(x); On the opposite end of the spectrum, there were some people who had been writing great software with procedural code for 20 years and they weren’t about to stop now, thank-you-very-much. And C++, the most popular early OOP language, put out places at the table for both camps. C++ allowed C programmers to write C and compile it with the C++ compiler, while it allowed OOP fanatics to pursue their weird idioms before eventually settling down into a good rhythm. The publication of books about patterns and anti-patterns helped OOP fans continue their progress.
As these groups coexisted, the early-adopters blazed a trail and the late-adopters grudgingly adopted when they had to. The problem was in how they went about adopting. To a lot of people in this category, a “class” was roughly equivalent to a procedural library file with a group of associated functions. And a class can certainly serve this purpose, despite the point of OOP being completely missed. Anybody who has seen classes with names like “FileUtils” or “FinancialConversions” knows what I’m talking about. These are the calling cards of procedural programmers ordered to write in an object-oriented language without real introduction to object-oriented concepts.
So what? Well, the end-game here is that this OOP/procedural hybrid is firmly entrenched in countless legacy applications and even ones being developed today by people who learned procedural thinking and never really had that “Aha!” moment with object-oriented thinking. It isn’t simply that classes in these applications function as repositories for loosely related methods, but that the entire structure of the program follows the command and control metaphor in an object-oriented world.
And, what is the object-oriented world? I personally think a good metaphor for it is Legos. If you’re a kid with a bunch of Lego kits and parts at your disposal and you want a scene of a bunch of pirate ships or space ships doing battle, you build all of the little components for a ship first. Then, with the little components assembled, you snap them together into larger and larger components until you’ve built your first ship. Sometimes, you prototype a component and use it in multiple places. You then repeat this as necessary and assemble the ships into some grand imitation of adventure on the high seas. This is the fundamental nature of object-oriented programming. There is no concept of delegation in the command and control sense — quite the opposite — the details are assembled first into ever-larger pieces. An equally suitable and less childlike metaphor may be the construction of a building.
As a procedural “President,” you would be ill at ease in this world. You stand there, demanding that non-existent ships assemble themselves by having their hulls assemble themselves by having their internal pieces assemble themselves. You’re yelling a lot, but nothing’s happening and there is, actually, nobody and nothing there to yell at.
Of course, the procedural folks aren’t actually that daft, so what they do instead is force the Lego world to be a command and control world. They lay out the ship’s design and architecture, but they also add functionality to the ship so that it constructs its own internals. That is to say, they start with small stuff like our object-oriented folk, but the small stuff is all designed to give the big things the ability to create the small things themselves. They do this at every level, giving every object extra responsibility and self-awareness so that at the end, they can have a neat, clean soapbox from which to say:
int main(int argc, char *argv[]) {
    Ship *ship1 = new Ship();
    Ship *ship2 = new Ship();
}
No fuss, no muss (after all the setup overhead of teaching your ships how to build themselves). You simply declare a ship, and tell it to build itself, at which point the ship creates a hull, which it tells to build itself, and so on down the line.
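To make that setup overhead concrete, here is a minimal sketch in Python of what the command and control ship tends to look like underneath. The class names (Ship, Hull, Plank) and the plank count are invented for illustration, not taken from any real code base.

```python
class Plank:
    def __init__(self):
        # Hard-coded detail buried at the bottom of the tree.
        self.kind = "pine"


class Hull:
    def __init__(self):
        # The hull is responsible for creating its own internals...
        self.planks = [Plank() for _ in range(3)]


class Ship:
    def __init__(self):
        # ...and the ship is responsible for creating its own hull.
        self.hull = Hull()


# The neat, clean soapbox: main just declares ships.
ship1 = Ship()
ship2 = Ship()
```

Notice that nobody outside of Plank gets a say in what the planks are made of. Every decision lives inside the thing being built.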
I realize that this sounds insane, probably even to those procedural programmers out there. But it only sounds that way because I’ve changed the metaphor. It made a lot more sense when the President was telling someone to have his people call their people. And, with that metaphor, the object-oriented approach sounded insane, as we’ve covered with the President buying bleach at the corner store for a janitor to clean a floor somewhere.
Getting it Right
So, back to inversion of control (IOC). IOC only makes sense in the Lego metaphor. If you eventually want to build a pirate ship, you start to think about what you need for a pirate ship. Well, you need a crow’s nest and a hull and a plank, and — well, stop right there. We’re getting ahead of ourselves. Before we can build a ship, we clearly need some other stuff. So, let’s focus on the crow’s nest. Well, we need some pieces of wood and that’s about it, so maybe that’s a good place to start. We’ll think about the properties of a piece of wood and define it. It’s probably got three spatial dimensions and a weight and a type, so we can start there. Once we’ve defined and described our wood, we can figure out how to assemble it into a crow’s nest, and so on.
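As a rough sketch of that bottom-up order, here is how the wood and the crow’s nest might look in Python. The field names, units, and the one-plank minimum are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Wood:
    """A single plank: three spatial dimensions, a weight, and a type."""
    length_cm: float
    width_cm: float
    thickness_cm: float
    weight_kg: float
    kind: str  # e.g. "pine"


class CrowsNest:
    """Assembled from planks that already exist; it never creates wood itself."""

    def __init__(self, planks: Sequence[Wood]):
        if not planks:
            raise ValueError("a crow's nest needs at least one plank")
        self.planks = list(planks)

    def total_weight_kg(self) -> float:
        return sum(p.weight_kg for p in self.planks)


# Smallest pieces first, then the component built from them.
planks = [Wood(100.0, 10.0, 2.0, 1.5, "pine") for _ in range(4)]
nest = CrowsNest(planks)
```

The wood is defined and built before anything that needs it, and the crow’s nest simply accepts what it is handed.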
Object-oriented programming is about creating small components to be assembled into larger components. And, inversion of control follows naturally from there — much more naturally than command and control. And it should follow into your programming.
If you’re creating a presentation tier class that drives some form in your UI, one of the first things that’s going to occur to you is that you need to obtain some data from somewhere. In the procedural world, you would say, “Aha! I need to define some service class that goes out and gets that data so I’ll have my presentation tier class instantiate this service and…” But stop. You don’t want to do that. You want to think in the Lego metaphor. As soon as you need something that you don’t have, it’s time to stop working on that class and move to the class that yours needs (if it doesn’t already exist).
But, before you leave, you want to document for posterity that your presentation tier class is useless without a service to provide data. What better way to do that than to make it impossible to create an instance of your class without passing in the service you need? That puts the handwriting on the wall, in 8000-point font, that your presentation tier class needs a service or it won’t work. Now you can go create the service, or find it if it exists, or ask the guy who works on the service layer to write it.
But where does the class calling yours get the service? Who cares. I say that with a period instead of a question mark because it’s declarative. That isn’t your problem, and it isn’t your presentation tier class’s problem. And the reason for that is that your pirate ship isn’t tasked with building its own hull and crow’s nest. Somebody else builds those things and hands them over for ship construction when they’re completed.
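Here is a minimal sketch of that idea in Python. CustomerService and CustomerFormPresenter are hypothetical names, and the hard-coded list stands in for whatever the real service would fetch.

```python
class CustomerService:
    """Hypothetical service-layer class that fetches data for the form."""

    def get_customer_names(self):
        # Stand-in for a real database or web-service call.
        return ["Anne Bonny", "Mary Read"]


class CustomerFormPresenter:
    """Presentation-tier class that cannot be created without its service."""

    def __init__(self, service):
        if service is None:
            raise ValueError("CustomerFormPresenter requires a service")
        self._service = service

    def rows_to_display(self):
        return [name.upper() for name in self._service.get_customer_names()]


# Whoever assembles the application builds the service and hands it over.
presenter = CustomerFormPresenter(CustomerService())
```

Calling CustomerFormPresenter() with no argument fails immediately, which is exactly the point: the dependency is announced in the constructor rather than hidden inside it.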
Back to the Code
That was a long and drawn out journey, but I think it’s important to drive home the different metaphors and their implications for the sake of this discussion. Without that, procedural and OOP folks are talking at each other and not understanding one another. If you’re trying to sell IOC to someone who isn’t buying it, you’re much better served understanding their thinking. And if you’re one of those resisting buying it, it’s time to get out your wallet because the debate is over. IOC produces cleaner code that is easier to maintain, test, and extend. Procedural coding has its uses, but if you’re already using an OO language, you don’t have a candidate for those uses.
So, what are the actual coding implications of inversion of control? I’ll enumerate some here to finish up.
1. Classes have fewer reasons to change
One of the guiding principles of clean code and a keystone member of SOLID is the “Single Responsibility Principle.” Your classes should have only one reason to change because this makes them easier to reason about, and it makes changes to the code base less violent and less likely to trigger regressions. If you use the procedural style to create classes, your classes will always have at least two reasons to change: (1) their actual function changes; and (2) the way they create their sub-components changes.
2. Classes are easier to unit test
If you’re looking to unit test a command and control pirate ship, think about everything that happens in the PirateShip’s constructor. It news up a hull, crow’s nest, etc., which all new up their internals, and so on, recursively down the structure of the application. You cannot unit test PirateShip at all. It encapsulates all of the functionality of your program. In fact, you can’t unit test anything except the tree leaves of functionality. Pretty much all of your tests are system/integration tests. If you invert control, it’s easy. You just pass in whatever you want to the class to poke it and prod it and see how it behaves.
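A small Python sketch of the poking and prodding, assuming a PirateShip that takes its hull as a constructor argument. FakeHull, leak_count, and is_seaworthy are all invented for the example.

```python
class PirateShip:
    """Receives its hull instead of constructing one."""

    def __init__(self, hull):
        self._hull = hull

    def is_seaworthy(self):
        return self._hull.leak_count() == 0


class FakeHull:
    """Test double: reports whatever leak count the test asks for."""

    def __init__(self, leaks):
        self._leaks = leaks

    def leak_count(self):
        return self._leaks


def test_ship_with_sound_hull_is_seaworthy():
    assert PirateShip(FakeHull(leaks=0)).is_seaworthy()


def test_ship_with_leaky_hull_is_not_seaworthy():
    assert not PirateShip(FakeHull(leaks=3)).is_seaworthy()


# Run the tests directly; a test runner would normally discover them.
test_ship_with_sound_hull_is_seaworthy()
test_ship_with_leaky_hull_is_not_seaworthy()
```

No real hull, planks, or anything else further down the tree gets constructed, so the tests exercise PirateShip and only PirateShip.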
3. No global variables or giant method signatures
Imagine that your crow’s nest wants to communicate with the ship’s rudder for some reason. These classes reside at the complete alpha and omega of the program and are tree leaves. In command and control style, you have two options. The first is to have all of the nodes in between the root and those leaves pass the message along in constructors or mutators. As the amount of overhead resulting from this gets increasingly absurd, most procedural programmers turn to option 2: the global variable (or, its gussied-up object-oriented counterpart loved by procedural programmers everywhere – the singleton). I’ll save that for another post, as it certainly deserves its own treatment in depth, but let’s just say, for argument’s sake and for the time being, that this is undesirable. Every class in the application doesn’t need to see the personal business of those two and how they communicate.
In the IOC model, this is a simple prospect. Because you’re building all of the sub-components and assembling them into increasingly large components, there is one place that has reference to everything. From the class performing all of that assembly, it’s trivial to link those two leaf nodes or, really, any classes. I can give one a reference to the other. I can create a third object and give them both references to it. There are any number of options that don’t involve hideous method signatures or globals.
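Sketched in Python, the first of those options might look something like this. The build_ship function plays the part of the assembling class, and every name and method here is hypothetical.

```python
class Rudder:
    def __init__(self):
        self.heading = "north"

    def turn(self, heading):
        self.heading = heading


class CrowsNest:
    """Knows about a rudder only because one was handed to it."""

    def __init__(self, rudder):
        self._rudder = rudder

    def report_rocks_ahead(self):
        self._rudder.turn("hard to port")


class PirateShip:
    def __init__(self, crows_nest, rudder):
        self.crows_nest = crows_nest
        self.rudder = rudder


def build_ship():
    """The one place that holds references to everything."""
    rudder = Rudder()
    crows_nest = CrowsNest(rudder)  # link the two leaves directly
    return PirateShip(crows_nest, rudder)


ship = build_ship()
ship.crows_nest.report_rocks_ahead()
```

PirateShip never relays the message and nothing is global. The two leaves were simply introduced to each other at assembly time.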
4. Changing the way things are constructed is easy
Speaking of having your assembly all in one place, swapping out components becomes simple. If I want the crow’s nest to use balsa wood instead of pine, I just go to the place that crow’s nest is instantiated and pass it something else. I don’t have to weed through my code base looking for the class and then trace the instantiation path to where I need it. All of the instantiation logic happens centrally. This makes it way easier to introduce conditions for using different types of construction logic without cluttering up your classes, which don’t need to care how their components are constructed. In fact, if you use Spring or some IOC container, you can abstract this out of your program altogether and have it reside in some configuration file, if you’re feeling particularly “enterprisey” (to borrow a term from an amusing site I like to visit).
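As a rough illustration in Python, with a plain dictionary standing in for the configuration file (the key name crows_nest_wood and the build_ship function are made up for this sketch, and this is not how Spring itself is configured):

```python
class CrowsNest:
    def __init__(self, wood_kind):
        self.wood_kind = wood_kind


class PirateShip:
    def __init__(self, crows_nest):
        self.crows_nest = crows_nest


def build_ship(config):
    """All instantiation decisions live here, not inside the classes."""
    wood_kind = config.get("crows_nest_wood", "pine")
    return PirateShip(CrowsNest(wood_kind))


default_ship = build_ship({})
light_ship = build_ship({"crows_nest_wood": "balsa"})
```

Switching from pine to balsa touches one line of assembly code or one entry of configuration. CrowsNest and PirateShip don’t change at all.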
5. Design by contract becomes easy as well
This is another thing to take at face value for the time being, but having your components interact via interface is much easier this way. Interface programming is a good idea in general. But, if you’re not inverting control, it’s kind of pointless. If all of your object creation is hard-coded throughout the application, interfacing is messy. If your PirateShip is creating a CrowsNest and you want a CrowsNest interface with command and control, you’d have to pass some enumeration (or use some global) into PirateShip to tell it what kind of CrowsNest to instantiate. This, along with some of the other examples, demonstrates the first point about code bloat and the SRP. As I’m introducing these new requirements (which generally happen sooner or later), our procedural classes get bigger, more complicated, and more difficult to maintain. Now they not only need to instantiate their components, but also make decisions about how to instantiate them based on additional variables that you need to put in them. And, I promise, there will be more.
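For contrast, here is a hedged sketch in Python of the inverted version, using an abstract base class as the interface. The two concrete crow’s nests, the lookout_range_km method, and the numbers are all invented for illustration.

```python
from abc import ABC, abstractmethod


class CrowsNest(ABC):
    """The contract: anything that can say how far the lookout can see."""

    @abstractmethod
    def lookout_range_km(self) -> float:
        ...


class OpenCrowsNest(CrowsNest):
    def lookout_range_km(self) -> float:
        return 20.0


class CoveredCrowsNest(CrowsNest):
    def lookout_range_km(self) -> float:
        return 12.0


class PirateShip:
    """Depends only on the CrowsNest contract, never on a concrete class."""

    def __init__(self, crows_nest: CrowsNest):
        self._crows_nest = crows_nest

    def can_spot(self, distance_km: float) -> bool:
        return distance_km <= self._crows_nest.lookout_range_km()


ship = PirateShip(OpenCrowsNest())
```

There is no enumeration and no switch statement inside PirateShip. Adding a third kind of crow’s nest means writing one new class and passing it in.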
So, I hope some of you reading find this worthwhile. I’m not much of a proselytizer for or true adherent to any one way of doing things. Procedural programming and other styles of programming have their places (I literally just used ‘goto’ today in writing a batch script for executing some command line utilities). But, if you’re in the OO world, using the OO metaphors and surrounded by OO people, it is clearly in your best interest to adapt instead of fight.