Stories about Software


I Love Debugger

Learning to Love Bugs

My apologies for the titular pun in reference to “I Love Big Brother” of iconic, Orwellian fame, but I couldn’t resist. The other day, I was chatting with some people about the idea of factoring large methods into smaller, more focused ones and one of the people chimed in with an objection that was genuinely new to me.

Specifically, the objection was that giant methods tended to be preferable because they kept the stack trace flat and made it easier to have everything “all in one place” when you were (inevitably) going through the code in the debugger. My first, fleeting thought was to wonder if people really found it that difficult to ctrl-tab between classes, but I quickly realized that this was hardly the important problem here (and really, to each his or her own). The bigger problem, as I explained a moment later and have since thought through in more detail for this post, is that you’re writing code that is more likely to generate defects so that, when you’re tasked with fixing those defects, you feel more comfortable.

This is like a general housing contractor saying, “I prefer to use sand as a building material over wood or brick for houses I build because it’s much easier to work with in the morning after the tide destroys the house each night.”

Winston realized that two equals four and that the only way to prevent bugs is to cause them. Winston happily declared, “I love Debugger!”

More Bugs? Prove It!

So, if you’re a connoisseur of strict logic in debating, you’ll notice that I’ve begged the question here with my objection. That is, I ‘proved’ the reasoning fallacious by assuming that larger methods mean more bugs and then used that ‘proof’ as evidence that larger methods should be avoided. Well, fear not. A group of researchers from Stanford did an empirical analysis of OS bugs, and found:

Figure 5 shows that as functions grow bigger, error rates increase for most checkers. For the Null checker, the largest quartile of functions had an average error rate almost twice as high as the smallest quartile, and for the Block checker the error rate was about six times higher for larger functions. Function size is often used as a measure of code complexity, so these results confirm our intuition that more complex code is more error-prone.

Some of our most memorable experiences examining error reports were in large, highly complex functions with contorted control flow. The higher error rate for large functions makes a case for decomposition into smaller, more understandable functions.

This finding is not unique, though it nicely captures the issue. During my time in graduate school, in a class on advanced topics in software engineering, we did a unit on the relationship between various coding practices and the likelihood of bugs. A consistent theme was that as function size grows, the number of defects per line of code grows (in other words, the number of defects per function grows faster than the number of lines per function).

So, What Now?

In the end, my response is quite simply this: get used to a more factored and distributed paradigm. Don’t worry about being lost in files and stack traces in the debugger. Why not? Well, because if you follow Uncle Bob Martin’s advice about factoring methods to be 4 or 5 lines, you wind up with methods that descriptively tell you what they’re going to do and do it perfectly. In other words, you don’t need to step into them because they’re too simple and concise for things to go wrong.
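As a rough illustration of what that looks like, here is a minimal sketch in Python (the language is incidental; the same shape works in C#). The order-total domain, the bulk discount rule, and every function name are hypothetical, invented only for this example. The top-level function reads as a description of what happens, and each helper is short enough to verify by reading it.

```python
BULK_THRESHOLD_CENTS = 5000  # hypothetical rule: orders this large get a discount
BULK_DISCOUNT_CENTS = 500


def order_total(prices_cents, tax_percent):
    """Return the amount owed for an order, in cents, tax included."""
    subtotal = sum_of(prices_cents)
    discounted = apply_bulk_discount(subtotal)
    return add_tax(discounted, tax_percent)


def sum_of(prices_cents):
    """Add up the line-item prices."""
    return sum(prices_cents)


def apply_bulk_discount(subtotal_cents):
    """Take a flat amount off orders that reach the bulk threshold."""
    if subtotal_cents >= BULK_THRESHOLD_CENTS:
        return subtotal_cents - BULK_DISCOUNT_CENTS
    return subtotal_cents


def add_tax(amount_cents, tax_percent):
    """Add tax at a whole-number percentage, rounding down to the cent."""
    return amount_cents + amount_cents * tax_percent // 100
```

With a breakpoint in `order_total`, each of the three calls is a candidate for stepping over rather than into, since its name and body leave little room for surprise.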

In this fashion, your debugging becomes different. You don’t have a pen and paper, a spreadsheet, a stack trace window, and row after row of “immediates” all to keep track of what on Earth is going on. You set a breakpoint somewhere, and any method calls are innocent until proven guilty. You step over everything until something fishy happens (or until you become a client of some lumbering beast of a method that someone else wrote, which is virtually assured of having defects). This approach is almost universally rejected at first but infectious with time. I know that, as a “no bigger than the screen” guy originally, my initial reaction to the idea of all methods being 4 or 5 lines was “that’s stupid”. But try it sometime and you won’t go back.

Bye, Bye Debugger!

If you combine small factored methods and unit tests (which tend to have a natural synergy), you will find that your debugger skills begin to atrophy. Rather than reasoning about the code at runtime, you reason about it at compile time. And, that’s a powerful and important concept.

Reasoning about code at run time is programming by coincidence, as made famous by one of my favorite programming books. I mean, think about it — if you need the debugger to understand what the state of the code is and what’s going on, what you’re really saying when you build and run is, “I have no idea what this code is going to do by inspecting it, so I need to run the entire application to understand it.” You don’t understand your own code while you’re writing it. That’s a problem!

When you write small, factored methods and generally tested and decoupled code, you don’t have this problem. Take this to its logical conclusion and imagine a method that takes two int parameters and returns an int representing their sum. Do you need to set breakpoints and watches, tag immediate variables and look at a stack trace to know what this method will do? Of course not! You can reason about this method at compile time and take for granted that it will do the right thing at run time. When you write code like this, you’re telling the application how to behave. When you find yourself immersed in the debugger for three quarters of your day, you’re not dictating how the application will behave. Instead, you’re begging it to work as a kind of prayer since it’s pretty much out of your hands what’s going to happen. Don’t buy it? How many times have you been at your desk with a deadline looming saying “please, just work — why won’t you just work!?!”
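The two-integer sum is small enough to write out. The sketch below uses Python with type hints standing in for the int parameters; the names `add` and `AddTests` are made up for the example. The tests simply record what a reader already concludes from the one-line body.

```python
import unittest


def add(first: int, second: int) -> int:
    """Return the sum of two integers."""
    return first + second


class AddTests(unittest.TestCase):
    """Executable statements of what the method is expected to do."""

    def test_adds_two_positive_numbers(self):
        self.assertEqual(5, add(2, 3))

    def test_adding_zero_changes_nothing(self):
        self.assertEqual(7, add(7, 0))

    def test_adds_negative_numbers(self):
        self.assertEqual(-1, add(2, -3))
```

Saved in a file, these can be run with `python -m unittest`. If a debugger is ever needed, a single test method is a much smaller entry point than the whole application.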

This isn’t to say that I never use the debugger. But, with a combination of TDD, a continuous testing tool, and small, factored methods, it’s fairly rare. Generally, I just use it when my stuff is integrated with code not done this way. For my own stuff, if I ever do use it, it’s from the entry point of a unit test and not the application.

The cleaner the code that I write, the more my debugger skills atrophy. I watch in amazement at peers that are incredible with the debugger — and I say that with no irony. Some of them can get it to do things I didn’t realize were possible and that I freely admit are very cool. I don’t know how to do these things because I’m out of practice. But, I consider that good. If you’re getting a lot of practice de-bug-ing your code, it means you’re getting a lot of practice writing code with bugs in it.

So, let’s keep those methods small and get out of the practice of generating bugs.

(By the way, I’m going to be traveling overseas for the next couple of weeks, so this may be my last post for a while).


Preserve Developer Mindshare – Don’t Nitpick


I’m not particularly interested in marketing principles in the commercial sense of the word (though I find the psychology of argumentation and persuasion to be fascinating), so please excuse any failed parallelism in advance. Today, I want to talk about the concept of mind share, but to apply it to the life of a work-a-day developer.

For those not familiar with the concept, mind share is the awareness that a consumer has about a particular product. For instance, if I say “smart phone”, the first things that pop into your head are probably “iPhone”, “Android”, “Blackberry”, perhaps in exactly that order. If that’s the case, iPhone has a larger mindshare from your perspective as a consumer than Blackberry or Android.

Another concept that comes into play is referred to in the linked Wikipedia article as “evoked set”. This refers to the set of items that you’ll think of at all without some kind of researching or prompting. In our example above, you didn’t think of Windows Mobile, and now that you read the name, you probably think, “oh yeah, them.” If that’s the case, your evoked set is the first three, and Windows Mobile is out in the cold.

But let’s come back to this later.

A Modest Proposal

The other day, I happened to overhear the substance of a code review. The code was some relatively minor set of changes, and so the suggested fixes and changes were also relatively minor and unremarkable, with one exception that was interesting because it was new to me. The reviewer requested that the developer use the Visual Studio utilities “Sort Usings” and “Organize Usings”. For those not familiar with .NET, this is the equivalent of sorting your package imports in Java or sorting/organizing your #includes in C++. The only difference is that in C#/.NET, this is functionally useless from the compiled code perspective. That is, C# took a lesson from its counterparts and had its compiler take care of this housekeeping. From a developer’s point of view, this only potentially has ramifications in terms of additional intellisense overhead.

Still, this struck me on the surface as a good practice, albeit one I had never really considered. I suppose that unused using statements are a form of dead code, having intellisense perform better is a mild plus, and sorting them is probably… nice, I guess… for those who ever inspect the using statements. I am not one of those people — I never write them because of Ctrl-Period, and I never look at them. I used to remove them because of the CodeRush issues list for a file, but I tend to turn that off since it tends to unceremoniously remove the Linq extension methods and leave me with non-compiling code (fingers crossed for a fix in a future version).

Back to the story, the reviewer then went on to state that this would be required to ‘pass’ any future code reviews that he did. In spite of the apparent tiny benefit conferred by this practice, something about this proclamation seemed a little off and problematic to me. But, it slipped out of my mind in favor of more pressing matters until I was going through the process of promoting some code in a different scenario the other day, and suddenly, the unfocused nagging issue leaped into full view for me.

Anatomy of a Code Promotion

Generally speaking, a developer’s task is a simple one: implement features, fix any defects. So, if you’re given a task to implement, you implement it, take a moment to pat yourself on the back and move on. And, that’s what I did during my epiphany. Except, er no, wait.

I was using Rational Clear Case, so what I actually had to do was finish the change, and then check in my code. From there, to promote it, I had to open up Clear Case explorer, find my view, right click, and say “Deliver from Stream to Default”. From there, I had to launch Clear Case Project Explorer, find the integration stream, and click “Make a Baseline”. The policy is to name the baseline the default appended with an underscore and my login name. After that, I had to recommend the baseline. Ugh (and double Ugh for Clear Case as source control). Suddenly my life as a developer is not so simple. That’s no longer a one step process, but some number greater than one depending on our standards for granularity.

But wait, crap. I didn’t run all the unit tests to make sure the build wasn’t broken (actually, I tend to be fanatical about that, but I’m making a rhetorical point). I also didn’t run StyleCop to make sure I was conforming to the set of coding standards we have, nor did I run my other static analysis tools to check for simple mistakes. Alright, so time to do all that, and re-deliver.

But wait. Clear Case forces a rebase operation prior to code delivery (the equivalent of SVN update). And, it’s generally good practice to run all of your tests and analysis tools prior to and after a rebase to make sure that you know whether you are responsible for any broken tests, standards violations, etc or whether you inherited them. Man, this is getting intense.

So alright, promotion process is check your code for correctness, run all tests, run all static analysis. Then, rebase and do all that again. Then, follow that whole rigmarole about delivery and making baselines. My goodness — I haven’t even considered that I might have forgotten to add a file, so I should probably grab a clean copy of everything from source control and rebuild and, if anything breaks, re-deliver. And I haven’t even mentioned the possibility of handling merge conflicts.

Oh, and I now need to sort and organize my using statements. That seemed like a decent idea a few paragraphs ago, but now…

(I realize that there are optimizations that could be made to this particular process — different source control, continuous integration, etc. Point is, just about every process has some warts and, even if it doesn’t, managing concurrent changes and standards in a group environment requires more thought than we realize as we get accustomed to the process.)

Mindshare Revisited

In the face of all of this stuff, the mindshare metaphor begs consideration. If fixing our defects and implementing our features isn’t the iPhone, we’re in big trouble. From there, running unit tests and static analysis tools probably ought to be Android and Blackberry, but they may get pushed out a bit in favor of the particulars of wrangling the source control system and resolving merge conflicts, depending on the source control system and merge tool.

As we add more things, we have two options. We can either reduce the mind share of existing things in our evoked set, or we can spend time and energy expanding our evoked set. So, if we want to hold our efficiency of feature implementation constant, we’re going to have to leave some things out of our mindshare (and then perhaps be reminded of them at code reviews or with exasperated emails from team members with different evoked sets than ours, which we trade for exasperated emails of our own at things missing from their evoked sets). Alternatively, if we want to expand our mindshare, it’s going to come at the cost of a steep learning curve for all newer members and decreased efficiency across the board as we go through our rote checklist prior to each delivery.

Getting It Right

I don’t care for either of these options. So, I have two suggestions for people as the number of sticky notes and strings around our fingers grows in order to promote code.

  1. Don’t sweat the small stuff.
  2. Automate as much as possible.

In the case of “organize and sort usings”, I’d offer item (1). Something that provides no benefit to the end-product and questionable benefit to the development environment is something that ought not to occupy our mindshare as developers. But, in case I am just flat out wrong in my assessment of the benefit/detriment analysis, I’d offer option (2). Given that this is already implemented in Visual Studio, a small plugin running on the build machine could ensure that the using statements in all checked in code are always optimized, without adding to the maze of things developers have to remember.
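To give a sense of how little machinery the sorting half of this needs, here is a sketch in Python of a build-step helper. It is not how Visual Studio implements the feature, and the function name `sort_usings` is an invention for this example. It assumes the using directives form one contiguous block at the very top of the file, sorts them by namespace in plain alphabetical order, and drops exact duplicates. It stops at the first line that is not a simple `using Namespace;` directive, so aliases and `using static` are left alone. Removing unused directives is deliberately out of scope, since that takes compiler knowledge that a text filter does not have.

```python
import re

# Matches a plain directive such as "using System.Linq;" and captures the namespace.
USING_LINE = re.compile(r"^\s*using\s+([\w.]+)\s*;\s*$")


def sort_usings(source):
    """Sort and de-duplicate the block of using directives that opens a C# file."""
    lines = source.splitlines()
    namespaces = set()
    count = 0
    # Walk the leading block of directives; stop at the first other line.
    while count < len(lines):
        match = USING_LINE.match(lines[count])
        if match is None:
            break
        namespaces.add(match.group(1))
        count += 1
    usings = [f"using {name};" for name in sorted(namespaces)]
    ending = "\n" if source.endswith("\n") else ""
    return "\n".join(usings + lines[count:]) + ending
```

Run over every checked-in file on the build machine, something along these lines would keep the directives tidy without anyone having to remember to do it.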

And, to expand on this, I’d suggest that we in general move as many things into the (2) camp as possible, if we value them. Things like coding standards, static analysis, best practices, etc do matter, so why not enforce them with automatic, gated checkins or code transforms on the build machine? That ensures they’re always right, without forcing up front memorization and, more importantly, without distraction from the most important problem: “implement features and fix defects”. The closer to 100% of our mindshare that iPhone occupies, the better for all project stakeholders.
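The gating idea itself is simple enough to sketch. In the Python below, a check is any function that returns True when the code passes; the names `run_gate` and `may_check_in` are hypothetical, and in a real setup each check would presumably wrap a call to the test runner or an analysis tool rather than a lambda.

```python
def run_gate(checks):
    """Run every named check in order and return the names of those that failed."""
    failures = []
    for name, check in checks:
        if not check():
            failures.append(name)
    return failures


def may_check_in(checks):
    """A change passes the gate only when no check fails."""
    return not run_gate(checks)
```

A usage example with stand-in checks:

```python
checks = [
    ("unit tests", lambda: True),
    ("coding standards", lambda: False),
    ("static analysis", lambda: True),
]
print(run_gate(checks))      # ['coding standards']
print(may_check_in(checks))  # False
```

The list of checks lives in one place on the build machine, so adding a new standard means adding one entry there rather than one more thing for every developer to remember.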


WordPress, Twenty-Ten and Image Resize

I discovered today that image resizing wasn’t working for the blog.  Amazingly, I’d never had an occasion where I cared about resizing an image until just now.  But, when I did, I discovered a frustrating thing.  In the “what you see is what you get” (WYSIWYG) editor for WordPress, the images were appearing correctly and looking good according to my sizing.  But in preview mode, they were rendering at their original size.

After a bit of poking around, I discovered that my style.css was defaulting image height and width to auto.  I removed this (under #content img), and everything was as I would expect.  This seems obvious if you’re just talking about the CSS of some page, but I didn’t think of it off the bat, since the theme’s handling of this seems to run completely counter to WordPress.


Abstractions are Important Part 2 – Good Abstractions

Last time, I talked about the importance of abstractions in relation to a piece of service code that I had seen. I showed some examples of good versus bad abstractions and talked about why they were good or bad. This time, I’d like to expand on that concept and explore what separates good abstractions from bad ones in general.

What are Abstractions, Anyway?

If we’re going to talk about abstractions in general rather than simply by example, it probably makes sense to define the term a bit more formally. Wikipedia has a fairly good definition for it:

In computer science, abstraction is the process by which data and programs are defined with a representation similar in form to its meaning (semantics), while hiding away the implementation details.

In other words, an abstraction is a way to say to your clients, “give me the gist of what you want to do and let me worry about the details of how.” If you’re a client of the abstraction, you trust the provider to handle the details correctly.

An Example Abstraction

Abstraction, as a concept, is not limited to the problem domain of programming. Let’s consider an abstraction that has nothing to do with programming, on its face: the pizza shop. The pizza shop abstracts the process of making a pizza from its customers, allowing customers to specify basic properties of their desired pizza while the shop handles more granular ones.

If you call a pizza shop, you tell the shop that you want a large pizza with pepperoni on it. What you typically don’t do is specify how much pepperoni (aside from perhaps in vague terms like “extra” or “light”) nor do you specify the exact dimensions of “large”. These are some of the details that are simplified for you. Others are hidden entirely such as how much basil goes in the marinara sauce or the temperature at which the ovens are set for pizza cooking.

The procedure in general is a simple one. You specify a few rudimentary details about the pizza and whether you want delivery or not, and the shop responds with a time estimate and then, later, a pizza. However, this is an excellent abstraction (as one might surmise by the popularity and ubiquity of its implementation).

So, what makes an abstraction ‘good’? How do we make sure that the ones we’re creating are good?

Exposes Details That Make It Useful

Any abstraction has to expose some level of detail to clients or it would be useless. When calling the pizza place, you are aware, obviously, that you want a pizza. This is unavoidable. On top of that, the pizza place also allows you to specify size and a lot of the ingredients of the pizza. This ensures that you will get as much or as little food as you want and that dietary restrictions and considerations are met. In addition, the pizza place (usually) allows you to specify whether you want to eat there, carry the pizza home or have it delivered. This is another angle for the abstraction as it allows the pizza shop to accommodate your location preference for where you want to eat.

Making the abstraction useful is vital or else nobody would actually use it. In the world of pizza parlors, if one opened up that served only small, mushroom pizzas for carry out, it probably wouldn’t last very long. Even assuming there were no such thing as competition, people would probably opt to make their own pizzas most of the time rather than agree to such specific restrictions. No one would make much use of this abstraction.

Hides Details That It Needs to Control

The flip side of exposing enough detail to make it useful is hiding details that need to be controlled. Imagine the opposite of our “you only get small, mushroom pizzas for carry out” situation, where a pizza parlor allowed specification of any detail, however minute. Customers could call and say that they wanted a pizza cooked in a natural cave at 193 degrees Celsius, infused with rare spices from a remote island, and delivered at 4 AM.

The impact on a shop of catering (or attempting to cater) to this level of detail would be disastrous. Imagine the logistics of having to dispatch employees to whatever location customers demanded. Or, imagine the expense incurred in obtaining certain ingredients. And, imagine the absurdity of a pizza place staying open 24/7/365. These things would be the result of too much permissiveness with the abstraction. This abstraction hides no details from its users and, by relinquishing all control over operational details, it allows its users to put it into unprofitable, preposterous modes of operation.

Is Understandable and Intuitive to Clients

If usefulness and guarding against damaging levels of control by clients are table stakes for having an abstraction that can hope to survive, understandability and intuitiveness are necessary to thrive. One of the reasons the pizza place is so successful is that it’s a relatively universal, common-sense and simple abstraction. Whatever slight variations may exist, you will generally have no issues ordering pizza from a place even if you’ve never ordered from there before.

“Ask for food”, “add a few ingredients”, “specify where you want the food”, and “pay for food” are all very simple concepts. This simplicity is a big part of the reason that when you’ve checked into a hotel and are tired, you fall back to ordering a pizza instead of ordering, say, tapas or hibachi or going out and buying a bunch of groceries. “How do I get there?”, “Where can I cook this?”, “Do you have a way for me to take this home?”, etc are questions you don’t need to ask. This simplicity and universality makes the abstraction a wildly successful one.

Prevents (or Limits) Client Mistakes

It’s crucial to limit mistakes that clients can force you to make, and it’s almost as crucial to prevent clients from making their own mistakes. The former might blow up your abstraction before it gets going, whereas the latter, like intuitiveness, is important for gaining adoption. One of the attractions of ordering a pizza is that it’s unlikely to end in disaster. Oh, it might not be cooked to perfection or it might generally be mediocre, but it won’t set your oven on fire or bubble over into a gigantic mess during cooking.

The reason for this is that the pizza restaurant abstraction removes a lot of the potential problems of cooking a pizza by hiding the process from the customer and leaving it safely in the hands of a specialist. Nothing about specifying the size or the toppings of the pizza gives me the ability to make a decision that somehow causes the pizza to be overcooked or the meat toppings to be dangerously undercooked.

Back to the Code (and are you even making abstractions)?

So, what does all of this mean for coding? I would argue that since a pizza place is really just a process abstraction, we can translate these lessons directly to code. Exposing things to make the abstraction useful while hiding things that would cripple it is fairly straightforward to do, provided that you think in abstractions. I might have a database access abstraction that allows users to specify connection credentials but internally prevents things like multiple connections, dropped connections, etc. In this fashion, I can allow users to connect with different levels of privilege, but I can prevent them from inadvertently getting my class into some invalid state.

Likewise, I can create intuitive operations such as “create new record” or “delete record” that hide ugly details like SQL statements and transactions. This presents an intuitive and inviting interface that’s pleasant to use. And, in addition to providing a minimum guarantee of my abstraction’s own functionality, I can at least assist in saving them from their own lack of familiarity with the abstraction. Fail early goes a long way toward this — I can throw descriptive exceptions if they try to delete nonexistent records, rather than leaving them to decipher what SQL Error 9024B means. This is the equivalent of the pizza place operator saying, “I’m sorry sir, but ordering a negative six inch pizza makes no sense — we don’t offer that.” In real life, this “fail early” approach is much better than a delivery guy showing up empty handed and leaving it to you to figure out why no pizza arrived.
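Here is a minimal sketch of that idea in Python, using the standard library’s sqlite3 module as a stand-in for whatever database sits underneath. The class `RecordStore`, the exception `RecordNotFoundError`, and the single-column table are all hypothetical, chosen only to keep the example short. A database path stands in for connection credentials here. Callers see “create” and “delete”, never SQL or transactions, and a bad request fails with a message written in their terms.

```python
import sqlite3


class RecordNotFoundError(Exception):
    """Raised when a caller refers to a record that does not exist."""


class RecordStore:
    """Hypothetical store: callers work with records, never SQL."""

    def __init__(self, path=":memory:"):
        # The store owns exactly one connection; callers never see it.
        self._connection = sqlite3.connect(path)
        with self._connection:
            self._connection.execute(
                "CREATE TABLE IF NOT EXISTS records ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
            )

    def create_record(self, name):
        """Store a new record and return its identifier."""
        if not name:
            # Reject nonsense before the database is ever involved.
            raise ValueError("A record needs a non-empty name.")
        with self._connection:  # commits on success, rolls back on error
            cursor = self._connection.execute(
                "INSERT INTO records (name) VALUES (?)", (name,)
            )
        return cursor.lastrowid

    def delete_record(self, record_id):
        """Delete a record, or say plainly that it was not there."""
        with self._connection:
            cursor = self._connection.execute(
                "DELETE FROM records WHERE id = ?", (record_id,)
            )
        if cursor.rowcount == 0:
            raise RecordNotFoundError(
                f"Cannot delete record {record_id}: no such record exists."
            )

    def close(self):
        """Bookend for construction: release the underlying connection."""
        self._connection.close()
```

Deleting the same record twice raises `RecordNotFoundError` with a readable message on the second call, which is the code-level version of the operator declining the negative six inch pizza.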

To pull back a bit, I think it’s important to consider the pizza shop or a similar metaphor when writing methods/classes/modules. Don’t simply write code that is technically functional, strewing it willy-nilly about various classes and locations. Don’t write code by coincidence while the debugger is running, setting flags and whatnot to get things working for your exact scenario. Don’t go with the philosophy “ship it if it works.”

Instead, when writing code, imagine that you’re creating a metaphorical pizza place. Who are your ‘customers’? Answer that question, and it becomes easy to answer “what do they want from me?” Answer this question well before “how do I get them what they want?” The “what” is your public API — the useful thing you’re going to provide. The “how” is the detail(s) that you hide from them, for their own good and the operational good of your code. The intuitiveness of your public API is going to be determined by answering questions like “am I logically book-ending operations — if I allow them to open something, do I allow them to close it?” or “if I read this method name and its parameters aloud, is it clear what this does” or “do all parts of my public API do what they say they’re going to do?” If you’re answering yes to these questions, your pizza shop is looking good. If not — if you’re sending sandwiches or sushi when the customers order a medium pepperoni pizza — your abstraction (code) is probably doomed.

And, I hate to bring unit testing into everything, but there’s really no avoiding this — if you write unit tests and especially if you practice TDD, you’re a lot more likely to have better abstractions. Why? Well, you’re your own first customer. This is like going “undercover boss” and ordering pizza from your own shop to see how the experience goes. When you write tests, you’re using your public API, and if you’re muttering things like “what are all these parameters” or “why do I have to pass that in?!?” you’re getting early feedback on your own abstraction. I’ve rarely seen an abstraction that inspired me to react with a “wat?!?” and gone on to find a nice set of unit tests covering it. Tests seem to function as insurance against boneheaded abstractions.

A Checklist For Abstractions

I’ll close out with a set of suggestions — a checklist for evaluating whether you’re creating good, usable abstractions or not.

  1. Does your API operate at a consistent level of abstraction — do you avoid having some methods that require users to pass you SQL statements and others that encapsulate this detail, for example?
  2. Do your methods generally have two or fewer parameters (more parameters making it increasingly hard on users to intuitively understand the method)?
  3. Do your methods have succinct but communicative names like “CreateEntry(Entry entryToCreate)” as opposed to needlessly verbose (“CreateRecordThatIsGoingToGoInTheDatabase(Entry entry)”) names that are hard to type and remember or weirdly succinct names (“CE(Entry e)”)?
  4. Do your methods lie? Does “CreateEntry(Entry entryToCreate)” actually delete an entry, or perhaps less egregiously, create an entry sometimes, unless the entry has a certain flag set true, in which case it quietly fails?
  5. Do you avoid forcing weird details on your clients, such as asking them to store boolean flags?
  6. Do you avoid multiple return values (i.e. out/ref parameters)?
  7. Do you somehow communicate what exceptions your methods might throw?
  8. Do you limit the number of methods per class so that reading through the documentation or IDE assistance is not painful?
  9. Do you avoid forcing your clients to violate the Law of Demeter? That is, in order to get a “D”, do you force your clients to call getA().getB().getC().getD(); ?
  10. Do you practice command query separation in your public API?
  11. Do you limit or eliminate exposing public state, and especially flags?
  12. Do you limit temporal couplings that force your clients to call your methods in a specific order?
  13. Do you avoid deep inheritance hierarchies that make it unclear where the members of the public API actually come from?
  14. Do your publicly exposed classes have a single, obvious responsibility — do you avoid exposing swiss-army-knife classes with a mish-mash of different functionalities?
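Item 9 is one of the easier ones to show in code. The sketch below is in Python, and the `Order`, `Customer` and `Address` classes are hypothetical. Instead of making the client walk the chain to reach a city, each object answers the question itself by asking only its immediate collaborator.

```python
class Address:
    def __init__(self, city):
        self.city = city


class Customer:
    def __init__(self, address):
        self._address = address

    def city(self):
        """Answer directly so callers never need to touch Address."""
        return self._address.city


class Order:
    def __init__(self, customer):
        self._customer = customer

    def shipping_city(self):
        """Callers ask the order; the order asks its own collaborator."""
        return self._customer.city()
```

A client that needs the city now writes `order.shipping_city()` and stays unaware of how customers and addresses are arranged behind the order, so that arrangement can change without breaking the client.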


Abstractions are Important

I was helping someone troubleshoot an issue today, digging through code, and I came across a double-take-inducing design decision. In the GUI, there was a concept of feature, and each feature was being bound to something called FeatureGroup which was a collection of features that, at run-time, only ever contained one feature. So, as a markup-writing client interested in displaying a single feature, I have to bind to the first feature in a group of features that has a size greater than zero and less than or equal to one. This is as opposed to binding to, well, a feature. I’m sure there is some explanation for this, but I don’t want to know what it is. Seriously. I’m not interested.

The reason that I’m not interested is neither frustration nor purism of any kind. I’m not interested because it doesn’t matter what the explanation is. No matter what it is, the reaction by anyone who stumbles across it later is going to be the same.

Everyone who encounters this code is going to have the same reaction I did: “what the…?!? Why?!?” At this point, people may react in various ways. More industrious people would write a new presentation layer abstraction and phase this one out. Others might seek out the original designer and ask an explanation, listening skeptically and resigning themselves to reluctant clienthood. Still others might blindly mimic what’s going on in the surrounding area, programming by coincidence in the hopes of getting things right. But what nobody is going to do is say “yep, this makes sense, and it’s a good, solid base for building further understanding of the application.” And, since that’s the case — since this abstraction won’t make any sense even with some helpful prodding — I don’t want to hear about the design struggles, technology limitations, or whatever else led to this point. It’s only going to desensitize me to a bad abstraction and encourage me to further it later.

Your code is only as good as the abstractions that define it. This is true whether your consumers are end-users, UI designers, or other developers. It doesn’t matter if you’ve come up with the most magical, awesome, efficient or slick internal algorithm if you have a bad outward-facing set of abstractions because people’s reactions will range from avoidance to annoyance, but not appreciation. I’ve touched on this before, tangentially. On the flip side, clients will tend to appreciate an intuitive API, regardless of what it does or doesn’t do under the hood.

My point here isn’t to encourage marketing or salesmanship of one’s own code to other developers, per se, but rather to talk about what makes code “good”. If you are a one-person development team or a hobbyist, this is all moot anyway, and you’re free to get your abstractions wrong until the cows come home, but if you’re not, good abstractions are important. As a developer, ask yourself which you’d rather use (this is not real code; I just made it up):



I don’t think there’s any question as to which you’d rather use. The second one is a mess — I can hear what you’re thinking:

  1. “Connect to what?”
  2. “What in the world is ‘alt’?!?”
  3. “Why do some mutators return nothing and others bool?”
  4. “Why does Close have a boolean to tell it whether or not you want to close — of course you do, or you wouldn’t call Close!”
  5. “Why are there two deletes that require substantially different information — is one better somehow?”
  6. “What does that thing about files do?”
  7. “Why does add want only some fields?”

Notice the core of the objections has to do with abstractions. Respectively:

  1. There is Open() and Close(), but no bookend for Connect(), so it’s a complete mystery what this does and if you should use it.
  2. The second overload of alt adds a mysterious parameter that seems to indicate this overload is some kind of consolation prize method or something, meaning a possible temporal dependency.
  3. There appears to be some ad-hoc mixture of exception and error code error handling.
  4. Close wants a state flag — you need to keep track of this thing’s internal state for it (inappropriate intimacy).
  5. Does this interface want ad-hoc primitives or first class objects? It can’t seem to make up its mind what defines a Customer.
  6. The file stuff makes it seem like this class is a database access class retrofitted awkwardly for a corner case involving files, which is a completely different ballgame.
  7. The rest of the operations have at least one overload that deals with Customer, but Add doesn’t, indicating that Add is somehow different from the other CRUD operations.

Also, in a broader sense, consider the mixture of layering concepts. This interface sometimes forces you to deal with the database (or file) directly and sometimes lets you deal with business objects. And, in some database operations, it maintains its own state and in some it asks for your help. If you use this API, there is no clear separation of your responsibilities from its responsibilities. You become codependent collaborators in a terrible relationship.

Contrast this with the first interface. The first interface is just basic CRUD operations, dealing only with a business object. There is no concept of a database (or of any persistence) here. As a client of this, you know that you can request Customers and mutate them as you need. All other details (which primitives make up a customer, whether there is a file or a database, whether we’re connected to anything, whether anything is open, etc.) are hidden from you. In this API, the separation of responsibilities is extremely clear.

If confronted with both of these APIs, all things being equal, the choice is obvious. But, I submit that even if the clean API is an abstraction for buggy code and the second API for functional code, you’re still better off with the first one. Why? Simply because the stuff under the hood that’s hidden from you can (and with a clean API like this, probably will) be fixed. What can’t be fixed is the blurring of responsibilities between your code and the API’s code, and the confusion that causes at maintenance time. The clean API draws a line in the sand and says “business logic is your deal and persistence is mine.” The second API says, “let’s work closely together about everything from the details of database connections all the way up to business logic, and let’s be so close that nobody knows where I begin and you end.” That may be (creepily) romantic, but it’s not the basis of a healthy relationship.
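
To make the “can be fixed” part concrete, here is a minimal Java sketch. Everything in it is hypothetical: the trimmed-down CustomerRepository, the deliberately buggy implementation, the fixed one, and the ClientCode class are names I made up for illustration. The point is only that the client is written against the abstraction, so the bug can be repaired behind it without the client changing at all.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical business object.
class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical clean API: no persistence details leak through.
interface CustomerRepository {
    List<Customer> getAll();
    void add(Customer customer);
    void delete(Customer customer);
}

// Clean abstraction over buggy code: delete silently does nothing.
class BuggyRepository implements CustomerRepository {
    private final List<Customer> customers = new ArrayList<>();

    public List<Customer> getAll() { return new ArrayList<>(customers); }
    public void add(Customer customer) { customers.add(customer); }
    public void delete(Customer customer) { /* bug: never removes anything */ }
}

// The fix lives entirely behind the interface.
class FixedRepository implements CustomerRepository {
    private final Map<Integer, Customer> customersById = new LinkedHashMap<>();

    public List<Customer> getAll() { return new ArrayList<>(customersById.values()); }
    public void add(Customer customer) { customersById.put(customer.id, customer); }
    public void delete(Customer customer) { customersById.remove(customer.id); }
}

// Client code: knows about Customers and the interface, nothing more.
class ClientCode {
    static int countAfterCleanup(CustomerRepository repository) {
        Customer ada = new Customer(1, "Ada");
        repository.add(ada);
        repository.add(new Customer(2, "Grace"));
        repository.delete(ada);
        return repository.getAll().size();
    }
}

class RepositoryDemo {
    public static void main(String[] args) {
        System.out.println(ClientCode.countAfterCleanup(new BuggyRepository())); // prints 2 (the bug)
        System.out.println(ClientCode.countAfterCleanup(new FixedRepository())); // prints 1
    }
}
```

Swapping BuggyRepository for FixedRepository changes the result from 2 to 1, and ClientCode is not edited in the process. With the second style of API, the client would already be tangled up in connection and state handling, so there would be no comparably clean seam to fix behind.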

To wit, the developers using the second API are going to get it wrong because it’s hard to get it right. Fixing bugs in it will turn into whack-a-mole because developers will find weird quirks and work-arounds and start to depend on them. When responsibilities are blurred by mixed, weird, or flat-out-wrong abstractions, problems in the code proliferate like infectious viruses. In short, the clean abstraction API has a natural tendency to improve, and the bad abstraction API has a natural tendency to degenerate.

So please, I beg you, consider your abstractions. Apply a “golden rule” and force onto others only abstractions you’d want forced on yourself. Put a little polish on the parts of your code that others are going to be using. Everyone will benefit from it.