Stories about Software


Improving Readability using Moq

While I’ve been switching a lot of my new projects over to using JustMock, as explained here, I still have plenty of existing projects with Moq, and I still like Moq as a tool. One of the things that I have struggled with in Moq, and that JustMock provides a nice solution to, is the cognitive dissonance that arises when reasoning about your test doubles.

In some situations, such as when you’re injecting the test doubles into your class under test, you want them to appear as if they were just another run-of-the-mill instance of whatever interface/base you’re passing. But at other times you want to configure the mocks, and then you want to think of them as actual mock objects. One solution is to use Mock.Get(), as I described here. Doing that, you can store a reference to the object as if it were any other object but access it using Mock.Get() when you want to do things to its setup:

[gist id="4998372"]
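To make the pattern concrete (ICustomerService, Customer, and CustomerProcessor below are made-up stand-ins, and NUnit is used just for illustration), it looks something like this:

```csharp
using System.Collections.Generic;
using Moq;
using NUnit.Framework;

// Made-up dependency and class under test, just to illustrate the pattern.
public class Customer { }

public interface ICustomerService
{
    IList<Customer> GetAll();
}

public class CustomerProcessor
{
    private readonly ICustomerService _service;
    public CustomerProcessor(ICustomerService service) { _service = service; }
    public void Process() { _service.GetAll(); /* ... */ }
}

[TestFixture]
public class CustomerProcessorTests
{
    [Test]
    public void Process_Invokes_GetAll_On_Service()
    {
        // The dependency is stored as a plain ICustomerService...
        ICustomerService service = new Mock<ICustomerService>().Object;
        var processor = new CustomerProcessor(service);

        // ...but to configure or verify it, you reach back through Mock.Get()
        // and treat it as a mock again.
        Mock.Get(service).Setup(s => s.GetAll()).Returns(new List<Customer>());

        processor.Process();

        Mock.Get(service).Verify(s => s.GetAll(), Times.Once());
    }
}
```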

That’s all well and good, but, perfectionist that I am, I got really tired of all of that Mock.Get() ceremony. It makes lines pointlessly longer, and I think really detracts from the readability of the test classes. So I borrowed an idea from JustMock with its “Helpers” namespace and created a series of extension methods. Some examples are shown here.

[gist id="4998277"]
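A minimal sketch of extension methods along those lines: thin wrappers that just forward to Mock.Get(), mirroring Moq's own Setup and Verify overloads.

```csharp
using System;
using System.Linq.Expressions;
using Moq;
using Moq.Language.Flow;

public static class MockExtensions
{
    // Setup for a method that returns a value.
    public static ISetup<T, TResult> Setup<T, TResult>(
        this T target, Expression<Func<T, TResult>> expression) where T : class
    {
        return Mock.Get(target).Setup(expression);
    }

    // Setup for a void method.
    public static ISetup<T> Setup<T>(
        this T target, Expression<Action<T>> expression) where T : class
    {
        return Mock.Get(target).Setup(expression);
    }

    // Verify a call to a method that returns a value.
    public static void Verify<T, TResult>(
        this T target, Expression<Func<T, TResult>> expression, Times times) where T : class
    {
        Mock.Get(target).Verify(expression, times);
    }

    // Verify a call to a void method.
    public static void Verify<T>(
        this T target, Expression<Action<T>> expression, Times times) where T : class
    {
        Mock.Get(target).Verify(expression, times);
    }
}
```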

These extension methods allow me to alter my test to look like this:

[gist id="4998488"]
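With those in place (and the same made-up types as before), the same test reads more like this:

```csharp
using System.Collections.Generic;
using Moq;
using NUnit.Framework;

[TestFixture]
public class CustomerProcessorTests
{
    [Test]
    public void Process_Invokes_GetAll_On_Service()
    {
        // The dependency never has to be addressed as a Mock<T> at all.
        ICustomerService service = new Mock<ICustomerService>().Object;
        var processor = new CustomerProcessor(service);

        service.Setup(s => s.GetAll()).Returns(new List<Customer>());

        processor.Process();

        service.Verify(s => s.GetAll(), Times.Once());
    }
}
```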

Now there’s no need to switch context, so to speak, between mock object and simple dependency. You always have a simple dependency that just so happens to have some extension methods that you can use for configuration. No need to keep things around as Mock<T> and call the Object property, and no need for all of the Mock.Get().

Of course, there are caveats. This might already exist somewhere, and I’m out of the loop. There might be issues with this that I haven’t yet hit, and there is the debugging indirection. And finally, you could theoretically have namespace collisions, though if you’re making methods called “Setup” and “Verify” on your classes that take expression trees and multiple generic parameters, I’d say you have yourself a corner case there, buddy. But all that aside, hopefully you find this useful–or at least a nudge in a direction toward making your own tests a bit more readable or easier to work with.


The Way We Write Code is Stupid: Source Code Files Considered Harmful

Order Doesn’t Matter

Please pardon the loaded phrasing in the title, but that’s how the message came to me from my subconscious brain: bluntly and without ceremony. I was doing a bit of work in Apex, the object-oriented language specific to Salesforce.com, and it occurred to me that I had no idea what idiomatic Apex looked like. (I still don’t.) In C++, the convention (last time I was using it much, anyway) is to first define public members in class headers and then the private members at the bottom. In C#, this is inverted. I’ve seen arguments of all sorts as to which approach is better and why. Declaring them at the top makes sense since the first thing you want to see in the class is what its state will consist of, but declaring the public stuff at the top makes sense since that’s what consumers will interact with and it’s like the above-water part of your code iceberg.

When programming in any of the various programming languages I know, I have this mental cache of what’s preferred in what language. I attempt to ‘speak’ it without an accent. But with Apex, I have no idea what the natives ‘sound’ like, not having seen it in use before. Do I declare instance variables at the bottom or the top? Which is the right way to eat bread: butter side up or butter side down? I started googling to see what the ‘best practice’ was for Apex when the buzzing in my subconscious reached some kind of protesting critical mass and morphed into a loud, clear message: “this is completely stupid.”

I went home for the day at that point–it was late anyway–and wondered what had prompted this visceral objection. I mean, it obviously didn’t matter from a compiled code perspective whether instance variables or public methods come first, but it’s pretty well established and encouraged by people as accomplished and prominent as “Uncle” Bob Martin that consistency of source code layout matters, if not the layout specifics (paraphrased from my memory of his video series on Clean Coders). I get it. You don’t want members of your team writing code that looks completely different from class to class because that creates maintenance headaches and obscures understanding. So what was my problem?

I didn’t know until the next morning in the shower, where I seem to do my most abstract thinking. I didn’t think it was stupid to try to make my Apex code look like ‘standard’ Apex. I thought it was stupid that I needed to do so at all. I thought it was stupid to waste any time thinking about how to order code elements in this file when the only one whose opinion really matters–the compiler–says, “I don’t care.” Your compiler is trying to tell you something. Order doesn’t matter to it, and you shouldn’t care either.

Use Cases: What OOP Developers Want

But the scope of my sudden, towering indignation wasn’t limited to the fact that I shouldn’t have to care about the order of methods and fields. I also shouldn’t have to care about camel or Pascal casing. I shouldn’t have to care about underscores in front of field names or inside of method names. It shouldn’t matter to me if public methods come before private or how much indentation is the right amount of indentation. Should methods be alphabetized or should they be in some other order? I don’t care! I don’t care about any of this.

Let’s get a little more orderly about this. Here are some questions that I ask frequently when I’m writing source code in an OOP language:

  • What is the public API of this type?
  • What private methods are in the ‘tree’ of this public method?
  • What methods of this type mutate or reference this field?
  • What are the types in this namespace?
  • What are the implementations of this interface in this code base?
  • Let’s see this method and any methods that it overrides.
  • What calls this method?

Here are some questions that I never ask out of actual interest when writing source code.  These I either don’t ask at all or ask in exasperation:

  • What’s the next method in this file?
  • How many line feed characters come before the declaration of this variable?
  • Should I use tabs or spaces?
  • In what region is this field’s declaration?
  • Did the author of this file alphabetize anything in it?
  • Does this source file have Windows or *NIX line break characters?
  • Is this a field or a method or what?

With the first set of questions, I ask them because they’re pieces of information that I want while reasoning about code.  With the second set of questions, they’re things I don’t care about.  I view asking these questions as an annoyance or failure.  Do you notice a common pattern?  I certainly do.  All of the questions whose answers interest me are about code constructs and all the ones that I don’t care about have to do with the storage medium for the code: the file.

But there’s more to the equation here than this simple pattern.  Consider the first set of questions again and ask yourself how many of the conventions that we establish and follow are simply ham-fisted attempts to answer them at a glance because the file layout itself is incapable of doing so.  Organizing public and private separately is a work-around to answer the first question, for example.  Regions in C#, games with variable and method naming, “file” vs “type” view, etc. are all attempts to overcome the fact that files are actually really poor communication media for object-oriented concepts.  Even though compilers are an awful lot different now than they were forty years ago, we still cling to the storage medium for source code best suited to those old compilers.

Not Taking our own Advice

If you think of an ‘application’ written in MS Access, what comes to mind?  How about when you open up an ASP web application and find wizard-generated data sources in the markup, or when you open up a desktop application and find SQL queries right in your code behind?  I bet you think “amateurs wrote this.”  You are filled with contempt for the situation–didn’t anyone stop to think about what would happen if data later comes in some different form?  And what about some kind of validation?  And, what the–ugh… the users are just directly looking at the tables and changing the column order and default sorting every time they look at the data.  Is everyone here daft?  Don’t they realize how ridiculous it is to alter the structure of the actual data store every time someone wants a different ordering of the data?

And you should see some of the crazy work-arounds and process hacks they have in place. They actually have a scheme where the database records the name of everyone who opens up a table and makes any kind of change so that they can go ask that person why they did it.  And–get this–they actually have this big document that says what the order of columns in the table should be.  And–you can’t make this stuff up–they fight about it regularly and passionately.  Can you believe the developers that made this system and the suckers that use it? I mean, how backward are they?

In case you hadn’t followed along with my not-so-subtle parallel, I’m pointing out that we work this way ourselves even as we look with scorn upon developers who foist this sort of thing on users and users who tolerate it.  This is like when you finally see both women in the painting for the first time–it’s so clear that you’ll never un-see it again.  Why do we argue about where to put fields and methods and how to order things in code files when we refuse to write code that sends users directly into databases, compelling them to bicker over the order of column definitions in the same?  An RDBMS (or any persistence store) is not an appropriate abstraction for an end user–any end user–whether he understands the abstraction or not.  We don’t demand that users fight, decide that there is some ‘right’ way to order invoices to be printed, and then lock the Invoice table in place accordingly for all time, on pain of shaming for violations of an eighty-page invoice standard guideline document.  So why do that to ourselves?  When we’re creating object-oriented code, sequential files (and all of the particular orderings, traversals, and renderings thereof) are wildly inappropriate abstractions for us.

What’s the Alternative?

Frankly, I don’t know exactly what the alternative is yet, but I think it’s going to be a weird and fun ride trying to figure that out.  My initial, rudimentary thoughts on the matter are that we should use some sort of scheme in which the Code DOM is serialized to disk for storage purposes.  In other words, the domain model of code is that there is something called Project, and it has a collection of Namespace.  Namespace has a collection of Type, which may be Interface, Enum, Struct, Class (for C# anyway–for other OOP languages, it’s not hard to make this leap).  Class has one collection each of Field, Method, Property, Event.  The exact details aren’t overly important, but do you see the potential here?  We’re creating a hierarchical model of code that could be expressed in nested object or relational format.
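To make that slightly more concrete, here is a rough, made-up sketch of what such a model might look like in C#; every type name in it is invented for illustration, not a proposal for the real thing:

```csharp
using System.Collections.Generic;

public enum TypeKind { Class, Interface, Enum, Struct }

// A code base modeled as a hierarchy of code elements rather than files.
public class CodeProject
{
    public string Name { get; set; }
    public List<CodeNamespace> Namespaces { get; } = new List<CodeNamespace>();
}

public class CodeNamespace
{
    public string Name { get; set; }
    public List<CodeType> Types { get; } = new List<CodeType>();
}

public class CodeType
{
    public string Name { get; set; }
    public TypeKind Kind { get; set; }
    public List<CodeField> Fields { get; } = new List<CodeField>();
    public List<CodeMethod> Methods { get; } = new List<CodeMethod>();
    public List<CodeProperty> Properties { get; } = new List<CodeProperty>();
    public List<CodeEvent> Events { get; } = new List<CodeEvent>();
}

public class CodeField { public string Name { get; set; } }
public class CodeProperty { public string Name { get; set; } }
public class CodeEvent { public string Name { get; set; } }

public class CodeMethod
{
    public string Name { get; set; }
    public bool IsPublic { get; set; }
    public string BodySource { get; set; }
    public List<string> ReferencedFieldNames { get; } = new List<string>();
}
```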

In other words, we’re creating a domain model entirely independent of any persistence strategy.  Can it be stored in files?  Sure. Bob’s your uncle.  You can serialize these things however you want.  And it’ll need to be written to file in some form or another for the happiness of the compiler (at least at first).  But those files handed over to the compiler are output transforms rather than the lifeblood of development.

Think for a minute of us programmers as users of a system with a proper domain, one or more persistence models, and a service layer.  Really, stop and mull that over for a moment.  Now, go back to the use cases I mentioned earlier and think what this could mean.  Here are some properties of this system:

  1. The basic unit of interaction is going to be the method, and you can request methods with arbitrary properties, with any filtering and any ordering (see the sketch after this list).
  2. What appears on your screen will probably be one or more methods (though this would be extremely flexible).
  3. It’s unlikely that you’d ever be interested in “show me everything in this type.”  Why would you?  The only reason we do this now is that editing text files is what we’re accustomed to doing.
  4. Tracing execution paths through code would be much easier and more visual, and schemes that look like Java’s “code bubbles” would be trivial to create and manipulate.
  5. Most arguments over code standards simply disappear, as users can configure IDE preferences such as “prepend underscores to all field variables you show me,” “show me everything in camel casing,” and “always sort results in reverse alphabetical order.”
  6. Arbitrary methods from the same or different types could be grouped together in ad-hoc fashion on the screen for analysis or debugging purposes.
  7. Version/change control could occur at the method or even statement level, allowing expression of “let’s see all changes to this method” or “let’s see who made a change to this namespace” rather than “let’s see who changed this file.”
  8. Relying on IDE plugins to “hop” to places in the code automatically for things like “show all references” goes away in favor of an expressive querying syntax à la NDepend’s “code query language.”
  9. A new domain model allows refactoring concepts to be baked in and makes operations like “get rid of dead code” easier or, in some cases, trivial.
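To illustrate item 1 (and the general flavor of item 8), here is a rough sketch of queries against the made-up model from the earlier snippet. The point is only that the questions from the first list become simple queries instead of scrolling exercises:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CodeQueries
{
    // "What is the public API of this type?"
    public static IEnumerable<CodeMethod> PublicApi(CodeType type)
    {
        return type.Methods.Where(m => m.IsPublic);
    }

    // "What methods of this type mutate or reference this field?"
    public static IEnumerable<CodeMethod> MethodsTouching(CodeType type, string fieldName)
    {
        return type.Methods.Where(m => m.ReferencedFieldNames.Contains(fieldName));
    }

    // "What are the types in this namespace?" -- ordered however the reader prefers.
    public static IEnumerable<CodeType> TypesIn(CodeNamespace ns)
    {
        return ns.Types.OrderBy(t => t.Name);
    }
}
```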

Longer Reaching Impact

If things were to go in this direction, I believe that it would have a profound impact not just on the development process but also on the character and quality of object-oriented code that is written in general.  The inherently sequential nature of files and the way that people reason about file parsing, I believe, lend themselves to (or at least favor) the dogged persistence of procedural approaches to object-oriented programming (static methods, global state, casting, etc.).  I think that the following trends would take shape:

  1. Smaller methods.  If popping up methods one at a time or in small groups becomes the norm, having to scroll to see and understand a method will become an anomaly, and people will optimize to avoid it.
  2. Less complexity in classes.  With code operations subject to a validation of sorts, it’d be fairly easy to incorporate a setting that warns users if they’re adding the tenth or twentieth or whatever method to a class.  In extreme cases, it could even be disallowed (and not through the honor system or ex post facto at review or check in–you couldn’t do it in the first place).
  3. Better conformance to the Single Responsibility Principle (SRP).  Eliminating the natural barrier of “I don’t want to add a new file to source control” makes people less likely to awkwardly wedge methods into classes in which they do not belong.
  4. Better cohesion.  It becomes easy to look for fields hardly used in a type or clusters of use within a type that could be separated easily into multiple types.
  5. Better test coverage.  Not only is this a natural consequence of the other items in this list, but it would also be possible to define “meta-data” to allow linking of code items and tests.

What’s Next?

Well, the first things that I need to establish are that this doesn’t already exist somewhere in the works and that I’m not a complete lunatic malcontent.  I’d like to get some feedback on this idea in general.  The people to whom I’ve explained a bit so far seem to find the concept a bit far-fetched but somewhat intriguing.

I’d say the next step, assuming that this passes the sanity check, would perhaps be to draw up a white paper discussing some implementation/design strategies with pros and cons in a bit more detail.  There are certainly threats to validity to be worked out, such as the specifics of interaction with the compiler, the necessarily radical change to source control approaches, the performance overhead of performing code transforms instead of just reading a file directly into memory, etc.  But off the top of my head, I view these things more as fascinating challenges than problems.

In parallel, I’d like to invite anyone who is at all interested in this idea to drop me an email or send me a tweet.  If there are others that feel the way I do, I think it’d be really cool to get something up on Github and maybe start brainstorming some initial work tasks or exploratory POCs for feasibility studies.  Also feel free to plus-like-tweet-whatever to others if you think they might be interested.

In conclusion, I’ll just say that I feel like I’d really like to see this gain traction and that I’d probably ratchet this right to the top of my side projects list if people are interested (this being a bit large in scope for me alone in my spare time).  Now whenever I find myself editing source files in an IDE, I feel like a bit of a barbarian, and I really don’t think we should have to tolerate this state of affairs anymore.  Productivity tools designed to hide the file nature of our source code from us help, but they’re band-aids when we need disinfectants, antibiotics, and stitches.  I don’t know about you, but I’m ready to start writing my object-oriented code using an IDE paradigm that doesn’t support GOTO Line as if we were banging out QBasic in 1986.


Nuget: Stop Copying and Pasting your Utils.cs Class Everywhere

I was setting up to give a presentation the other day when it occurred to me that Nuget was the perfect tool for my needs. For a bit of background, I consider it of the utmost importance to tell a story while you present. Things I’m not fond of in presentations include lots of slides without any demonstration and snapshots of code bases in various stages of done. I prefer a presentation where you start with nothing (or with whatever your audience has) and you get to some endpoint, by hook or by crook. And Nuget is an excellent candidate for whichever of “hook” and “crook” means “cheating.” You can get to where you’re going by altering your solution on the fly as you go, but without all of the umming and hawing of “okay, now I’ll right click and add an assembly reference and, oh, darnit, what was that fully qualified path again, I think–oops, no not that.”

In general, I’ve decided to set up an internal Nuget feed for some easier à la carte development, so this all dovetails pretty nicely. It made me stop and think that, given how versatile and helpful a tool this is, I should probably document it for anyone just getting started. I’m going to take you through creating the simplest imaginable Nuget package but with the caveat that this adds a file to your solution (as opposed to other tutorials you’ll see that often involve marshaling massive amounts of assembly references or something). So, here are the steps (on Windows 7 with VS2012):

  1. Go to codeplex and download the package explorer tool.
  2. Install the package explorer, launch it, and choose “Create a New Package”:


  3. In the top left, click the “Edit Metadata” icon as shown here:


  4. Fill out the metadata, ensuring that you fill things in for the bold (required) fields:


  5. Click the green check mark to exit the metadata screen and then right-click in the “Package Contents” pane. Select “Add Content Folder.” The “Content” folder holds a structure of files that mirrors what Nuget will put into your solution.


  6. Now, right click on the “Content” folder and select “Add New File.” Name the file “Readme.txt” and add a line of text to it saying “Hello.” Click the save icon and then the “back” arrow next to it.
  7. Now go to the file menu at the top and select “Save,” which will prompt you for a file location. Choose the desktop and keep the default naming of the file with the “nupkg” extension. Close the package explorer.
  8. Create a new solution in Visual Studio into which we’re going to import the Nuget package.
  9. Go to Tools->Library Package Manager->Manage Packages for Solution to launch the Nuget GUI and click “Settings” in the bottom right because we’re going to add the desktop as a Nuget feed:


  10. In this screen, click the plus icon (pointed to by the arrow) and then give the package source a name and enter the file path for the desktop by way of configuration:


  11. Click “Ok” and you should see your new Nuget feed listed underneath the official package source. If you click it, you should see your test package there:


  12. Click “Install” and check out what happens:


Observe that source (or text or whatever) files are installed to your solution as if they were MSI installs. You can install them, uninstall them, and update them all using a GUI. Source code is thus turned into a deliverable that you can consume without manual propagation of the file or some kind of project or library include of some “CommonUtils” assembly somewhere. This is a very elegant solution.
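For the curious, the package that Package Explorer builds through its GUI boils down to a small nuspec file plus the content folder. A hand-rolled equivalent (with made-up id and author values) would look roughly like this, and running "nuget pack" against it produces the same sort of .nupkg:

```xml
<?xml version="1.0"?>
<package>
  <metadata>
    <!-- These correspond to the bold, required fields on the metadata screen. -->
    <id>TestPackage</id>
    <version>1.0.0</version>
    <authors>YourNameHere</authors>
    <description>A test package that just delivers Readme.txt.</description>
  </metadata>
  <files>
    <!-- Anything targeted at content\ gets copied into the consuming project. -->
    <file src="content\Readme.txt" target="content\Readme.txt" />
  </files>
</package>
```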

It shouldn’t take a ton of extrapolation to see how full of win this is in general, especially if you create a lot of new projects, such as at a consulting shop or in some kind of generalized support department. Rather than a tribal-knowledge mess, you can create an internal Nuget feed and set up a sort of code bazaar where people publish things they think are useful; tag them with helpful descriptors; and document them, allowing coworkers to consume the packages. People are responsible for maintaining and helping with packages that they’ve written. Have two people do the same thing? No worries–let the free market sort it out in terms of which version is more popular. While that may initially seem like a waste, the group is leveraging competition to improve design (which I consider to be an intriguing subject, unlike the commonly embraced anti-pattern, “design by committee,” in which people leverage cooperation to worsen design).

But even absent any broader concerns, why not create a little Nuget feed for yourself, stored in your Dropbox or on your laptop or whatever? At the very least, it’s a handy way to make note of potentially useful things you’ve written, polish them a little, and save them for later. Then, when you need them, they’re a click away instead of a much more imposing “oh, gosh, what was that one thing I did that one time where I had a class kinda-sorta like this one…” away.


API Lost Opportunity: A Case Study

I’m aware that recently there’s been some brouhaha over criticism/trashing of open-source code and the Software Craftsmanship movement, and I’d like to continue my longstanding tradition of learning what I can from things without involving myself in any drama. Toward that end, I’d like to critique without criticizing, or, to put it another way, I’d like to couch some issues I have with an API as things that could be better rather than going on some kind of rant about what I don’t like. I’m hoping that I can turn the mild frustration I’m having as I use an API into being a better API writer myself. Perhaps you can pick up a few tidbits from this as well.

Recently I’ve been writing some code that consumes the Microsoft CRM API using its SDK. To start off with the good, I’d say that the API is fairly powerful and flexible–it gives you access to the guts of what’s in the CRM system without making you write SQL or do any other specific persistence modeling. Another impressive facet is that it can handle the addition of arbitrary schema to the database without any recompiling of client code. This is done, in essence, by using generalized table-like objects in memory to model actual tables. Properties are handled as a collection of key-value pairs keyed by string, effectively implementing a dynamic-typing-like scheme. A final piece of niceness is that the API revolves around a service interface, which makes it a lot friendlier than many APIs when it comes to mocking for tests.

(If you so choose, you can use a code generator to hit your CRM install and spit out strongly-typed objects, but this is an extra feature about which it’s not trivial to find examples or information. It also spits out a file that’s a whopping 100,000+ lines, so caveat emptor.)

But that flexibility comes with a price, and that price is a rather cumbersome API. Here are a couple of places where I believe the API fails to some degree or another and where I think improvement would be possible, even as the flexible scheme and other good points are preserved.

Upside-Down Polymorphism

The API that I’m working with resides around a series of calls with the following basic paradigm:
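In rough outline (with pseudonyms standing in for the actual SDK type names), the shape is this:

```csharp
// Pseudonyms, not the SDK's real type names.
public abstract class VeryAbstractBaseRequest { }
public abstract class VeryAbstractBaseResponse { }

public interface ICrmService
{
    // One very generic request goes in; one very generic response comes out.
    VeryAbstractBaseResponse Execute(VeryAbstractBaseRequest request);
}
```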

There is a service that accepts very generic requests as arguments and returns very generic responses as return values. The first component of this is nice (it does make argument capture during mocking suck, but that’s hardly a serious gripe). You can execute any number of queries and presumably even create your own, provided they supply what the base wants and provided there isn’t some giant switch statement on the other side downcasting these queries (and I think there might be, based on the design).

The second component of the paradigm is, frankly, just awful. The “VeryAbstractBaseResponse” provides absolutely no useful functionality, so you have to downcast it to something actually useful in order to evaluate the response. This is not big-boy polymorphism, and it’s not even adequate basic design. In order to get polymorphism right, the base type would need to be loaded up with methods so that you never need to downcast and can still get all of the child functionality out of it. In order not to foist a bad design on clients, it would have to dispense with the need to pair your response type with the request type. Here’s what I mean:
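Concretely, with the retrieve-multiple flavor of the call (type names recalled from the SDK, and the query construction elided), client code ends up looking something like this:

```csharp
// "service" is the CRM organization service and "query" is some query object;
// both are assumed to exist already.
var request = new RetrieveMultipleRequest { Query = query };

// Execute() hands back only the abstract base...
OrganizationResponse baseResponse = service.Execute(request);

// ...so you must already know which concrete response pairs with your request
// type and downcast to it before you can do anything useful.
var response = (RetrieveMultipleResponse)baseResponse;
```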

Think about what this means. In order to get a meaningful response, I need to keep track of the type of request that I created. Then I have to have a knowledge of what responses match what requests on their side, which is perhaps available in some horrible document somewhere or perhaps is only knowable through trial/error/guess. This system is a missed opportunity for polymorphism even as it pretends otherwise. Polymorphism means that I can ask for something and not care about exactly what kind of something I get back. This scheme perverts that–I ask for something, and not only must I care what kind of something it is, but I also have to guess wildly or fail with runtime exceptions.

The lesson?

Do consumers of your API a favor. Be as general as possible in what you accept as arguments and as specific as possible in what you return. Think of it this way to help remember: if you have a method that takes type “object” and can actually do something meaningful with it, that method is going to be incredibly useful to anyone and everyone. If you have a method that returns type object, it’s going to be pretty much useless to anyone who doesn’t know how the method works internally. Why? Because they’re going to have to have an inappropriate knowledge of your inner workings in order to know what this thing is they’re getting back, cast it appropriately, and use it.

Hierarchical Data Transfer Objects

With this interface, the return value polymorphism issues might be livable if not for the fact that these request/response objects are also reminiscent of the worst in OOP, reminiscent of late ’90s, nesting-happy Java. What does a response consist of? Well, a response has a collection of entities and a collection of results, so it’s not immediately clear which path to travel, but if you guess right and query for entities, you’re just getting started. You see, entities isn’t an enumeration–it’s something called “EntityCollection,” which brings just about nothing to the table except to provide the name of the entities. Ha ha–you probably wanted the actual entities, you fool. Well, you have to call response.EntityCollection.Entities to get that. And that is an enumeration, but it’s buried under some custom type called DataCollection whose purpose isn’t immediately clear. It does implement IEnumerable, though, so now you can get at your entity with Linq. So, this gets you an entity:
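Again with names recalled from the SDK (and assuming a using directive for System.Linq), the retrieval looks something like this:

```csharp
// Dig the first entity out of the collection-wrapped collection.
Entity entity = response.EntityCollection.Entities.First();
```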

But you’re quite naive if you thought that would get you anything useful. If you want to know about a property on the entity, you need to look in its Attributes collection which, in a bit of deja vu, is an AttributeCollection. Luckily, this one doesn’t make you pick through it, because it inherits from this DataCollection thing, meaning that you can actually get a value by doing this:
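Something like this, where "firstname" is just a made-up stand-in for whatever attribute you actually want:

```csharp
// The attribute indexer hands back a plain object.
object rawValue = entity.Attributes["firstname"];
```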

D’oh! It’s an object. So after getting your value from the end of that train wreck, you still have to cast it.

The lesson?

Don’t force Law of Demeter violations on your clients. Provide us with a way of getting what we want without picking our way through some throwback, soul-crushing object hierarchy. Would you mail someone a package containing a series of Russian nesting dolls that, when you got to the middle, had only a key to a locker somewhere that the person would have to call and ask you about? The answer is, “only if you hated that person.” So please, don’t do this to consumers of your API. We’re human beings!

Unit Tests and Writing APIs

Not to bang the drum incessantly, but I can’t help but speculate that the authors of this API do not write tests, much less practice TDD. A lot of the pain that clients will endure is brought out immediately when you start trying to test against this API. The tests are quickly littered with casts, “as” operators, and setup ceremony for building objects with a nesting structure four or five references deep. If I’m writing code, that sort of pain in tests is a signal to me that my design is starting to suck and that I should think of my clients.

I’ll close by saying that it’s all well and good to remember the lessons of this post and probably dozens of others besides, but there’s no better way to evaluate your API than to become a client, which is best done with unit tests. On a micro-level, you’re dog-fooding when you do that. It then becomes easy to figure out if you’re inflicting pain on others, since you’ll be the first to feel it.


Practical Programmer Math: Bit Arithmetic

The “Why This Should Matter to You” Story

Imagine that you’re on a phone interview and things are going pretty well. You’ve fielded some gotcha trivia about whether such and such code would compile and what default parameters are and whatnot. After explaining the ins and outs of the strategy pattern and recursive methods, you’re starting to feel pretty good about yourself when, suddenly, the record skips. “Uh, excuse me?” you stammer.

“Why do integers roll over to negative when you loop all the way up to and past their max values?”

You had always just assumed that this was like an old Nintendo game where, naturally, when you completely leave the right side of the screen you simply appear on the left side. “Uh… they just do…? Pass?”

“No problem–let’s move on. Why do unsigned and signed integers occupy the same number of bits even though signed integers have more information?”

“Oh, they have more information? I know that unsigned integers can’t be negative!”

“Yes, well, okay. Let’s try this: how would you multiply a number by two without using the * operator?”

“Well,” you begin, “I’d–”

“And without adding it to itself.”

“Oh. Huh. I dunno.”

The interviewer sighs a little and says, “okay, so what can you tell me about the way numbers are represented as bits and bytes?”

Remembering the last practical math post, you talk about how you know that bits are 1 or 0 and that 8 of these bits makes up a byte. You talk about how bits can be and-ed and or-ed like booleans, and you offer hopefully that you even know an interesting bit of trivia in that 1K is not actually 1,000 bytes. The interviewer nods absently, and looking back later, you have no trouble pinpointing where things went off the rails. In spite of knowing a bit about how information is represented at the lowest level, you realize you’re lacking some additional basics as to how this bit and byte stuff works.

You wrap up the interview and when you ask what the next steps are, the interviewer says, “don’t call us–we’ll call you.”

Math Background

First of all, you may want to go back and brush up on some previous posts in this series.

Let’s start with the simplest thing first: bit shifting. To help you understand bit shifting, I’m going to introduce the concept of “bit significance.” Bit significance is a way of categorizing the “bits” in a binary number from most significant (the “most significant bit” or MSB) to least significant (the “least significant bit” or “LSB”). This is really just a fancy way of describing the bits that correspond to the largest power of two from greatest to smallest. For example, the number six is written in binary as 110, and we would say that the first 1 is the most significant, the middle 1 the next most, and the 0 the least, since they correspond to the 4, 2, and 1 values, respectively.

This may seem like a definition so basic as to be pointless, but it’s important to remember that not all architectures necessarily interpret bits from left to right. After all, reading “110” as 6 is rather arbitrary–we could just as easily read it as 3 if we read it the other way. This is reminiscent of how some Middle Eastern written languages are interpreted from right to left. So just to be clear, for the remainder of this post, the MSB will be the leftmost bit and the LSB will be the rightmost.

Bit shifting, often denoted in programming languages by the “>>” and “<<” operators, is the simple procedure of sliding the bits of a binary sequence toward one end or the other, padding with a zero on one side and dropping the bit that falls off the other. For example, consider the following code:
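In C#, that code would be something like the following, starting from 3 (binary 0011):

```csharp
int x = 3;        // binary 0011
int y = x << 1;   // shift every bit one place to the left
```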

What you would wind up with for y is 0110, or 6. Notice how there’s now an extra 0 on the right, and this results in the value being multiplied by two. If we had bit-shifted by 2 instead of 1, we’d have 1100, or 12, which is 3 x 4. If you bit-shift the other way, you get back to where you started:
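In code, something like:

```csharp
int x = 6;        // binary 0110
int y = x >> 1;   // shift every bit one place back to the right
```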

Here, y winds up being 0011, or 3 again. Performing a subsequent shift would result in 0001, or 1. When you divide unevenly by two, the algorithm ‘rounds’ down. Another and perhaps easier way to understand this shifting concept is to realize that it’s actually pretty familiar from the decimal numbers that you’re accustomed to. For example, if you “decimal shifted” the number 14 left, you’d have 140, and if you did it to the right, you’d wind up with 1 (rounding down from 1.4). Decimal shifting, as anyone who has “misplaced a zero” knows, is really just a question of multiplying and dividing by powers of 10. Bit shifting is the same thing, but with powers of two.

Now that you understand a bit about bit significance and the concept of shifting, let’s tackle the problem of wrap-around and integer overflow. In dealing with binary numbers so far, we’ve dealt exclusively with positive integers and zero, meaning that we haven’t dealt with negative numbers or non-integers like decimals. The latter will be fodder for a future post on floating point arithmetic, but let’s consider how to represent a negative number in binary.

If it were simply a matter of writing it out, that’d be easy. We could simply put a minus sign in front of the binary digits and write -11 for -3, the way we do on paper. But when dealing with actual bits on a computer, we don’t have that luxury. We have to signify negativity somehow using the same ones and zeroes that we use for representing the number. Given that negative/positive is an either/or proposition, the common way to do this is with a bit–specifically, the MSB. This lets you represent negativity, but it necessarily cuts in half the count of numbers that you can represent.

Think of a byte, which is 8 bits. Using all 8 bits, you can represent 2^8 = 256 numbers (0 through 255). But using 7, you can only represent 2^7 = 128 (0 through 127). Think of the language construct “unsigned” in most programming languages (with “signed” being the default). When you’re making this distinction, you’re telling the compiler whether or not you want to sacrifice one of the bits in your number type to use as the +/- sign.

Great, but how do we get to the rollover issue? Well, the reason for that is a little more complicated, and it involves a concept called “two’s complement.” Let’s say that we represented binary numbers by taking the MSB and making it 0 for positive and 1 for negative. That’s probably what you were picturing in the last paragraph, and it’s called “sign magnitude” representation. It’s workable, but it has the awkward side effect of giving us two ways to represent 0. For instance, consider a signed 3-bit scheme where the first bit is the sign and the next two the number. 101 and 001 would represent negative and positive 1, respectively, but what about 100 and 000? Negative and positive zero? It also requires different mechanics for arithmetic operations depending on whether the number is positive or negative, which is computationally inefficient. And finally, the fact that there are two versions of zero means that a potential number that could be represented is wasted.

Instead, processors use a scheme called “two’s complement.” In this scheme, the MSB still does wind up representing the sign, but only by side effect. The positive numbers are represented the way they would be in the binary numbers you’re used to seeing. So, in a 4-bit number, the largest normal number would be 7: 0111. But when you add one more and go to what would be 8 in an unsigned scheme, 1000, you’re suddenly representing -8. If you keep counting, you get pairs of (1001 or -7), (1010 or -6) … (1111 or -1). So you walk the scheme up to the most positive number, which you arrive at halfway through, and then you suddenly hit the most negative number and start working your way back to zero. Adding 1 to 0111, which corresponds to Integer.Max for a 4-bit integer, would result in 1000, which corresponds to Integer.Min.
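You can watch the same wrap-around happen with a full-sized int in C#; a quick sketch:

```csharp
using System;

class WrapAroundDemo
{
    static void Main()
    {
        // unchecked makes the intent explicit (and is also the default for
        // non-constant integer arithmetic in C#).
        unchecked
        {
            int max = int.MaxValue;   // binary 0111...1
            int wrapped = max + 1;    // binary 1000...0

            Console.WriteLine(wrapped == int.MinValue);   // True
        }
    }
}
```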

The negative and positive numbers that you’re used to operate as a line stretching out to infinity in either direction, with the two concepts colliding at zero. But in a two’s complement scheme, they act as a wheel, and you “roll over.” The reason that two’s complement behaves this way is a bit more involved, and I may cover it in a smaller future post, but suffice it to say that it makes bit arithmetic operations easier to reason about algorithmically and more elegant. It also gets rid of that pesky double-zero problem.

How it Helps You

In the short term, all of this helps you answer the phone interview questions, but I don’t think of that as too terribly important, since I consider trivia questions during the interview process to be a hiring anti-pattern, often indicative of a department rife with petty one-upmanship and knowledge hoarding. You’re better off without that job, and, now that you know the answers to the questions, you have the best of all worlds.

On a longer timeline, you’re learning the fundamentals of how data is represented and manipulated at its most basic level in computer science, and that opens some nice doors in general. Superficially, knowing to bit-shift instead of multiply or divide may allow you to optimize in some environments (though I believe most modern compilers handle this optimization pretty well on their own). But at a deeper level, you’re gaining an understanding of the internal workings of the data types that you’re using.

When data is sent “over the wire” and you’re examining it with some kind of dump tool, it’s coming in streams of bits. Sure, there’s an endpoint somewhere re-assembling it into things like unsigned ints and floats and even higher-level abstractions. But if those are the lowest levels at which you understand data, it’s going to make programs hard to debug when you’re using a tool that shows you raw data. It may be tedious to interpret things at that level, but in a tight spot it can be a life saver.

Further Reading

  1. A pretty nice guide to bit shifting
  2. Detailed explanation of Two’s Complement
  3. A nice video on Two’s Complement
  4. A comparison of different schemes for representing negative numbers in bits