Stories about Software


Practical Programmer Math: Bit Arithmetic

The “Why This Should Matter to You” Story

Imagine that you’re on a phone interview and things are going pretty well. You’ve fielded some gotcha trivia about whether such and such code would compile and what default parameters are and whatnot. After explaining the ins and outs of the strategy pattern and recursive methods, you’re starting to feel pretty good about yourself when, suddenly, the record skips. “Uh, excuse me?” you stammer.

“Why do integers roll over to negative when you loop all the way up to and past their max values?”

You had always just assumed that this was like an old Nintendo game where, naturally, when you completely leave the right side of the screen you simply appear on the left side. “Uh… they just do…? Pass?”

“No problem–let’s move on. Why do unsigned and signed integers occupy the same number of bits even though signed integers have more information?”

“Oh, they have more information? I know that unsigned integers can’t be negative!”

“Yes, well, okay. Let’s try this: how would you multiply a number by two without using the * operator?”

“Well,” you begin, “I’d–”

“And without adding it to itself.”

“Oh. Huh. I dunno.”

The interviewer sighs a little and says, “okay, so what can you tell me about the way numbers are represented as bits and bytes?”

Remembering the last practical math post, you talk about how you know that bits are 1 or 0 and that 8 of these bits makes up a byte. You talk about how bits can be and-ed and or-ed like booleans, and you offer hopefully that you even know an interesting bit of trivia in that 1K is not actually 1,000 bytes. The interviewer nods absently, and looking back later, you have no trouble pinpointing where things went off the rails. In spite of you knowing a bit about how information is represented at the lowest level, you realize you’re lacking some additional basics as to how this bit and bite stuff works.

You wrap up the interview and when you ask what the next steps are, the interviewer says, “don’t call us–we’ll call you.”

Math Background

First of all, you may want to go back and brush up on some previous posts in this series.

Let’s start with the simplest thing first: bit shifting. To help you understand bit-shifting, I’m going to introduce the concept of “bit significance.” Bit significance is a way of categorizing the “bits” in a binary number from most significant (the “most significant bit” or MSB) to least significant (the “least significant bit” or “LSB”). This is really just a fancy of way of describing the bits that correspond to the largest power of two from greatest to smallest. For example, the number six is written as 0x110, and we would say that the first 1 is the most significant; the middle one the next most; and 0 the least since they correspond to the 4, 2, and 1 values, respectively.

This may seem like a definition so basic as to be pointless, but it’s important to remember that not all architectures necessarily interpret bits from left to right. After all, reading “110” as 6 is rather arbitrary–we could just as easily read it as 3 if we read it the other way. This is reminiscent of how some Middle Eastern written languages are interpreted from right to left. So just to be clear, for the remainder of this post, the MSB will be the leftmost bit and the LSB will be the rightmost.

Bit shifting, often denoted in programming languages by the “>>” and “<<" operators, is the simple procedure of adding a zero to one end or the other of a binary sequence. For example, consider the following code:

What you would wind up with for y is 0x0110 or 6. Notice how there's now an extra 0 on the right and this results in the value being multiplied by two. If we had bit-shifted by 2 instead of 1, we'd have 0x1100 or 12, which is 3 x 4. If you bit-shift the other way, you get back to where you started:

Here, y winds up being 0x0011 or 3 again. Performing a subsequent shift would result in 0x0001 or 1. When you divide unevenly by two, the algorithm 'rounds' down. Another and perhaps easier way to understand this shifting concept is to realize it's actually pretty familiar with the decimal numbers that you're accustomed to. For example, if you "decimal shifted" the number 14 left, you'd have 140 and if you did it to the right, you'd wind up with 1 (rounding down from 1.4). Decimal shifting, as anyone who has "misplaced a zero" knows, is really just a question of multiplying and dividing by powers of 10. Bit-shifting is the same thing, but with powers of two. Now that you understand a bit about bit significance and the concept of shifting, let's tackle the problem of wrap-around and integer overflow. In dealing with binary numbers so far, we've dealt exclusively with positive integers and zero, meaning that we haven't dealt with negative numbers or non-integers like decimals. The latter will be fodder for a future post on floating point arithmetic, but let's consider how to represent a negative number in binary. PhoneInterviewIf it were simply a matter of writing it out, that’d be easy. We could simply write -0x11 for -3, the way we do on paper. But when dealing with actual bits on a computer, we don’t have that luxury. We have to signify negativity somehow using the same ones and zeroes that we use for representing the number. Given that negative/positive is an either/or proposition, the common way to do this is with a bit–specifically, the MSB. This lets you represent negativity, but it necessarily cuts in half the count of numbers that you can represent.

Think of a byte, which is 8 bits. Using all 8 bits, you can represent 2^8 = 256 numbers (0 through 255). But using 7, you can only represent 2^7 = 128 (0 through 127). Think of the language construct “unsigned” in most programming languages (with “signed” being the default). When you’re making this distinction, you’re telling the compiler whether or not you want to sacrifice one of the bits in your number type to use as the +/- sign.

Great, but how do we get to the rollover issue? Well, the reason for that is a little more complicated and it involves a concept called “two’s complement.” Let’s say that we represented binary numbers by taking the MSB and making it 0 for positive and 1 for negative. That’s probably what you were picturing in the last paragraph, and it’s called “sign magnitude” representation. It’s workable, but it has the awkward side effect of giving us two ways to represent 0. For instance, consider a signed 3 bit scheme where the first bit was the sign and the next two the number. 101 and 001 would represent negative and positive 1, respectively, but what about 0x100 and 0x000. Positive and negative zero? It also requires different mechanics for arithmetic operations depending on whether the number is positive or negative, which is computationally inefficient. And finally, the fact that there are two versions of zero means that a potential number that could be represented is wasted.

Instead, processors use a scheme called “two’s complement.” In this scheme, the MSB still does wind up representing the sign, but only by side effect. The positive numbers are represented the way they would be in the binary numbers you’re used to seeing. So, in a 4 bit number, the largest normal number would be 7: 0x0111. But when you flip that last bit and go to what would be 8 as a signed number, 0x1000, you’re suddenly representing -8. If you keep counting, you get pairs of (0x1001 or -7), (0x1010 or -6) … (0x1111 or -1). So you walk the scheme up to the most positive number, which you arrive at halfway through, and then you suddenly hit the most negative number and start working your way back to zero. Adding 1 to 0x0111, which corresponds to Integer.Max for a 4 bit integer, would result in 0x1000, which corresponds to Integer.Min.

Negative and positive that you’re used to operate as a line stretching out either way to infinity with the two concepts colliding at zero. But in a two’s complement scheme, they act as a wheel, and you “roll over.” The reason that two’s complement behaves this way is a bit more involved, and I may cover it in a smaller future post, but suffice it to say that it makes bit arithmetic operations easier to reason about algorithmically and more elegant. It also gets rid of that pesky double zero problem.

How it Helps You

In the short term all of this helps you answer the phone interview questions, but I don’t think of that as too terribly important since I consider trivia questions during the interview process to be a hiring anti-pattern, often indicative of a department rife with petty oneupsmanship and knowledge hoarding. You’re better off without that job, and, now that you know the answers to the questions, you have the best of all worlds.

On a longer timeline, you’re learning the fundamentals of how data is represented and works in its most core form in computer science, and that opens some nice doors, in general. Superficially, knowing to bit-shift instead of multiply or divide may allow you to optimize in some environments (though I believe that with most modern compilers they can do pretty well optimizing this on their own). But at a deeper level, you’re gaining an understanding of the internal workings of the data types that you’re using.

When data is sent “over the wire” and you’re examining it with some kind of dump tool, it’s coming in streams of bits. Sure there’s an endpoint somewhere re-assembling it into things like unsigned ints and floats and even higher level abstractions. But if those are the lowest levels at which you understand data, it’s going to make programs hard to debug when you’re using a tool that shows you raw data. It may be tedious to interpret things at that level, but in a tight spot it can be a life saver.

Further Reading

  1. A pretty nice guide to bit shifting
  2. Detailed explanation of Two’s Complement
  3. A nice video on Two’s Complement
  4. A comparison of different schemes for representing negative in bits


Practical Programmer Math: From Booleans to Bits to Bytes

The “Why This Should Matter to You” Story

A few scenarios today:

  1. You don’t understand why the answer to “how many bytes are in a kilobyte” isn’t 1000 (at least, not exactly).
  2. You see if((x & mask) > 7) and wonder why it compiles despite the author forgetting an ‘&’
  3. You feel a little ashamed because in spite of the fact that your non-programmer friends say that you “think in ones and zeroes”, you don’t actually understand how your code becomes ones and zeroes.

Everything here is symptomatic of a lack of understanding of the way data is represented at the most fine grained level in computer science and in programming.
It’s possible to understand some programming concepts without understanding the mechanics of how they work internally. After all, you can understand that a list.sort() method without understanding the sort algorithm being used.

So the general story is that your understanding of the data and data structures that go into programming breaks down at some level. You may understand that a string is a series of characters, but perhaps not that characters are actually integers or that integers are actually bytes or that bytes are actually series of bits or that bits are simply Booleans. And that breakdown in understanding somewhere in the chain leads to situations where you don’t fully understand what your peers are talking about.

Math Background

In the last two posts in this series, I’ve gone through some basic but important points about Boolean algebra, which one might describe as the basic mathematics of truth values. In this system there are two fundamental values to which constants, variables and expressions might evaluate: true and false. Boolean algebra is, in a very real sense, the absolute fundamental building block for everything that we as programmers do, but it may not be entirely obvious as to how.

To get there, let’s leave behind the syntax of Boolean algebra and pick up the syntax of machine language. So forget about “true” and “false” and consider instead 1 and 0, respectively. 1 is true, 0 is false. And further let’s get rid of the conjunction, disjunction and negation operators and consider, in their stead, the more familiar & | ~ operators. (You’ll notice that these are single operators and not the double operators with which you are likely familiar and this is not a typo — there is a method to my madness.) All of the rules from the previous posts still apply so, for instance, the identity law of A | 0 = A applies.

So far we’ve just swapped syntactical operators and constants. But let’s now consider an interesting property of basic arithmetic (not Boolean arithmetic) which is that written numbers are really a sequence of answers to a pre-arranged sequence of questions.


Okay, let me clarify with an example. Let’s pick a three digit number like 481. There’s nothing particularly special about this number, but let’s think of it in terms of a Q&A session. It goes like this “how many hundreds are in this number?” followed by the answer “four.” “How many tens are in this number?” “Eight.” “And how many ones?” “One.” This is really all there is to numerical representation of natural numbers (discounting the concept of zero, negative numbers, decimals, etc).

In all numbers that we’re comfortable with in normal society, the question always involves powers of 10: 1, 10, 100, etc. And so the answer is always a number 0 through 9. But what if we asked the question in a different way? What if we did it in terms of 7? Or 34? Or… 2? The answer, respectively, would always be an answer 0 through 6, 33… and 1. That last one is most interesting to us here. The answer will always be between 0 and 1 or, put another way, the answer will always be 0 or 1. Interesting.

Let’s look at an example. The number 10110, using powers of 2, is asking the question, “how many 16’s, 8’s, 4’s, 2’s and 1’s are there” to which the answers come as “1, 0, 1, 1, 0” respectively. We have 1 sixteen, no eights, 1 four, 1 two and no ones for a total of 22. 10110 is the binary representation of 22. That’s probably not new to you regardless of your CS/math/etc background if you’re a programmer. But what might be new to you is just how fundamental this concept is because it allows translation between physical and “soft”-ware concepts. How so? Because you know what’s actually pretty hard to represent and measure physically? 0 through 9. But you know what isn’t? Booleans. On or off. There or not. Current flowing or nothing. Chad punched or not. Magnetic surface raised or flat. Ion charged or neutral. You get the idea.

We can build computers where information is stored as a series of Booleans representing not “true or false”, but “yes or no” and we can still apply all of the same principles, rules, equivalences and concepts, albeit with slightly different semantics. We can take these “yes or no” values and assemble them into integers using our binary question schemes. We can them assemble those integers into ASCII characters, strings, basic programs, complex programs, multi-media files, and you know the rest.

The math behind all of programming is simple. The basis for the whole thing is simply gigantic sequences of Boolean values assembled into ever-larger pieces. The marriage of Boolean and counting unique to binary encoding means that you know how to assemble the Boolean sequences into meaningful information and also how to manipulate information and store it back, performing operations using identity, domination, negation, and other equivalences (we will get to some interesting optimizations using bit-shifting and other techniques in future posts).

The only missing piece here is how we get from these individual “1 or 0” values, called “bits” to larger values. Well, most of this is arbitrary. For instance, the definition of “byte” is simply “8 consecutive bits,” meaning that a byte can represent one of 256 possible integer values. These different values can also be assigned (again arbitrarily) to non-integral values such as characters. This is the basis for ASCII, a near-universal standard for representing text using integers (and thus binary/bits). This is the mechanism by which the building blocks are assembled.

How it Helps You

In a (hyphenated) word: background-knowledge. For instance, with all of this in mind, it’s relatively easy to get to solutions for the mysteries at the beginning of the post. First up is the weirdness surrounding bit counting using KB, MB, GB, etc. Is a kiloybyte a thousand bytes the way a kilogram is a thousand grams? Well, it could be, if you define it the way we’re used to seeing those terms defined. But that isn’t actually what winds up happening. What happens instead is the result of an interesting quirk of powers of two. As it turns out, 2 to the 10th is 1,024, which is roughly a ‘kilo’ of bytes. 2 to the 20th is 1,048,576, which is roughly a ‘mega’ (1 million) of bytes. Same kind of “fuzzy” counting happens with Giga, Tera, etc, all the way up the chain with 2 to the 30th, 40th, etc. So why reuse a prefix that means something else — something otherwise very precise? Personally, I think the adopters of this convention were being cute, and I don’t care for it at all (*dons flame retardant suit), but them’s the breaks. Apparently, I’m not the first to feel this way and others have tried and are trying to do something about it.

What about all that bit mask stuff? Well that’s now pretty simple to sort out as well. One common convention that you need for a bit of additional perspective is to know that “base-16” is often adopted for representing bytes. The reason is that a byte is 8 bits and each 4 bits represents one of 16 possible values. So if you have a base-16 number, bytes can be written with two values. These take the form of something like 4E because you get 0-9 of base-10 that you’re used to and then another 6 values representing 10 through 15 which are written as the characters A through F. It’s much easier to remember bytes as two values than it is to remember 16 bits. I mention all of this because “FF” is identical in value to “11111111” and “F0” is identical to “11110000”.

You don’t see bit masking as much in managed languages, though you might if you get your hands dirty down in the transport layer doing things like socket management. But those coming from a C background are quite familiar with it. If you get a message over the wire, it may come in as a stream of bytes. But when things like network latency and accuracy are issues (or perhaps in embedded environments when space is at a premium) there is a tendency to squeeze every last modicum of value out of the space available. In a web application, representing a number 1 through 10 with data type “int” (which in managed languages will be a 32-bit/4-byte integer) is not really an issue, but in the environments I’ve mentioned, it’s digital blasphemy to waste that space. You only need 4 bits to represent that — the other 28 are wasted.

But the smallest unit typically dealt with in programming convention is the byte. So what to do? Well, enter bit masking. Let’s say I have two speakers on some embedded device that can be set to volume level 1-10 and I want to write a protocol for querying volume level for both. What I would do is define a message that was 1 byte long and have the first 4 bits of the byte represent the volume of one speaker and the second 4 represent the volume of the other. If I wanted just the value of speaker 2, I would want a way to preserve the first 4 bits of the byte I got while clobbering the second 4. Well, enter the bit mask. If I perform the operation int myValue = 0x0F & incomingSpeakerByte, I get what I’m looking for. Why? Well, let’s consider this.

The value 0x0F can be rewritten as 0x00001111 and the value of the speakers is whatever is coming back. Let’s say that it’s set to volume level 9, which can be rewritten as 0x1001. But, let’s say that the other speaker is set to volume level 4, which is 0x0100. That means that both speakers together are represented with the byte 0x01001001. If I set a local variable (where space is not a premium) to the byte I’m getting over the wire, it will be equal to 73, and that is nonsense in our 1-10 volume domain. But the bit mask operation performs the operation 0x00001111 & 0x01001001 to get the final value of 9 (0x00001001) correctly. The reason this works is that the “&” operation aligns the two sequences of bits and performs AND on them one by one, recording the value. This is called “bitwise and.” If you think back to the domination and identity laws from the last post, we’re using domination (x & 0 = 0) to clobber the first 4 bits and identity (x & 1 = x) to preserve the second 4, which are the only four that interest us. (Storing the other byte in a local int is a bit more involved, taking advantage of the aforementioned “bit-shifting,” and beyond the scope of this post). The long and short of it is that when you see bit masking operations performed, you now know what’s going on — the developer is using the same byte to store multiple values and using this technique to extract the ones he or she needs at the moment.

As to thinking in ones and zeroes, you might not be quite there yet, but this is a step in the right direction. It helps you to understand the fundamental building blocks of software based data and how we get from the physical world to the “soft” representation that we deal with. It also hopefully makes some of the weird-symbol bit manipulation tricks seem a little less like black magic. At the very least, I hope that instead of viewing some of this stuff with suspicion or hopelessness, you now have some basic concepts, terms to google and questions to ask.

Further Reading

  1. Wikipedia on bitwise operations
  2. HowStuffWorks on bits and bytes
  3. How many x in a y calculator
  4. Wiki on basic bit manipulation


Practical Math for Programmers: DeMorgan’s and Other Logical Equivalences

The “Why This Should Matter to You” Story

TL;DR: You read my post about conditional bloopers and thought, “what’s wrong with that?!?”

Let’s say that you recall the lesson from last time and you make sure that you also avoid statements like “if(A && (A || B))”. You draw out truth tables whenever you feel you have time (and when no one is looking because, if you’re honest with yourself, you feel a little awkward doing it when people are watching). And yet things like this still happen to code you’ve written:

gets refactored to:

When you notice this in the source history, you check with your truth tables, and sure enough, it’s right. You kind of get it, in the end, but you always feel like you’re a step behind.

Math Background

In Boolean algebra, truth tables are sort of the equivalent of rote arithmetic memorization such as learning multiplication tables and using them to perform long-hand multiplication of larger numbers. It’s a rote, brainless, and time-consuming algorithm that you follow when there isn’t something better to use to get you to the answer more quickly. But the good news is that if you memorize a different set of information — some axiomatic laws of equivalence — you can get a lot of answers more quickly and easily. Here is the series of equivalences paired with explanations that will really help you understand the finer points of simplifying your conditional logic.

Identity Laws:  (A ^ TRUE) = A and also (A v FALSE) = A.  These identity laws simply point out the identity of variables compared with special constants.  This is like talking about additive identity (x + 0 = x) or multiplicative identity (x*1 = x) in elementary algebra (i.e. what you learn in junior high).

Domination Laws:  (A ^ FALSE) = FALSE and also (A v TRUE) = TRUE.  These are the opposites of the identity laws and have elementary algebra annihilator equivalent (x*0=0) or (x/x = 1).  These operate simply in that any false in a conjunction or any true in a disjunction renders the entire expression false or true, respectively.

Double Negative:  ~~A = A.  (Probably more formally known as “double negation” or negation of negation, but I think this easier to remember).  This quite simply says that if something is not not there, then it is there or, in Boolean logic, two falses make a true.

Commutativity:  (A v B) = (B v A) and (A ^ B) = (B ^ A).  This simple but important law    shows that the order of arguments to the conjunction and disjunction operators do not matter.  Think back to your arithmetic days when you learned that 4+3 = 3+4.  Same principle.

Associativity:  (A v B) v C = A v (B v C) and (A ^ B) ^ C = A ^ (B ^ C).  You can think of this as demonstrating that the location of parantheses is irrelevant when all operations are conjunction or disjunction operations.  If you reach ever further back in your brain from arithmetic, you may remember this guy as 3 + (4 + 5) = (3 + 4) + 5.

Tautology:  (A v ~A) = True and (A ^ ~A) = False.  We went over this in more detail last time, but suffice it to say that conjunction and disjunction with a variable’s negation leads to vacuously true or false results.

Idempotence:  (A v A) = A and also (A ^ A) = A.  Idempotence in general is the idea that an operation can be applied multiple times but only have a single effect.  For a rather practical example, consider turning a light switch on; the first time you push up a light turns on, but subsequent pushes up don’t turn the light somehow “more on”.  Idempotence in Boolean expressions means that redundant conjunctions and disjunctions simply have no effect.

Distribution:  A ^ (B v C) = (A ^ B) v (A ^ C) and A v (B ^ C) = (A v B) ^ (A v C).  This is a powerful equivalence because it provides a means to “move” parentheses around in Boolean expressions and retain their exact truth values.  Consider this in English and there is sense to it: “If A is true and also either B or C is true then A or C is true and A or B is true.”  Should that be a little abstract for your taste, think of it conversationally “It’s raining and either I’m inside or I’m wet” is the same as “It’s raining and I’m wet or it’s raining and I’m inside.”

De Morgan’s Laws:  ~(A ^ B) = (~A v ~B) and ~(A v B) = (~A ^ ~B).  This is another heavy hitter like distribution except that this one allows us to “move” negations when simplifying Boolean expressions.  This one also makes sense conversationally: “If it’s not true that person X is male and person x is female, then either person X is not male or person X is not female.”  De Morgan’s Laws also have an interesting incarnation in higher order logics like first order logic, but that’s probably going to be beyond the scope of this series for quite some time.

Absorption: A v (A ^ B) = A and A ^ (A v B) = A.  This was the main example we covered last time, so you can work out the truth table if you like.  As it turns out, this is a formally defined equivalence as well.

How It Helps You

In the last post in this series, I covered truth tables, which are sort of like the Boolean algebra equivalent of long division.  They’re a mechanical, inelegant process by which you can brute force a correct answer.  Logical equivalences are algebraic axioms that provide you with the building blocks for deductive reasoning — the building blocks for really thinking things through and critically solving problems.

You don’t necessarily need to memorize these, but I would study them enough at least to have something tickle the back of your mind the next time that you’re staring at an incredibly complex looking conditional.  That something will be “I think we can do better than this” and then, even if you don’t have the equivalences memorized, at least you’ll know enough to go look them up.  Armed only with truth tables, this can feel like flailing around in the dark while looking for a pattern.  But armed with logical equivalences you’ll say “oh, that’s right — I can eliminate the not here and I can turn these ors into ands over here, and… yeah, much better.”

Memorizing these equivalence laws would be even better, but not for the sake of pedantry or winning obscure trivia games.  I say that memorizing them is better because if you memorize them, you will practice them and conditionals, particularly complex ones, will start to work themselves into simpler forms automatically in your mind.  You will see things like Neo in the Matrix when he’s about to defeat Agent Smith, if the movie were a lot more boring.  Point is, though, you’ll develop an extremely good sense for when conditional logic could be simplified, corrected, or clarified and that has powerful ramifications for the readability, correctness and maintainability of the code you work with.

To circle back to the small example of a real world situation, understanding these Boolean laws of inference is likely to put you ahead of the curve, rather than behind it, when it comes to understanding conditional simplification.  Rather than the person whose code is always getting refactored, you’ll be the one doing the refactoring and doing it confidently.

Further Reading

  1. Wikipedia
  2. A bit more circuit flavored, but with some applied examples at the bottom.
  3. An introduction to the idea of proofs in Boolean algebra
  4. Very theoretical, with formal proof for most of the inference laws.


Practical Math for Programmers: Truth Tables and Boolean Basics

The “Why This Should Matter to You” Story

Let’s say you’re responsible for maintaining a piece of code for the business that figures out recommendations for what to do at night. At the moment, it’s pretty simple:

Let’s say a project manager comes along and says to you, “what does the program currently recommend at night” and you say “well, if it’s late, the method recommends that you go to bed.” The project manager then says “okay, that’s great, but you should recommend that the user goes to bed in that case and also the case where either it’s late or you’re tired.” You dutifully make that happen:

You run the application, try out a few scenarios and see that it seems to make sense: you always wind up in bed when it’s late and generally wind up in bed when you’re tired, so you deliver it. You figure this makes sense since you’re usually tired when it’s late. Satisfied, you deliver the code and reason that it’s almost certainly fine and, if not, the architect will let you know in the next few days when she reviews the code.

A few days later, walking through the hallway, you see the architect coming and offer her a friendly wave. She frowns at you, opens her mouth for a second, then closes it, shakes her head, and walks by with a quick “hi”. A bit nonplussed, you return to your desk wondering what the issue was. She is quite busy so it’s no surprise she might not have time to talk right this second, so you decide to investigate further. Seeing nothing in your inbox from the architect, you decide to check through the code you’ve recently touched and you see this:

She’s put your code back to the way it was. Your sense of unease growing a bit, you walk over to her desk and ask why she reverted code that you added as per a new requirement. She testily tells you that she doesn’t have time for “Programming 101” right now and says she’ll go over some basic stuff with you next week.

What happened, and what should you do?

Math Background

Given that the architect’s displeasure revolves around the inclusion (and later omission) of an if condition, it’s a safe bet that we’re going to be covering Boolean logic in today’s post. Most people think of conditional control flow operators that you see in programs such as if, else, and switch/case in anecdotal and somewhat intuitive terms. Statements like “if my favorite sports team loses today, I will be sad” are the cornerstone of deductive reasoning, a mental exercise that is nearly universal to human communication and certainly not limited to programmers. And if you come to programming from a largely self-taught background with limited exposure to mathematics, this is probably the extent of your experience with conditional operations. You’re likely to reason about conditional operations the same way you reason about conversational ones.

This conversational reasoning works pretty well on simple code scenarios in code with situations like “if x then y else z” but it starts to get a little more interesting when there are compound conditions strung together with AND and OR. When this comes up, programmers and people in general without a mathematical background are likely to start struggling with keeping it straight. For example, consider this sentence: “If my dog is sleeping and my car is unlocked or if my car is locked but the keys are on the coffee table and I’m near the coffee table or else if someone can give me a ride, I’m going to go to the store.” Ready to start coding that up? Will you feel good about getting that right with an intuitive approach?

If the answer to that is “no”, then you’ve come to the right place. The math involved in Boolean operations will help you make sense of strings of conditions that are arbitrarily complex. First, let’s define some terms:

(Truth) Value:                             True (T) or False (F) in the Boolean world.
Variable:                                       A proxy that represents an unknown truth value (often A, B, C, etc)
(Basic) Operation:                      Algorithm applied to one or more values/variables to yield an output value/variable (e.g. “And”, “Or”, “Not”).
Expression:                                  Any grouping of operations, values and variable with a resultant truth value (e.g. “A”, “True”, “A AND True”, “A OR NOT(B)”).
Conjunction:                               Formal term for “AND” and represented in a variety of ways: (A AND B), (A^B), (AB)
Disjunction:                                Formal term for “OR” and represented in a variety of ways: (A OR B), (AvB), (A+B)
Negation (or Complement):    Formal term for “NOT” and represented in a variety of ways: (NOT(A)), (~A), (!A)
Truth Table:                               An enumeration of all possible expression inputs and their resultant truth output.

Notice that in this description of the basic operations, I do not use their common programming language equivalents such as “&&” and “||” (bang (!) operator being an exception). This is intentional. For one thing, these operators in their programming language contexts often have meaning beyond Boolean algebra (such as bitwise operations, which I will cover in a future post) which would distract from this post and for another thing, it is important to keep in mind that we’re operating outside of a programming contexts. Here, our expressions are propositions rather than representative of some kind of state of a program.

The truth table definition is a bit ambiguous, so let’s do some examples of what that looks like. In this case, a picture is worth a thousand words. Here is a truth table for the variable A:

On the left is the input, which is simply the variable A, and on the right is the value of that expression with the left hand truth value used for A. It’s easier to get the idea with another example or two. Here’s negation:

If we substitute T for A, the value of ~A is F, and vice-versa if we use the value F for A. Here’s an example with two variables:

As you can see, this is pretty simple stuff. You simply take all possible combination of variables, plug them in, and evaluate the results to see what the value of the expression is in each case. Let’s try this with the condition that so annoyed the architect:

Looking at this table, do you notice anything interesting? The expression is true in the first two cases and false in the second two. Just like the “A” variable. In fact, it’s identical to the A variable. The B variable doesn’t matter at all!

How It Helps You

Truth tables are not rocket science. In fact, they’re one of the easiest mathematical concepts you’re likely to encounter since they are simply brute force documentation of an expression. But they will help you see patterns. You’ll see patterns directly in the manner we laid out here, but as you get familiar with truth table patterns, you’ll start to apply them within larger conditionals as well and even to recognize intuitively and automatically situations that can be simplified. Or, if you don’t recognize them immediately, you might at least suspect simplifications are possible and write out a truth table to confirm or refute your hypothesis.

This has a powerful impact on your code. Instead of large and laborious conditionals strung together, you will gravitate toward the simplest possible representations. Not only does this keep the code clean and readable, but it also helps with your clarity of thought from a requirements and reasoning perspective. Consider our example situation. Isn’t it nice to be able to say to the PM, “hey, we should talk about what you want here because the requirement you added actually has no effect as you’ve stated it.” Without a knowledge of Boolean basics, you’d have a hard time recognizing this yourself, much less explaining it to someone else.

And finally, this understanding gains you respect from your fellow programmers. If you write big, unwieldy conditional statements, often containing redundancies or tautologies (statements that always evaluate to true, such as “A OR NOT(A)”), other people will come to view you as sloppy or poorly informed as far as programming goes. The example with the architect isn’t far-fetched at all. It pays to have a basic understanding of Boolean logic and some of its rules for the sake of being taken seriously.

You don’t need to be an expert in the various transforms and axioms in Boolean algebra (which I will cover next time) to get this right. A truth table may seem cumbersome and it may feel sort of plodding and non-elegant, but it’s a surefire way to get things right, look for patterns, and feel more comfortable in the conditionals you’re writing. Practice them and know them and you won’t be sorry.

Further Reading

  1. There are some other operators (NAND, XOR, XNOR, NOR) that can be composed from AND, OR, and NOT, which you can read about here.
  2. NAND (and NOR) has the interesting property of being functionally complete in terms of logic (meaning all other operators can be derived from it). You can read about that here, though it’s framed in terms of circuits rather than pure math.
  3. This is a cool site that will generate actual truth tables for you if you punch in an expression.
  4. Here is a very rigorous, but math jargon-heavy reading about Boolean algebra, how it is constructed from axioms, and how it relates to other mathematical systems. Warning — this is not a light read.
  5. On the flip side of jargon-heavy is this practical, conversational series of examples of truth tables and the reasoning behind them.