DaedTech

Stories about Software


Because I Said So Costs You Respect

Do you remember being a kid, old enough to think somewhat deductively and logically, but not old enough really to understand how the world works? Do you remember putting together some kind of eloquent argument about why you should be able to sleep at a friend’s house or stay out later than normal, perfecting your reasoning, presentation and polish only to be rebuffed? Do you remember then having a long back and forth, asking why to everything, only to be smacked in the face with the ultimate in trump cards: “because I’m your parent and I say so?” Yeah… me too. Sucked, didn’t it?

There are few things in life more frustrating than pouring a good amount of effort into something only to have it shot down without any kind of satisfying explanation of the rationale. For children this tends to be unfortunate for self-interested reasons: “I want to do X in the immediate future and I can’t.” But as you get older and are motivated by more complex and nuanced concerns, these rejections get more befuddling and harder to understand. A child making a case for why he should own a Red Ryder BB gun will understand on some level the parental objection “it’s dangerous” and that the objection arises out of concern for his welfare. So when the “enough — I’m your parent and I say so” comes up, it has the backing of some kind of implied reasoning and concern. An adult making a case that adopting accounting software instead of doing things by hand would be a money-saving investment will have a harder time understanding an answer of “no, we aren’t going to do that because I’m your boss and I say so.”

This is hard for adults to understand because we are sophisticated and interpersonally developed enough to recognize when our goals align with those of other people or of organizations. In other words, we reasonably expect an answer of “no” to the question “can I go on vacation for 8 months and get paid?” because this personal goal is completely out of whack with everyone else’s and with the organization’s. So even if the explanation or reasoning isn’t satisfying, we get it. The “because I said so” is actually unnecessary, since the answer of “that makes no sense for anyone but you” is perfectly reasonable. But when goals align and it is the means, rather than the ends, that differ, we start to have more difficulty accepting when people pull rank.

I remember some time back being asked to participate in generating extra documentation for an already documentation-heavy process. The developers on the project were taking requirements documents as drawn up by analysts and creating “requirements analysis” documents, and the proposal was to add still more flavors of requirements documents to this. So instead of the analysts and developers each creating their own single Word document filled with paragraph narratives of requirements, they were now being asked to collaborate on several such documents, with each subsequent version adding more detail. So as a developer, I might write my first requirements document in very vague prose, to be followed by a series of additional documents, each more detailed than the last.

I objected to this proposal (and really even to the original process). What I wanted to do was capture the requirements as data items in a database rather than as a series of documents filled with prose. And I didn’t like the idea of having several documents, each one a “more fleshed out” version of the last — there’s a solution for that, and it’s called “version control.” But when I raised these objections to the decision makers on the project, I was rebuffed. If you’re curious what the rationale was for favoring their approach over the one I suggested, so am I, even to this day. You see, I never received any explanation other than a vague “well, it might not be the greatest, but we’re just going to go with it.” This explanation neither explained the benefit of the proposed approach nor any downside to mine. Instead, I was a kid again, hearing “because I’m your parent and I say so.” But I wasn’t a little kid asking to stay out late — I was an adult with a different idea for how to achieve the same productivity and effectiveness goals as everyone else.

In the end, I generated the documents I was required to generate. It wasn’t the end of the world, though I never did see it as anything other than a waste of time. As people collaborating toward a larger goal, it will often be the case that we have to do things we don’t like, agree with or approve of, and it might even be the case that we’re offered no explanation at all for these things. That is to be expected in adult life, but I would argue that it should be minimized because it chips away at morale and satisfaction with one’s situation.

Once, twice, or every now and then, most people will let a “just do it because I say so” roll off their backs. Some people might let many more than that slip by, while others might get upset earlier in the process. But pretty much everyone has a limit with this kind of thing: a threshold of “because I say so” responses beyond which they will check out, tune out, leave, blow up, or generally lose it in some form. So I strongly recommend avoiding the temptation to “pull rank” and justify decisions with “because it’s my project” or “because I’ve decided that.” It’s sometimes nice not to have to justify your decisions — especially to someone with less experience than you and when you’re in a hurry — but the practice of defending your rationale keeps you sharp and on your toes, and it earns the respect of others. “Because I say so” is a well that you can only go to so many times before it dries up and leaves a desert where respect and loyalty used to be. Don’t treat your coworkers like children when they want to know “why” and when they have legitimate questions — they deserve better than “because I’m in charge and I say so.”


TDD For Breaking Problems Apart

If I have to write some code that’s going to do something rather complex, it’s easy for my mind to start swimming with possibilities, as mentioned in the magic boxes post. In that post, I talked about thinking in abstractions and breaking problems apart and alluded to TDD being a natural fit for this paradigm without going into much detail. Today and in my next post about this I’m going to go into that detail.

Let’s say I’m tasked with coding some class or classes that keep track of bowling scores. I don’t think “okay, what is my score at the start of a bowling game?” I think things like “I’ll need to have some iteration logic that remembers back at least two frames, since that’s how far back we can go, and I’ll probably need something to look ahead to handle the 10th frame, and maybe there’s a design pattern for that, and…” Like I said, swimming. These are the insane ramblings of an over-caffeinated maniac more than they’re a calm and methodical solution to a somewhat, but not terribly complex problem. In a way, this is like premature optimization. I’m solving problems that I anticipate having rather than problems that I actually have right at this moment.

TDD is thus like a deep breath, a relaxing cup of some hot beverage, and perhaps a bit of meditation or zoning out. All of this noise fades out of my head and I can actually start figuring out how things work. The reason for this is that the essence of TDD is breaking the task into small, manageable problems and solving those while simultaneously ensuring that they stay solved. It’s like a check-list. Forget all of this crap about iterators and design patterns — let’s establish a base case: “Score at the beginning of a game is zero.” That’s pretty easy to solve and pretty hard to get frantic over.

With TDD, the first thing I’m going to do is write a test, and I’m going to write only enough of a test to fail (which includes writing something that doesn’t compile). So, let’s do that:

[TestClass]
public class Constructor
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void Initializes_Score_To_Zero()
    {
        var scoreCalculator = new BowlingScoreCalculator();
    }
}

Alright, since there is no such type as “BowlingScoreCalculator”, this test fails by virtue of non-compiling. I then define the type and the test goes green (I’m using NCrunch, so I’m never actually building or running anything — I just see dots change colors). Next, I want to assert that the score property is equal to zero after the constructor executes:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }
}

public class BowlingScoreCalculator
{
    public BowlingScoreCalculator()
    {

    }
}

This also fails, since that property doesn’t exist, and I solve this problem by declaring an auto-implemented property, which actually makes the whole test pass. And just like that, I’ve notched my first passing test and step toward a functional bowling calculator. The game starts with a score of zero.

What’s next? Well, I suppose the simplest thing we can say is that if I bowl a frame with a zero on the first throw and a one on the second throw, my score is now 1. Okay, so what is a frame, and what kind of data structure should I use to represent it, and what would be the ideal API for that, and is score really best represented as a property, and — BZZT!! Stop it. We’re solving small, simple problems one at a time. And the only problem I have at the moment is “lack of failing test.” I’m going to quickly decide that the mechanism for bowling a frame is going to be BowlFrame(int, int), so that I know what to call my next nested test class. Writing code until something fails, I get:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }

    [TestClass]
    public class BowlFrame
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_0_And_1_Results_In_Score_1()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(0, 1);
        }
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public BowlingScoreCalculator()
    {    

    }
}

This fails because there is no “BowlFrame” method, so I add it. CodeRush’s “declare method” refactoring does the trick nicely, populating the new method with a thrown NotImplementedException, which gives me a red test instead of a non-compile. I have to make the test pass before moving on, so I delete the throw, and then I add an assert that the score is equal to 1. This fails, and I make it pass:

[TestClass]
public class Constructor
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void Initializes_Score_To_Zero()
    {
        var scoreCalculator = new BowlingScoreCalculator();

        Assert.AreEqual(0, scoreCalculator.Score);
    }
}

[TestClass]
public class BowlFrame
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_0_And_1_Results_In_Score_1()
    {
        var scoreCalculator = new BowlingScoreCalculator();
        scoreCalculator.BowlFrame(0, 1);

        Assert.AreEqual(1, scoreCalculator.Score);
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public BowlingScoreCalculator()
    {

    }

    public void BowlFrame(int throw1, int throw2)
    {
        Score = 1;
    }
}

Now, with a green test, I have the option to solve problems in existing code (i.e. “refactor”). For instance, I don’t like that I have a useless, empty constructor, so I delete it and note that my tests still pass. This is another easy problem that I can solve with confidence.

From here, I think I’d like a non-obtuse implementation of scoring that first frame. I think I’ll test that if I bowl a 2 and then a 3, the score is equal to 5. For the first time, I’m not going to have any compilation failures, so I write the test, see red, and get rid of the red by making the whole thing look like this:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }

    [TestClass]
    public class BowlFrame
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_0_And_1_Results_In_Score_1()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(0, 1);

            Assert.AreEqual(1, scoreCalculator.Score);
        }

        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_2_And_3_Results_In_Score_5()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(2, 3);

            Assert.AreEqual(5, scoreCalculator.Score);
        }
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public void BowlFrame(int throw1, int throw2)
    {
        Score = throw1 + throw2;
    }
}

All right, the calculator is now doing a good job of handling a low scoring first frame. So, let’s correct a little thing or two with the refactor cycle, making sure that whatever we do is a problem that’s easily solvable, isolated, and small in scope. I’m thinking I don’t like the magic numbers 2, 3 and 5 in that last test. Let’s try making it look like this:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void With_Throws_2_And_3_Results_In_Score_5()
{
    var scoreCalculator = new BowlingScoreCalculator();
    const int firstThrow = 2;
    const int secondThrow = 3;
    scoreCalculator.BowlFrame(firstThrow, secondThrow);

    Assert.AreEqual(firstThrow + secondThrow, scoreCalculator.Score);
}

Yes, it’s perfectly acceptable to refactor your unit tests during the “refactor” phase. Treat these guys as first class code and keep them clean or nobody will want them around. I have another bone to pick with this code, which is the duplication of the constructor logic in each test. Let’s fix that too:

[TestClass]
public class BowlFrame
{
    private BowlingScoreCalculator Target { get; set; }

    [TestInitialize]
    public void BeforeEachTest()
    {
        Target = new BowlingScoreCalculator();
    }

    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_0_And_1_Results_In_Score_1()
    {
        Target.BowlFrame(0, 1);

        Assert.AreEqual(1, Target.Score);
    }

    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_2_And_3_Results_In_Score_5()
    {
        const int firstThrow = 2;
        const int secondThrow = 3;
        Target.BowlFrame(firstThrow, secondThrow);

        Assert.AreEqual(firstThrow + secondThrow, Target.Score);
    }
}

There, that looks better to me now. Now, what’s wrong with the production code? What problem can we solve? I can think of a few things: we can bowl more than 10 per frame, we can bowl negative numbers, and this thing will only keep track of the most recent frame. There are many other issues as well, but we’re already trending toward overload here, so let’s get back on Easy Street and figure out what to do if the user gives us goofy input.

I write a test that fails:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Throws_Exception_On_Negative_Argument()
{
    ExtendedAssert.Throws(() => Target.BowlFrame(-1, 0));
}

And then make it pass (first by declaring the nested exception and then by throwing it when the first frame is negative). I also take a shortcut in terms of obtuseness of my TDD here and make it throw the exception if either parameter is negative. I consider this acceptable, personally, since I believe that within the red-green-refactor discipline it’s a matter of some discretion how simple “the simplest thing” is. I’m not going to be accepting negative parameters for one and not the other and I know it. At any rate, after getting this to pass, I now refactor my method parameter names to be more descriptive and set the Score setter to private (which I should have done at first anyway) and the production class looks like this:

public class BowlingScoreCalculator
{
    public class FrameUnderflowException : Exception { }

    public int Score { get; private set; }

    public void BowlFrame(int firstThrow, int secondThrow)
    {
        if (firstThrow < 0 || secondThrow < 0)
            throw new FrameUnderflowException();
        Score = firstThrow + secondThrow;
    }
}

Now I’m going to tackle the similar problem of allowing too much scoring for a frame. I want to solve the problem that scores of greater than 10 are currently allowed and I’m going to do this with the test case for 11. TDD is not a smoke testing approach nor is it an exhaustive approach, so I’m not going to write tests for 11, 12, 13… 200 or anything like that. As such, I try to hit edge cases whenever possible and that’s what I’ll do here:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Throws_Exception_On_Score_Of_11()
{
    ExtendedAssert.Throws(() => Target.BowlFrame(10, 1));
}

I make this pass, and then I go back and decide I’m not a huge fan of that “10” in both my test and in the production class. Zero I can let slide because of its more or less universal value in the scoring of competitions, but 10 deserves a descriptive name. I’m going to call it “Mark,” since that’s what a frame of 10 is known as in bowling.

Now that we’ve sufficiently guarded against bad inputs, I’d say it’s time to start thinking about aggregation. The easiest way to do that is to have a frame of 1 and then another frame of 1, and make sure that we have a score of 2:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Sets_Score_To_2_After_2_Frames_With_Score_Of_1_Each()
{
    Target.BowlFrame(1, 0);
    Target.BowlFrame(1, 0);

    Assert.AreEqual(2, Target.Score);
}

How cool is it that I fix this by changing “Score = firstThrow + secondThrow;” to “Score += firstThrow + secondThrow;”? Talk about the simplest solution possible!
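
Just to tie the last couple of changes together, here is a rough sketch of what the production class might look like at this point. The post doesn’t show this intermediate state, so the FrameOverflowException name and the Mark constant are my guesses at how those refactorings would read:

public class BowlingScoreCalculator
{
    public class FrameUnderflowException : Exception { }

    public class FrameOverflowException : Exception { }

    // "Mark" is bowling-speak for a frame worth 10; extracted to replace the magic number.
    private const int Mark = 10;

    public int Score { get; private set; }

    public void BowlFrame(int firstThrow, int secondThrow)
    {
        if (firstThrow < 0 || secondThrow < 0)
            throw new FrameUnderflowException();
        if (firstThrow + secondThrow > Mark)
            throw new FrameOverflowException();

        // Accumulate rather than overwrite so that successive frames add up.
        Score += firstThrow + secondThrow;
    }
}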

Now, I don’t like the way that test is written, with 5 int literals in there. This is telling me that it might be time to rethink the way I’m handling frames. I don’t really want to think too much about it now, but I know that there’s going to be a hard fail when I get to the 10th frame, which can have three throws, so I’m already not married to this implementation. And it’s already starting to feel awkward.

So what if I created a simple frame object? How would that look? Well, I’ll cover that next time as I flesh out this exercise. I’m hoping I’ve at least sold you somewhat on the notion that TDD forces you to chunk problems into manageable pieces and solve them that way without getting carried away. Next time I’ll circle back a bit to the idea of “magic boxes” as we decide how to divide up the concept of “frame” and “game” while adhering to the practice of TDD.


You Aren’t God, So Don’t Talk Like Him

Queuing Up Some Rage

Imagine that you’re talking to an acquaintance, and you mention that blue is your favorite color. The response comes back:

Acquaintance: You’re completely wrong. Red is the best color.

How does the rest of this conversation go? If you’re sane…

Acquaintance: That’s wrong. Red is the best color.
You: Uh, okay…

This, in fact, is really the only response that avoids a terminally stupid and probably increasingly testy exchange. The only problem with this is that the sane approach is also perceived as something between admission of defeat and appeasement. You not fighting back might be perceived as weakness. So, what do you do? Do you launch back with this?

Acquaintance: That’s wrong. Red is the best color.
You: No, it is you who is wrong. You’d have to be an idiot to like red!

If so, how do you think this is going to go?

Acquaintance: That’s wrong. Red is the best color.
You: No, it is you who is wrong. You’d have to be an idiot to like red!
Acquaintance: Ah, well played. I’ve changed my mind.

Yeah, I don’t think that’s how it will go either. Probably, it will turn out more like this:

Acquaintance: That’s wrong. Red is the best color.
You: No, it is you who is wrong. You’d have to be an idiot to like red!
Acquaintance: You’re the idiot, you stupid blue fanboy!
You: Well, at least my favorite color isn’t the favorite color of serial killers and Satan!
Acquaintance: Go shove an ice-pick up your nose. I hope you die!

Well, okay, maybe it will take a little longer to get to that point, perhaps with some pseudo-intellectual comparisons to Hitler and subtle ad hominems greasing the skids of escalation. If you really want to see this progression in the wild, check the comments section of any tech article about an Apple product. But the point is that it won’t end well.

Looking back, what is the actual root cause of the contention? The fact that you like blue and your acquaintance likes red? That doesn’t seem like the sort of thing that normally gets the adrenaline pumping. Is it the fact that he told you that you were wrong? I think this cuts closer to the heart of the matter, but this ultimately isn’t really the problem, either. So what is?

Presenting Opinions as Facts

The heart of the issue here, I believe, is the invention of some arbitrary but apparently universal truth. In other words, the subtext of what your acquaintance is saying is, “There is a right answer to the favorite color question, and that right answer is my answer because it’s mine.” The place where the conversation goes off the rails is the place at which one of the participants declares himself to be the ultimate Clearinghouse of color quality. So, while the “you’re wrong” part may be obnoxious, and it may even be what grinds the listener’s teeth in the moment, it’s just a symptom of the actual problem: an assumption of objective authority over a purely subjective matter.

To drive the point home, consider a conversation with a friend or family member instead of a mere acquaintance. Consider that in this scenario the “you’re wrong” would probably be good-natured and said in jest. “Dude, you’re totally wrong–everyone knows red is the best color!” That would roll off your back, I imagine. The first time, anyway. And probably the second time. And the third through 20th times. But, sooner or later, I’m pretty sure that would start to wear on you. You’d think to yourself, “Is there any matter of opinion about which I’m not ‘wrong,’ as he puts it?”

In the example of favorite color and other things friends might discuss, this seems pretty obvious. Who would seriously think that there was an actual right answer to “What’s your favorite color?” But what about the aforementioned Apple products versus, say, Microsoft or Google products? What about the broader spectrum of consumer products, including deep dish versus thin crust pizza or American vs Japanese cars? Budweiser or Miller? Maybe an import or a microbrew? What about sports teams? Designated hitter or not? Soccer or football?

And what about technologies and programming languages and frameworks? Java versus .NET? Linux versus Windows? Webforms vs ASP MVC? What about finer granularity concerns? Are singletons a good idea or not? Do curly braces belong on the same line as a function definition or the next line? Layered or onion architecture? Butter side up or butter side down? (Okay, one of those might have been something from Dr Seuss.)

It’s All in the Phrasing

With all of these things I’ve listed, particularly the ones about programming and others like them, do you find yourself lapsing into declarations of objective truth when what you’re really doing is expressing an opinion? I bet you do. I know I do, from time to time. I think it’s human nature, or at the very least it’s an easy way to try to add additional validity to your take on things. But it’s also a logical fallacy (appeal to authority, with you as the authority, or, as I’ve seen it called, confusing fact with opinion). It’s a fallacy wherein the speaker holds himself up as the arbiter of objective truth and his opinions up as facts. Whatever your religious beliefs may be, that is a role typically reserved for a deity. I’m pretty sure you’re not a deity, and I know that I’m not one, so perhaps we should all, together, make an effort to be clear about whether we’re stating facts (“two plus two is four”) or expressing beliefs and opinions (“three is the absolute maximum number of parameters I like to see for a method”).

Think of how you would react to the following phrases:

  • I like giant methods.
  • I believe there’s no need to limit the number of control flow statements you use.
  • I would have used a service locator there where you used direct dependency injection.
  • I prefer to use static methods and especially static state.
  • I wish there were more coupling between these modules.
  • I am of the opinion that unit testing isn’t that important.

You’re probably thinking “I disagree with those opinions.” But your hackles likely aren’t raised. Your face isn’t flushed, and your adrenaline isn’t pumping in anticipation of an argument against someone who just indicted your opinions and your way of doing things. You aren’t on the defensive. Instead, you’re probably ready to argue the merits of your case in an attempt to come to some mutual understanding, or, barring that, to “agree to disagree.”

Now think of how you’d react to these statements.

  • Reducing the size of your methods is a waste of time.
  • Case statements are better than polymorphism.
  • If you use dependency injection, you’re just wrong.
  • Code without static methods is bad.
  • The lack of coupling between these modules was a terrible decision.
  • Unit testing is a dumb fad.

How do you feel now? Are your hackles raised a little bit, even though you know I don’t believe these things? Where the language in the first section opened the door for discussion with provocative statements, the language in this section attempts to slam that door shut, not caring if your fingers are in the way. The first section states the speaker’s opinions, where the language in the second indicts the listener’s. Anyone wanting to foster a cooperative and pleasant environment would be well served to favor things stated in the fashion of the first set of statements. It may be tempting to make your opinions seem more powerful by passing them off as facts, but it really just puts people off.

Caveats

I want to mention two things as a matter of perspective here. The first is that it would be fairly easy to point out that I write a lot of blog posts and give them titles like, “Testable Code Is Better Code,” and, “You’re Doin’ It Wrong,” to say nothing of what I might say inside the posts. And while that’s true, I would offer the rationale that pretty much everything I might post on a blog that isn’t a simple documentation of process is going to be a matter of my opinion, so the “I think” kind of goes without saying here. I can assure you that I do my best in actual discussions with people to qualify and make it clear when I’m offering opinions. (Though, as previously mentioned, I’m sure I can improve in this department, as just about anyone can.)

The second caveat is that what I’m saying is intended to apply to matters that are, by their nature, opinions. For instance, “It’s better to write unit tests” is necessarily a statement of opinion, since qualifying words like “better” invite ambiguity. But if you were to study 100 projects and discover that the ones with unit tests averaged 20% fewer defects, that would simply be a matter of fact. I am not advocating downgrading facts to qualified, wishy-washy opinions. What I am advocating is that we all stop ‘upgrading’ our opinions to the level of fact.


How to Create Good Abstractions: Oracles and Magic Boxes

Oracles In Math

When I was a freshman in college, I took a class called “Great Theoretical Ideas In Computer Science”, taught by Professor Steven Rudich. It was in this class that I began to understand how fundamentally interwoven math and computer science are and to think of programming as a logical extension of math. We learned about concepts related to Game Theory, Cryptography, Logic Gates and P, NP, NP-Hard and NP-Complete problems. It is this last set that inspires today’s post.

This is not one of my “Practical Math” posts, so I will leave a more detailed description of these problems for another time, but I was introduced to a fascinating concept here: that you can make powerful statements about solutions to problems without actually having a solution to the problem. The mechanism for this was an incredibly cool-sounding concept called “The Problem Solving Oracle”. In the world of P and NP, we could make statements like “If I had a problem solving oracle that could solve problem X, then I know I could solve problem Y in polynomial (order n squared, n cubed, etc) time.” Don’t worry if you don’t understand the timing and particulars of the oracle — that doesn’t matter here. The important concept is “I don’t know how to solve X, but if I had a machine that could solve it, then I know some things about the solutions to these other problems.”

It was a powerful concept that intrigued me at the time, but more with grand visions of fast factoring and solving other famous math problems and making some kind of name for myself in the field of computer science related mathematics. Obviously, you’re not reading the blog of Fields Medal winning mathematician, Erik Dietrich, so my reach might have exceeded my grasp there. However, concepts like this don’t really fall out of my head so much as kind of marinate and lie dormant, to be re-appropriated for some other use.

Magic Boxes: Oracles For Programmers

One of the most important things that we do as developers and especially as designers/architects is to manage and isolate complexity. This is done by a variety of means, many of which have impressive-sounding terms like “loosely coupled”, “inverted control”, and “layered architecture”. But at their core, all of these approaches and architectural concepts have a basic underlying purpose: breaking problems apart into isolated and tackle-able smaller problems. To think of this another way, good architecture is about saying “assume that everything else in the world does its job perfectly — what’s necessary for your little corner of the world to do the same?”

That is why the Single Responsibility Principle is one of the most oft-cited principles in software design. This principle, in a way, says “divide up your problems until they’re as small as they can be without becoming trivial.” But it also implies “and just assume that everyone else is handling their business.”

Consider this recent post by John Sonmez, where he discusses deconstructing code into algorithms and coordinators (things responsible for getting the algorithms their inputs, more or less). As an example, he takes a “Calculator” class where the algorithms of calculation are mixed up with the particulars of storage and separates these concerns. This separates the very independent calculation algorithm from the coordinating storage details (which are of no consequence to calculations anyway) in support of his point, but it also provides the more general service of dividing up the problem at hand into smaller segments that take one another for granted.

Another way to think of this is that his “Calculator_Mockless” class has two (easily testable) dependencies that it can trust to do their jobs. Going back to my undergrad days, Calculator_Mockless has two Oracles: one that performs calculations and the other that stores stuff. How do these things do their work? Calculator_Mockless doesn’t know or care about that; it just provides useful progress and feedback under the assumption that they do. This is certainly reminiscent of the criteria for “Oracle”, an assumption of functionality that allows further reasoning. However, “Oracle” has sort of a theoretical connotation in the math sense that I don’t intend to convey, so I’m going to adopt a new term for this concept in programming: “Magic Box”. John’s Calculator_Mockless says “I have two magic boxes — one for performing calculations and one for storing histories — and given these boxes that do those things, here’s how I’m going to proceed.”
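
John’s actual classes aren’t reproduced here, but a minimal sketch conveys the shape being described. The interface and class names below are my own illustrative stand-ins, not his code:

// Two "magic boxes" that the coordinator trusts to do their jobs.
public interface ICalculationAlgorithm
{
    int Add(int first, int second);
}

public interface ICalculationHistoryStore
{
    void RecordResult(int result);
}

// The coordinator neither knows nor cares how its two boxes work internally;
// it just assumes they do their jobs and wires them together.
public class CalculationCoordinator
{
    private readonly ICalculationAlgorithm _algorithm;
    private readonly ICalculationHistoryStore _historyStore;

    public CalculationCoordinator(ICalculationAlgorithm algorithm, ICalculationHistoryStore historyStore)
    {
        _algorithm = algorithm;
        _historyStore = historyStore;
    }

    public int Add(int first, int second)
    {
        var result = _algorithm.Add(first, second);
        _historyStore.RecordResult(result);
        return result;
    }
}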

How Spaghetti Code Is Born

It’s one thing to recognize the construction of Magic Boxes in the code, but how do you go about building and using them? Or, better yet, go about thinking in terms of them from the get-go? It’s a fairly sophisticated and not always intuitive method for deconstructing programming problems.

To see what I mean, think of being assigned a hypothetical task to read an XML file full of names, remove any entries missing information, alphabetize the list by last name, and print out “First Last” with “President” pre-pended onto the string. So, in this example, the first line of the output should be “President Grover Cleveland”. You’ve got your assignment, now, quick, go – start picturing the code in your head!

What did you picture? What did you say to yourself? Was it something like “Well, I’d read the file in using the XDoc API and I’d probably use an IList<> implementer instead of IEnumerable<> to store these things since that makes sorting easier, and I’d probably do a foreach loop for the person in people in the document and while I was doing that write to the list I’ve created, and then it’d probably be better to check for the name attributes in advance than taking exceptions because that’d be more efficient and speaking of efficiency, it’d probably be best to append the president as I was reading them in rather than iterating through the whole loop again afterward, but then again we have to iterate through again to write them to the console since we don’t know where in the list it will fall in the first pass, but that’s fine since it’s still linear time in the file size, and…”

And slow down there, Sparky! You’re making my head hurt. Let’s back up a minute and count how many magic boxes we’ve built so far. I’m thinking zero — what do you think? Zero too? Good. So, what did we do there instead? Well, we embarked on kind of a willy-nilly, scattershot, and most importantly, procedural approach to this problem. We thought only in terms of the basic sequence of runtime operations and thus the concepts kind of all got jumbled together. We were thinking about exception handling while thinking about console output. We were thinking about file reading while thinking about sorting strings. We were thinking about runtime optimization while we were thinking about the XDocument API. We were thinking about the problem as a monolithic mass of all of its smaller components and thus about to get started laying the bedrock for some seriously weirdly coupled spaghetti code.

Cut that Spaghetti and Hide It in a Magic Box

Instead of doing that, let’s take a deep breath and consider what mini-problems exist in this little assignment. There’s the issue of reading files to find people. There’s the issue of sorting people. There’s the issue of pre-pending text onto the names of people. There’s the issue of writing people to console output. There’s the issue of modeling the people (a series of string tuples, a series of dynamic types, a series of anonymous types, a series of structs, etc?). Notice that we’re not solving anything — just stating problems. Also notice that we’re not talking at all about exception handling, O-notation or runtime optimization. We already have some problems to solve that are actually in the direct path of solving our main problem without inventing solutions to problems that no one yet has.

So what to tackle first? Well, since every problem we’ve mentioned has the word “people” in it and the “people” problem makes no mention of anything else, we probably ought to consider defining that concept first (reasoning this way will tell you what the most abstract concepts in your code base are — the ones that other things depend on while depending on nothing themselves). Let’s do that (TDD process and artifacts that I would normally use elided):

public struct Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public override string ToString()
    {
        return string.Format("{0} {1}", FirstName, LastName);
    }
}

Well, that was pretty easy. So, what’s next? Well, the context of our mini-application involves getting people objects, doing things to them, and then outputting them. Chronologically, it would make the most sense to figure out how to do the file reads at this point. But “chronologically” is another word for “procedurally” and we want to build abstractions that we assemble like building blocks rather than steps in a recipe. Another, perhaps more advantageous way would be to tackle the next simplest task or to let the business decide which functionality should come next (please note, I’m not saying that there would be anything wrong with implementing the file I/O at this point — only that the rationale “it’s what happens first sequentially in the running code” is not a good rationale).

Let’s go with the more “agile” approach and assume that the users want a stripped down, minimal set of functionality as quickly as possible. This means that you’re going to create a full skeleton of the application and do your best to avoid throw-away code. The thing of paramount importance is having something to deliver to your users, so you’re going to want to focus mainly on displaying something to the user’s medium of choice: the console. So what to do? Well, imagine that you have a magic box that generates people for you and another one that somehow gets you those people. What would you do with them? How ’bout this:

public class PersonConsoleWriter
{
    public void WritePeople(IEnumerable<Person> people)
    {
        foreach (var person in people)
            Console.Write(person);
    }
}

Where does that enumeration come from? Well, that’s a problem to deal with once we’ve handled the console writing problem, which is our top priority. If we had a demo in 30 seconds, we could simply write a main that instantiated a “PersonConsoleWriter” and fed it some dummy data. Here’s a dead simple application that prints, well, nothing, but it’s functional from an output perspective:

public class PersonProcessor
{
    public static void Main(string[] args)
    {
        var writer = new PersonConsoleWriter();
        writer.WritePeople(Enumerable.Empty<Person>());
    }
}

What to do next? Well, we probably want to give this thing some actual people to shove through the pipeline or our customers won’t be very impressed. We could new up some people inline, but that’s not very impressive or helpful. Instead, let’s assume that we have a magic box that will fetch people out of the ether for us. Where do they come from? Who knows, and who cares — not main’s problem. Take a look:

public class PersonStorageReader
{
    public IList<Person> GetPeople()
    {
        throw new NotImplementedException();
    }
}

Alright — that’s pretty magic box-ish. The only way it could be more so is if we just defined an interface, but that’s overkill for the moment, and I’m following YAGNI. We can add an interface if this thing later needs to pull people out of a web service or something. At this point, if we had to do a demo, we could simply return an empty enumeration instead of throwing an exception, or we could dummy up some people for the demo right here. And the important thing to note is that now the thing that’s supposed to be generating people is the thing that’s generating the people — we just have to sort out the correct internal implementation later. Let’s take a look at main:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople();

    var writer = new PersonConsoleWriter();
    writer.WritePeople(people);
}

Well, that’s starting to look pretty good. We get people from the people reader and feed them to the people console writer. At this point, it becomes pretty trivial to add sorting:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople().OrderBy(p => p.LastName);

    var writer = new PersonConsoleWriter();
    writer.WritePeople(people);
}

But, if we were so inclined, we could easily look at main and say “I want a magic box that I hand a person collection to and that gives me back a sorted person collection,” and that could be stubbed out as follows:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople();

    var sorter = new PersonSorter();
    var sortedPeople = sorter.SortList(people);

    var writer = new PersonConsoleWriter();
    writer.WritePeople(sortedPeople);
}

The same kind of logic could also be applied for the “pre-pend the word ‘President'” requirement. That could pretty trivially go into the console writer or you could abstract it out. So, what about the file logic? I’m not going to bother with it in this post, and do you know why? You’ve created enough magic boxes here — decoupled the program enough — that it practically writes itself. You use an XDocument to pop all Person nodes and read their attributes into First and Last name, skipping any nodes that don’t have both. With Linq to XML, how many lines of code do you think you need? Even without shortcuts, the jump from our stubbed implementation to the real one is small. And that’s the power of this approach — deconstruct problems using magic boxes until they’re all pretty simple.
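
To make that concrete, here is a rough sketch of what the file-reading magic box might look like with LINQ to XML. Since the post never shows the actual XML file, the element name “Person” and the attribute names “FirstName” and “LastName” are guesses for illustration, and the sketch assumes the usual System.Collections.Generic, System.Linq, and System.Xml.Linq namespaces:

public class PersonStorageReader
{
    private readonly XDocument _document;

    // Demand the document in the constructor rather than loading it here,
    // which is also part of what keeps this class testable.
    public PersonStorageReader(XDocument document)
    {
        _document = document;
    }

    public IList<Person> GetPeople()
    {
        // Pull every Person node, skip any missing a name attribute,
        // and project the rest into Person structs.
        return _document.Descendants("Person")
            .Where(node => node.Attribute("FirstName") != null && node.Attribute("LastName") != null)
            .Select(node => new Person
            {
                FirstName = node.Attribute("FirstName").Value,
                LastName = node.Attribute("LastName").Value
            })
            .ToList();
    }
}

Note that main would now need to build the XDocument and hand it in, which lines up with the advice below about demanding dependencies in constructors rather than going out and finding them.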

Also notice another interesting benefit which is that the problems of runtime optimization and exception handling now become easier to sort out. The exception handling and expensive file read operations can be isolated to a single class and console writes, sorting, and other business processing need not be affected. You’re able to isolate difficult problems to have a smaller surface area in the code and to be handled as requirements and situation dictate rather than polluting the entire codebase with them.

Pulling Back to the General Case

I have studiously avoided discussion of how tests or TDD would factor into all of this, but if you’re at all familiar with testable code, it should be quite apparent that this methodology will result in testable components (or endpoints of the system, like file readers and console writers). There is a deep parallel between building magic boxes and writing testable code — so much so that “build magic boxes” is great advice for how to make your code testable. The only leap from simply building boxes to writing testable classes is to remember to demand your dependencies in constructors, methods and setters, rather than instantiating them or going out and finding them. So if PersonStorageReader uses an XDocument to do its thing, pass that XDocument into its constructor.

But this concept of magic boxes runs deeper than maintainable code, TDD, or any of the other specific clean coding/design principles. It’s really about chunking problems into manageable bites. If you’re ever writing code in a method and you find yourself thinking “okay, first I’ll do X, and then–” STOP! Don’t do anything yet! Instead, first ask yourself “if I had a magic box that did X so I didn’t have to worry about it here and now, what would that look like?” You don’t necessarily need to abstract every possible detail out of every possible place, but the exercise of simply considering it is beneficial. I promise. It will help you start viewing code elements as solvers of different problems and collaborators in a broader solution, instead of methodical, plodding steps in a gigantic recipe. It will also help you practice writing tighter, more discoverable and usable APIs, since your first conception of them will be “what would I most like to be a client of right here?”

So do the lazy and idealistic thing – imagine that you have a magic box that will solve all of your problems for you. The leap from “magic box” to “collaborating interface” to “implemented functionality” is much simpler and faster than it seems when you’re isolating your problems. The alternative is to have a system that is one gigantic, procedural “magic box” of spaghetti and the “magic” part is that it works at all.


How Not to Be Blocked

In a recent post, I talked about how demoralizing it can be to sit around with nothing to do while waiting for someone else to finish a task that you need, fix something that you need, assign you something, etc. I think this is fairly universally known as “being blocked”. It seems nice to have an excuse to do nothing, but I think it makes anyone conscientious a little nervous that someone is going to come along and judge them for malingering, which is rather stressful.

I didn’t really go into details there, but there are many ways to be active, rather than reactive, about being blocked (I think most would have said “proactive”, but I think I kind of hate that word for seeming bombastically redundant — but don’t mind me if you use it because I’m weirdly picky and fussy about words). Taking action not to be blocked has a variety of benefits: alleviates boredom, helps your company, boosts your reputation, opens up potential additional opportunities, etc. The way I see it, being blocked is something that you can almost always manage and opt out of. When I worked in retail many years ago, there was an adage of “if there’s time to lean, there’s time to clean.” I would say the equivalent in the world of software development is “if you’re blocked, you aren’t trying hard enough.”

Things to do when you’re blocked:

  1. Start a wiki or SharePoint site. There’s no company or domain out there that can’t use more documentation and information for onboarding and general reference. And these collaboration mechanisms are perfect since they’re designed to be imperfect at first and refined over the course of time.
  2. Get a subscription to Pluralsight and polish your skills. Whether for the sake of the company or the sake of your own career, there’s no engineer that couldn’t use a few new useful tidbits.
  3. Get ahold of a backlog of defects or nettlesome issues for one of the pieces of software your group writes/maintains, create a playpen, and dive in. You’ll learn more about the code and you might even solve one or more issues that have plagued the team for a while.
  4. Identify a pain point for your fellow developers and do something about it. For instance, if merges constantly mess up a file in your code base, write a utility they can use to validate that file. It’s a lot more useful to the company than reading Reddit or Slashdot, and it’ll boost your cred with your fellow developers as well (that is, help you pass “the second test”).
  5. Ask the people around you if they need a hand with anything. There are often people willing to offload a task or two, especially if it’s grunt work or if they’re stressed and you’re keeping yourself busy and earning some pennies in heaven doing this.
  6. Offer to go through an existing code base, adding or creating documentation for it. This has the useful dual purpose of improving documentation and helping you learn the code. When you know something well enough to explain it to others, you know it pretty well.
  7. Abstract, abstract, abstract. If it’s a development task, make an assumption about the info you’re waiting on, and code the rest of the system as if that assumption were true. Then code the system in such a way that changing your assumption is simple. For instance, don’t say “I can’t work until we decide what RDBMS to use” — just write some kind of CRUD interface that your system uses with no implementation and go on your merry way (see the sketch just after this list).
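
To put a little meat on that last item, here is a minimal sketch of the kind of placeholder CRUD interface I mean. The “Widget” type and the member names are hypothetical stand-ins for whatever your actual domain calls for:

// "Widget" is a stand-in for whatever your domain object happens to be.
public class Widget
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The rest of the system codes against this interface; the actual
// persistence choice (RDBMS, files, whatever) can be plugged in later.
public interface IWidgetRepository
{
    void Create(Widget widget);
    Widget GetById(int id);
    IEnumerable<Widget> GetAll();
    void Update(Widget widget);
    void Delete(int id);
}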

I think that’s more than enough ammunition to ensure that you’ll always have some non-loafing task to do at the office. If you can find a situation where none of those things is an option, then my hat’s off to you or, perhaps more appropriately, my sympathy goes out to you because you probably need to find a new job. But maybe you can take steps to avoid being blocked in the first place. This next list is a bit more abstract and much less foolproof, but I’d suggest the following practices to avoid being blocked in general.

  1. Seek out situations where you have multiple assignments at once. This requires managing expectations and good organization and prioritization skills, but the end result is that you’ll have approved, productive work to fall back on even when waiting for answers.
  2. Cultivate a healthy knowledge of the problem domain you’re working on. In my experience, a lot of blocking results from needing someone to tell you what “Taking the EBE out of the PHG with the ERBD” means. The more domain knowledge you have, the more chance you have of deciphering cryptic acronyms and jargon code that prevents you from figuring out what to do next.
  3. Seek out areas in which you’re the main decision maker, however small they may be. I understand that you can’t exactly promote yourself to VP of Engineering, but if you seek out being in charge of something, even if it’s just a small, low-priority tool or something ancillary, you are unlikely to be truly blocked.
  4. Become the resident expert in some technology, product, facet of the business, or tool that matters. Generally, people who are experts (e.g. the database expert or the source control expert) are in high demand and can fill any lulls with meetings and cooperative sessions with those seeking their expertise.

If you have other ways to avoid being blocked, I’d be interested to hear about them in the comments. I think avoiding blockages is critical not only for preserving your reputation, but preserving your sense of purpose and, on a long enough timeline, your engagement and work ethic. Don’t fall into the trap of checking out due to lack of stuff to do. Make sure you have stuff to do. And, if all else fails, move on. Or, to adapt an aphorism I’ve heard from enough places so as to be unsure of the original source, “change your work circumstances or… change your work circumstances.”


ASP Webforms Validation 101

Today I’m going to delve into a topic I don’t know a ton about in the hopes that someone who knows less than me will stumble onto it and find it helpful. As I’ve alluded to here and there, I’ve spent the last couple of months doing ASP Webforms development, which is something I’d never done before. I’ve picked up a handful of tips and tricks along the way. Today I’m going to walk through the basics of validation.

Start out by creating a new ASP.NET Webforms project.

This gives you an incredibly simple Webforms project that we can use for example purposes. If you open up the login form, you’ll find a boilerplate login form laid out in markup. Let’s take a look at the User Name controls:

<li>
    <asp:Label runat="server" AssociatedControlID="UserName">User name</asp:Label>
    <asp:TextBox runat="server" ID="UserName" />
    <asp:RequiredFieldValidator runat="server" ControlToValidate="UserName" CssClass="field-validation-error" ErrorMessage="The user name field is required." />
</li>

What we’ve got here is the most basic form control validator, the required field validator. The “ControlToValidate” attribute tells us that it’s going to validate the “UserName” text box, and the “ErrorMessage” attribute contains what will be displayed if the validation fails (i.e. there is nothing in the field when you submit the form).

Pretty standard stuff. Let’s switch things up a little, though. Let’s add another label after the text box with something goofy like “is logging in.” This is something I bumped into early on as a layout issue and had to poke and google around for. Here’s the new look for the markup:

<li>
    <asp:Label runat="server" AssociatedControlID="UserName">User name</asp:Label>
    <asp:TextBox runat="server" ID="UserName" />
    <asp:RequiredFieldValidator runat="server" ControlToValidate="UserName" CssClass="field-validation-error" ErrorMessage="The user name field is required." />
    <asp:Label ID="PointlessVerb" runat="server">is logging in.</asp:Label>
</li>

And here’s what it looks like: the new label gets bumped down below the text box instead of sitting next to it.

But that’s no good. We want this on the same line as the text box itself, or it looks even goofier. Well, counterintuitive as it seemed to me, the validator defaults to taking up the space that it would occupy if it were always visible. You can alter this behavior, however, by adding the attribute Display=”Dynamic” to the validator tag. Once you do this, the new label will appear on the same line–unless validation actually fails, in which case the validator takes up its space again and bumps the new label to the next line. Okay, okay, I’m no UX guru, but the important thing here is that you can set the space-occupation behavior of your validators.

The next lesson I learned was that I could use validators for comparing values as well as doing typechecks. This tripped me up a bit too because I would have assumed that there was some kind of TypeCheckValidator, but this isn’t the case. Instead, you have to use CompareValidator. Let’s say that we want users to have to log in with a decimal representing a valid currency. (“Why,” you ask? Well, because we’re insane :) .) This is what it would look like:

<li>
    <asp:Label runat="server" AssociatedControlID="UserName">User name</asp:Label>
    <asp:TextBox runat="server" ID="UserName" />
    <asp:RequiredFieldValidator Display="Dynamic" runat="server" ControlToValidate="UserName" CssClass="field-validation-error" ErrorMessage="The user name field is required." />
    <asp:CompareValidator Display="Dynamic" ControlToValidate="UserName" Operator="DataTypeCheck" CssClass="field-validation-error"  Type="Currency" ErrorMessage="You must log in with a currency." runat="server"/>
    <asp:Label ID="PointlessVerb" runat="server">is logging in.</asp:Label>
</li>

The new validator shares some commonality with the existing one, but take special note of the “Operator” and “Type” attributes. Both of these fields are necessary. For anyone who has read my various rants about abstractions, would you care to guess why I found this completely unintuitive? Well, I don’t know about you, but I personally don’t tend to think of “DataTypeCheck” as an “Operator” (perhaps an “operation,” but even that seems like a stretch). I would have expected either a DataType validator or else the Compare validator simply to need the type specified, at which time it would do a type check. But, I digress.

The next sticking point that I encountered was that I had a particular form where I wanted to validate something that wasn’t part of a text box. I thought I was dead in the water and would have to do something sort of kludgy, but CustomValidator was exactly what I needed. Let’s say that I wanted to verify that the weird label I’d created does not, in fact, contain the word “is.” (This isn’t necessarily as silly as it sounds if there’s code that alters this label dynamically based on other inputs.) If I point any validator at this control as the “ControlToValidate”, I’ll get an exception saying that it cannot be validated. But I can omit that property and specify an event handler.

Add the custom validator to your markup and you get this:

<li>
    <asp:Label runat="server" AssociatedControlID="UserName">User name</asp:Label>
    <asp:TextBox runat="server" ID="UserName" />
    <asp:RequiredFieldValidator Display="Dynamic" runat="server" ControlToValidate="UserName" CssClass="field-validation-error" ErrorMessage="The user name field is required." />
    <asp:CompareValidator Display="Dynamic" ControlToValidate="UserName" Operator="DataTypeCheck" CssClass="field-validation-error" Type="Currency" ErrorMessage="You must log in with a currency." runat="server"/>
    <asp:CustomValidator ID="PointlessLabelValidator" Display="Static" CssClass="field-validation-error" OnServerValidate="PointlessLabelValidator_ServerValidate" ErrorMessage="You need the word is!" runat="server" />
    <asp:Label ID="PointlessVerb" runat="server">is logging in.</asp:Label>
</li>

And add the following to your code behind:

// Runs on the server when the page validates; the label passes only if its
// text does not contain the word "is."
public void PointlessLabelValidator_ServerValidate(object sender, ServerValidateEventArgs e)
{
    var labelText = ((Label)LoginControl.FindControl("PointlessVerb")).Text;
    e.IsValid = !labelText.Contains("is");
}

Now launch the web app again, type a number for the user name and something for the password, and observe the new error message. It’s important to note here that you need to satisfy the other validation constraints because the custom validator operates a little differently. The validators we’ve added up until now work their magic by sending validation JavaScript over the wire to the client, so their validation is client-side and immediate. Here, we’re performing server-side validation. This server-side custom validation will be short-circuited by client-side failures, which is why you need to fix those before seeing the new message.
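One related note: because this validation happens on the server, any code-behind that handles the postback should also check Page.IsValid before doing real work. Here’s a minimal sketch (the handler name is invented for illustration; wire it up to whatever button submits the form):

protected void LogInButton_Click(object sender, EventArgs e)
{
    // The validators (including the custom one) have already run by the time
    // this handler executes, but they only set flags -- it's up to us to check
    // the aggregate result before proceeding.
    if (!Page.IsValid)
        return;

    // ... do whatever the form actually does on a successful submission ...
}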

And that’s my brief primer on validators. This is neither exhaustive nor the equivalent of a nice book written by a Webforms guru, but hopefully if you’re here it’s helped you figure out a few basics of Webforms validation.

By

Old Programmer’s Guide to Practical Maths

I’m writing this post to introduce another series of posts I intend to embark upon, to coexist alongside my “abstractions,” “design patterns,” and “home automation” posts. I’m going to post these under the category “Practical Math” and think of the series as being called “Practical Math for Programmers,” the title of this post notwithstanding. (Speaking of which, bonus Renaissance-Man points to the first commenter who knows what inspired the title.)

Does Math Matter?

This is a discussion that I’ve heard a lot over the years as a programmer, and it seems to be a slightly more muted and cordial debate than the typically contentious “computer science degree vs certifications” or “schooling versus self-taught” ones. Generally the accepted answer is the “architect answer” of “it depends.” I’ll see/hear consensuses such as, “well, you need it if you’re going to be doing really low level stuff or processing lots of data or designing compilers or something, but if you’re just writing basic line of business apps or if you’re more of a designer then you don’t need math.” I often like the architect answer because trending toward nuance and avoiding absolutes gives you the most chance of recovering from bad assumptions, but I’m going to go out on a limb here with more of a hard-line stance. I think that programmers do need and use math and that it doesn’t “depend” at all. It’s just that some are more trained in and aware of their usage of it while others do it by intuition, rote, or dependence on others.

One thing that I think clouds the issue is what is meant by “math.” Many think back to derivatives or perhaps back further to trigonometry and algebra and realize that they don’t frequently hit the math libraries for things like cosine or solve for x. Sure, there are some basic arithmetic operations when laying out a page–addition, subtraction, etc., but that’s it. There’s a broad tendency to dismiss the basic arithmetic as too trivial for discussion and declare math unnecessary in these cases.

But I think that this analysis tends to overlook two math concerns: missed opportunities to leverage math and lack of understanding that it’s even being used. For the first one, consider a scenario of laying out a form with simple arithmetic. Say there are three buttons that need to be evenly spaced on a form, and the form should have a fixed width, but you have some control over that. The visual, “non-math” way to do this might be to assign the buttons random horizontal locations and then move them around by trial and error. Time could be saved, however, with a simple algebraic solution of determining a ratio of padding to button, determining a form width, and solving for x–no trial and error and exactly right the first time.
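To make that concrete, here’s a trivial sketch with made-up numbers: three 100-pixel-wide buttons with padding equal to a quarter of a button’s width before, between, and after them. The form width and each button’s position fall straight out of the arithmetic:

// Made-up numbers for illustration: 3 buttons, each 100px wide, with padding
// equal to a quarter of a button's width before, between, and after them.
const int buttonCount = 3;
const int buttonWidth = 100;
const int padding = buttonWidth / 4;  // 25px

// Form width = all of the buttons plus one more padding gap than there are buttons.
int formWidth = buttonCount * buttonWidth + (buttonCount + 1) * padding;  // 400px

// The left edge of the nth button (0-based) -- exactly right the first time.
for (int n = 0; n < buttonCount; n++)
{
    int x = padding + n * (buttonWidth + padding);
    Console.WriteLine("Button {0} starts at x = {1}", n + 1, x);
}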

The second concern is a bit more subtle. Perhaps each of the aforementioned buttons has a handler, and those handlers have some logic. If button 2 is clicked after button 1, and button 3 is red or else button 2 is blue, disable button 1. Is there any math there, or just a series of weird procedures? The answer is “both,” but you’re probably not thinking of or looking for the math. You’re intuitively constructing logic chains in your head and trying to figure out where the nested parentheses are going to go in your if statement. If you’re intuitively constructing those scenarios and you think you’re not doing any math, it’s because you’ve never been exposed to truth tables or Boolean algebra.
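To see the math hiding in that handler logic, here’s the rule written out as a Boolean expression (the flag names are invented for illustration). A truth table over these three inputs tells you exactly when button 1 gets disabled, and Boolean algebra (DeMorgan’s laws, in this case) hands you the negated condition without any guessing at parentheses:

// Invented flags for illustration -- imagine the various handlers set these.
bool buttonTwoClickedAfterButtonOne = true;
bool buttonThreeIsRed = false;
bool buttonTwoIsBlue = true;

// "If button 2 is clicked after button 1, and button 3 is red or else
// button 2 is blue, disable button 1" -- as a single Boolean expression.
bool disableButtonOne =
    buttonTwoClickedAfterButtonOne && (buttonThreeIsRed || buttonTwoIsBlue);

// DeMorgan's laws give the "keep it enabled" condition for free.
bool enableButtonOne =
    !buttonTwoClickedAfterButtonOne || (!buttonThreeIsRed && !buttonTwoIsBlue);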

It Definitely Matters

Boolean logic lies at the core of most things that any programmer does, and it isn’t alone. Arithmetic is obviously indispensable, if reasonably taken for granted, but at the core of arithmetic lies the cousin of Boolean algebra, which is algebra over real numbers. Pulling back from the more “fluid” concerns about real numbers with which we’re used to dealing, there is also a whole world of discrete/finite math that involves distinct values and forms the basis for concepts like sets, graphs, and basic data structures, which are in turn assembled into algorithms and design patterns. From that emerges everything from the exotic, like cryptography, to the mundane, such as preconditions and functions. If you’re a programmer and haven’t had exposure to these things, you’re living in The Matrix, blissfully unaware that you could take a red pill and start to see everything you’re doing as a weird, dripping series of green symbols and digits on a black background.

Because I think that math matters a lot and because it’s fairly heavily in my background, I’m going to embark upon this series of posts. I’d like to cover things such as DeMorgan’s Laws/Boolean Inference, O-notation, formal methods, combinatorics, probability/game theory, etc. But if you find your eyes glazing over just reading that list, don’t worry–it won’t be a bumpy ride. And I’ll be in it with you, since my days of heavy exposure to theoretical math are, along with college and grad school, surprisingly far away in my rearview mirror. I’ll have some rust to shake off myself. These are going to be very elementary explanations of these concepts, along with practical concerns of why you as a programmer should care. I’ll also provide links to more information and formal reading along with perhaps examples of where you’re already using this and maybe not even realizing it. I’m going to keep these posts mainly language agnostic, and I might switch it up among C#, Java, and C++, depending on what I’m covering.

Stay tuned if this kind of thing interests you.

By

The Myth of Quick Copy-And-Paste Programming

Quick and Dirty?

“We’re really busy right now–just copy that customer form and do a few find and replaces. I know it’s not a great approach, but we don’t have time to build the new form from scratch.”

Ever heard that one? Ever said it? Once, twice, dozens, or hundreds of times? This has the conversational ring of, “I know I shouldn’t eat that brownie, but it’s okay. I’m planning to start exercising one of these days.” The subtext of the message is a willingness to settle–to trade self esteem for immediate gratification with the disclaimer “the outcome just isn’t important enough to me to do the hard thing.” If that sounds harsh to you when put in the context of programming instead of diet and exercise, there’s a reason: you’re used to fooling yourself into believing that copy-and-paste programming sacrifices good design to save time. But the reality is that the “time savings” is a voluntary self-delusion and that it’s really trading good design for being able to “program” mindlessly for a few minutes or hours. Considered in this light, it’s actually pretty similar to gorging yourself on chocolate while thinking that you’ll exercise later when you can afford that gym membership and whatnot: “I’ll space out and fantasize about my upcoming vacation now while I’m programming and clean up the mess when I come back, refreshed.”

I know what you’re thinking. You’re probably indignant right now because you’re thinking of a time that you copied a 50,000 line file and you changed only two things in it. Re-typing all 50,000 lines would have taken days, and you got the work done in minutes. The same number of minutes, in fact, that you’d have spent parameterizing those two differences and updating clients to call the new method, thus achieving good design and efficiency. Okay, so bad example.

Now, you’re thinking about the time that you copied that 50,000 line file and there were about 300 differences–no way you could easily have parameterized all of that. Only a copy, paste, and a bunch of find and replace could do the trick there. After that, you were up and running in about an hour. And everything worked. Oh, except for that one place where the text was a little different. You missed that one, but you found it after ten minutes of debugging. Oh, and except for five more of those at ten or fifteen minutes a pop. Oh, and then there was that twenty minutes you spent after the architect pointed out that a bunch of methods needed to be renamed because they made no sense named what they were in the new class. Then you were truly done. Except, oh, crap, runtime binding was failing with that other module since you changed those method names to please the stupid architect. That was a doozy because no one noticed it until QA a week later, and then you spent a whole day debugging it and another day fixing it. Oh, and then there was a crazy deadlock issue writing to the log file that some beta customer found three months later. As it turns out, you completely forgot that if the new and old code file methods executed in just the right interleaving, wackiness might ensue. Ugh, that took a week to reproduce and then another two weeks to figure out. Okay, okay, so maybe that was a bad example of the copy-and-paste time savings.

But you’re still mad at me. Maybe those weren’t the best examples, but all the other times you do it are good examples. You’re getting things done and cranking out code. You’re doing things that get you 80% of the way there and making it so that you only have to do 20% of the work, rather than doing all 100% from scratch. Every time you copy and paste, you save 80% of the manpower (minus, of course, the time spent changing the parts of the 80% that turned out not to be part of the 80% after all). The important point is that as long as you discount all of the things you miss while copying and pasting and all of the defects you introduce while doing it and all of the crushing technical debt you accrue while doing it and all of the downstream time fixing errors in several places, you’re saving a lot of time. I mean, it’s the same as how that brownie you were eating is actually pretty low in calories if you don’t count the flour, sugar, chocolate, butter, nuts, and oil. Come to think of it, you’re practically losing weight by eating it.

We Love What We Have

Hmm…apparently, it’s easy to view an activity as a net positive when you make it a point to ignore any negatives. And it’s also understandable. My flippant tone here is more for effect than it is meant to be a scathing indictment of people for cutting corners. There’s a bit of human psychology known as the Endowment Effect that explains a lot of this tendency. We have a cognitive bias to favor what we already have over what’s new, hypothetical, or in the possession of others.

Humans are loss averse (we feel the pain of a loss of an item more than we experience pleasure at acquiring the item, on average), and this leads to a situation in which we place higher economic value on things that we have than things that we don’t have. You may buy a tchotchke on vacation from a vendor for $10 and wouldn’t have paid a dollar more, yet when someone comes along and offers you $12 or $20 or even $30 for it, you suddenly get possessive and don’t want to part with it. This is the Endowment Effect in action.

What does this have to do with copying and pasting code? Well, if you substitute time/effort as your currency, there will be an innate cognitive bias toward adapting work that you’ve already done as opposed to creating some theoretical new implementation. In other words, you’re going to say, “This thing I already have is a source of more efficiency than whatever else someone might do.” This means that, assuming each would take equal time and that time is the primary motivator, you’re going to favor doing something with what you already have over implementing something new from scratch (in spite of how much we all love greenfield coding). Unfortunately, it’s both easy and common to conflate the Endowment Effect’s cognitive bias toward reuse with the sloppy practice of copy and paste.

At the point of having decided to adapt existing code to the new situation, things can break one of two ways. You can achieve this reuse by abstracting and factoring common logic, or you can achieve it by copy and paste. Once you’ve already decided on reuse versus blazing a new trail, the efficiency-as-currency version of the Endowment Effect has already played out in your mind–you’ve already, for better or for worse, opted to re-appropriate existing work. Now you’re just deciding between doing the favorable-but-hard thing and the sloppy-and-easy thing. This is why I said it’s more like opting to stuff your face with brownies for promises of later exercise than it is to save time at the expense of good design.
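For a deliberately tiny illustration of that fork in the road (the domain and the numbers are invented), the favorable-but-hard option means pulling the near-duplicate logic into one parameterized place instead of pasting it and tweaking the copy:

// The copy-and-paste route: two methods that differ only in a literal,
// and two places to hunt down when the discount rule changes.
public decimal GetStandardDiscount(decimal orderTotal)
{
    return orderTotal > 100m ? orderTotal * 0.05m : 0m;
}

public decimal GetPreferredDiscount(decimal orderTotal)
{
    return orderTotal > 100m ? orderTotal * 0.10m : 0m;
}

// The factored route: the difference becomes a parameter, and the rule
// lives in exactly one place.
public decimal GetDiscount(decimal orderTotal, decimal rate)
{
    return orderTotal > 100m ? orderTotal * rate : 0m;
}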

Think about it. Copy and paste is satisfying the way those empty calories are satisfying. Doing busy-work looks a lot like doing mentally taxing work and is often valued similarly in suboptimally incentivized environments. So, to copy and paste or not to copy and paste is often a matter of, “Do I want to listen to talk radio and space out while getting work done, or do I want to concentrate and bang my head against hard problems?” And if you do a real, honest self-evaluation, I’m pretty sure you’ll come to the same conclusion. Copying and pasting is the reality television of programming–completely devoid of meaningful substance in favor of predictable, mindless repetition.

In the end, you make two decisions when you go the copy-and-paste route. You decide to adapt what you’ve got rather than invent something new, and that’s where the substance of the time/planning/efficiency decision takes place. And once you’ve made the (perfectly fine) decision to use what you’ve got, the next decision is whether to work (factor into a common location) or tune out and malinger (copy and paste). In the end, they’re going to be a wash for time. The up front time saved by not thinking through the design is going to be washed out by the time wasted on the defects that coding without thinking introduces, and you’re going to be left with extra technical debt even after the wash for a net time negative. So, in the context of “quick and dirty,” the decision of whether new or reused is more efficient (“quick”) is in reality separated from and orthogonal to the decision of factor versus copy and paste (“dirty”). Next time someone tells you they’re going with the “quick and dirty” solution of copy and paste, tell them, “You’re not opting for quick and dirty–you’re just quickly opting for dirty.”

By

FluentSqlGenerator

A while back, I made a post about using string.Join() to construct SQL where clauses from collections of individual clauses. In that post, I alluded to playing with a “more sophisticated” where clause builder. I did just that here and there and decided to post the results to github. You can find it here if you want to check it out.

My implementation makes use of the Composite design pattern and the idea of well-formed formula semantics in formal systems such as propositional and first-order logic. The latter probably sounds a little stuffy and exposes my inner math geek, but that’s just a rigorous way of expressing the concept of building a statement from literals and basic operations on those literals. To pull it back yet another level of stuffiness, consider simple arithmetic, in which all of the following are valid expressions: "6", "12 + 6", "12 + (9 - 3)". The first expression is an atomic literal and the second expression a binary operation. The third expression is interesting in that it shows that functions can have arguments that are literals or other expressions (if this still seems strange, think of these examples as "6", "Add(12, 6)" and "Add(12, Subtract(9, 3))").

Think of how this applies to the propositional semantics that make up SQL query where clauses. I can have "Column1 = 12" or I can have "Column1 = 12 AND Column2 = 13" or I can have "Column1 = 12 AND (Column2 = 13 OR Column3 = 4)". When I want to model this concept in an object-oriented sense, I need to represent the operators "AND" and "OR" as objects with two properties: left expression and right expression. I also need it to be possible for either of these properties to be a "literal" of the form "col = val" or another expression, as in the last example. Composite is thus a natural fit when you consider that these clauses really are expression trees, in a very real sense. So there is a "Component" base that’s abstract and then "Clause" and "Operation" objects that inherit from it and are fungible when constructing expressions.
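Just to sketch the shape of that structure (the member details here are my rough rendering for illustration and not necessarily exactly what’s in the repository):

// A rough sketch of the Composite structure described above -- names and
// details are approximate rather than copied from the actual repository.
public abstract class Component
{
    public abstract override string ToString();
}

// Leaf: an atomic "column = value" style clause.
public class Clause : Component
{
    private readonly string _column;
    private readonly string _value;

    public Clause(string column, string value)
    {
        _column = column;
        _value = value;
    }

    public override string ToString()
    {
        return string.Format("{0} = '{1}'", _column, _value);
    }
}

// Composite: an operation whose left and right sides are themselves
// components -- either clauses or further operations.
public class Operation : Component
{
    private readonly Component _left;
    private readonly string _operator;
    private readonly Component _right;

    public Operation(Component left, string op, Component right)
    {
        _left = left;
        _operator = op;
        _right = right;
    }

    public override string ToString()
    {
        return string.Format("({0} {1} {2})", _left, _operator, _right);
    }
}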

This was the core of the implementation, but I also dressed it up a bit with some optional extension methods to support a discoverable, fluent interface (I’m still very leery of this construct, but this seems like an appropriate and judicious use). Another nice feature, in my opinion, is that it supports generic parameters so you don’t have massive overloads: you can set your columns equal to objects, strings, decimals, ints, etc. It makes heavy use of ToString() with these generic parameters, so use any type you please so long as what you want out of it is well represented by ToString().

A sample API is as follows:

var clause = Column.Named("Column1").IsEqualTo(123);
Console.WriteLine(clause);

clause = Column.Named("Column1").IsEqualTo(123).And(Column.Named("Column2").IsEqualTo("123456"));
Console.WriteLine(clause);

clause = Column.Named("Column1").IsOneOf(1, 2, 3).Or(Column.Named("Column2").IsGreaterThan(12));
Console.WriteLine(clause);

clause = Are.AnyOfTheseTrue(Column.Named("Column1").IsEqualTo(832), Column.Named("Column2").IsLessThan(25.30));
Console.WriteLine(clause);

clause = Are.AreAllOfTheseTrue(Column.Named("Column1").IsEqualTo(832), Column.Named("Column2").IsLessThan(25.30), Column.Named("Column3").IsOneOf("Current", "Valid"));
Console.WriteLine(clause);

Console.ReadLine();

Currently supported SQL operations include the various comparisons (equal, not equal, greater than, less than, etc.) as well as "like" and "in()". The expression operators "AND", "OR" and "NOT" are supported. The utility is well covered by unit tests, and there are a handful of integration tests too, so you can poke around without breaking functionality.

Feel free to download, use, fork, enhance, make fun of, etc, whatever. I’m not pretending this is a problem never before solved nor that this is the most elegant solution imaginable, but it was fun to write, code-kata style, and if someone can get some use out of it, great. If I wind up making significant modifications to it or extending it, I’ll post updates here as well as checking changes into github.

By

Casting is a Polymorphism Fail

Have you ever seen code that looked like the snippet here?

public class Menagerie
{
    private List<Animal> _animals = new List<Animal>();

    public void AddAnimal(Animal animal)
    {
        _animals.Add(animal);
    }

    public void MakeNoise()
    {
        foreach (var animal in _animals)
        {
            if (animal is Cat)
                ((Cat)animal).Meow();
            else if (animal is Dog)
                ((Dog)animal).Bark();
        }
    }
}

You probably have seen code like this, and I hope that it makes you sad. I know it makes me sad. It makes me sad because it’s clearly the result of a fundamental failure to understand (or at least implement) polymorphism. Code written like this follows an inheritance structure, but it completely misses the point of that structure, which is the ability to do this instead:

public class Menagerie
{
    private List<Animal> _animals = new List<Animal>();

    public void AddAnimal(Animal animal)
    {
        _animals.Add(animal);
    }

    public void MakeNoise()
    {
        foreach (var animal in _animals)
            animal.MakeNoise();
    }
}

What’s so great about this? Well, consider what happens if I want to add “Bird” or “Bear” to the mix. In the first example with casting, I have to add a class for my new animal, and then I have to crack open the menagerie class and add code to the MakeNoise() method that figures out how to tell my new animal to make noise. In the second example, I simply have to add the class and override the base class’s MakeNoise() method and Menagerie will ‘magically’ work without any source code changes. This is a powerful step toward the open/closed principle and the real spirit of polymorphism — the ability to add functionality to a system with a minimum amount of upheaval.
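For completeness, here’s roughly what the hierarchy behind that second example might look like (the specific noises and helper method names are invented for illustration):

public abstract class Animal
{
    public abstract void MakeNoise();
}

public class Cat : Animal
{
    public void Meow() { Console.WriteLine("Meow"); }
    public override void MakeNoise() { Meow(); }
}

public class Dog : Animal
{
    public void Bark() { Console.WriteLine("Woof"); }
    public override void MakeNoise() { Bark(); }
}

// Adding a new animal is just adding a new class -- Menagerie stays untouched.
public class Bird : Animal
{
    public override void MakeNoise() { Console.WriteLine("Tweet"); }
}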

But what about more subtle instances of casting? Take the iconic:

public void HandleButtonClicked(object sender, EventArgs e)
{
    var button = (Button)sender;
    button.Content = "I was clicked!";
}

Is this a polymorphism failure? It can’t be, can it? I mean, this is the pattern for event subscription/handling laid out by Microsoft in the C# programming guide. Surely those guys know what they’re doing.

As a matter of fact, I firmly believe that they do know what they’re doing, but I also believe that this pattern was conceived of and created many moons ago, before the language had some of the constructs (like generics) and frameworks that it has now and before it followed some of the patterns that it currently does. I can’t claim with any authority that the designers of this pattern would ask for a mulligan knowing what they do now, but I can say that patterns like this, especially ones that become near-universal conventions, tend to build up quite a head of steam. That is to say, if we suddenly started writing event handlers with strongly typed senders, a lot of event-producing code simply wouldn’t work with what we were doing.

So I contend that it is a polymorphism failure and that casting, in general, should be avoided as much as possible. However, I feel odd going against a Microsoft standard in a language designed by Microsoft. Let’s bring in an expert on the matter. Eric Lippert, principal developer on the C# compiler team, had this to say in a Stack Overflow post:

Both kinds of casts are red flags. The first kind of cast raises the question “why exactly is it that the developer knows something that the compiler doesn’t?” If you are in that situation then the better thing to do is usually to change the program so that the compiler does have a handle on reality. Then you don’t need the cast; the analysis is done at compile time.

The “first kind” of cast he’s referring to is one he defines earlier in his post as one where the developer “[knows] the runtime type of this expression but the compiler does not know it.” That is the kind that I’m discussing here, which is why I chose that specific portion of his post. In our case, the developer knows that “sender” is a button but the compiler does not know that. Eric’s point, and one with which I wholeheartedly agree, is “why doesn’t the compiler know it and why don’t we do our best to make that happen?” It just seems like a bad idea to run a reality deficit between yourself and the compiler as you go. I mean, I know that the sender is a button. You know the sender is a button. The method knows the sender is a button (if we take its name, containing “ButtonClicked” at face value). Maintainers know the sender is a button. Why does everyone know sender is a button except for the compiler, who has to be explicitly and awkwardly informed in spite of being the most knowledgeable and important party in this whole situation?
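If you want a concrete alternative, one simple way to close that gap is to subscribe with a lambda that closes over the strongly typed control, so there’s never an untyped sender to cast in the first place (a minimal sketch, assuming a WPF-style Button with a Content property as in the example above):

// A sketch: the lambda captures the strongly typed button, so the compiler
// knows exactly what we're dealing with -- no cast from "object sender" needed.
var loginButton = new Button { Content = "Click me" };
loginButton.Click += (sender, e) => loginButton.Content = "I was clicked!";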

But I roll all of this into a broader point about a polymorphic approach in general. If we think of types as hierarchical (inheritance) or composed (interface implementation), then there’s some exact type that suits my needs. There may be more than one, but there will be a best one. When writing a method and accepting parameters, I should accept as general a type as possible without needing to cast so that I can be of the most service. When returning something, I should be as specific as possible to give clients the most options. But when I talk about “possible” I’m talking about not casting.

If I start casting, I introduce error possibilities, but I also necessarily introduce a situation where I’m treating an object as two different things in the same scope. This isn’t just jarring from a readability perspective — it’s a maintenance problem. Polymorphism allows me to care only about some public interface specification and not implementation details — as long as the thing I get has the public API I need, I don’t really care about any details. But as soon as I have to understand enough about an object to understand that it’s actually a different object masquerading as the one I want, polymorphism is right out the window and I suddenly depend on knowing the intricate relationship details of the class in question. Now I break not only if my direct collaborators change, but also if some inheritance hierarchy or interface hierarchy I’m not even aware of changes.

The reason I’m posting all of this isn’t to suggest that casting should never happen. Clearly sometimes it’s necessary, particularly if it’s forced on you by some API or framework. My hope though is that you’ll look at it with more suspicion — as a “red flag”, in the words of Eric Lippert. Are you casting because it’s forced on you by external factors, or are you casting to communicate with the compiler? Because if it’s the latter, there are other, better ways to achieve the desired effect that will leave your code more elegant, understandable, and maintainable.
