Stories about Software


Using NDepend to Make You a Better Programmer

This is another post that I originally wrote for the NDepend blog. If you haven’t yet, go check out the NDepend blog and sign up for the RSS feed. It’s relatively new, but we’ll have a lot of good content there for you.

If you’re a software developer, particularly of the newly minted variety, the concept of static analysis might not seem approachable.  It sounds academic.  It sounds architect-y.  It sounds complicated.  I’ve seen this reaction from a lot of people in my career and I think that’s too bad.

If you delve into its complex depths, static analysis can be any and all of these things, but with the developers I mentor and coach, I like to introduce it as a game that makes you better at what you do.  You can use static analysis to give yourself feedback about your code that is both fast and anonymous, allowing you to improve via trial and error, rather than by soliciting feedback from people much more tenured than you and sometimes wincing as they lay into you a little.  And, perhaps best of all, you can calibrate the quality of your code with the broader development world, rather than just pleasing the guy who has hung around your company long enough to default his way into the “tech lead” role.

NDepend Rules

Take a look at some of the feedback that NDepend offers about your code.  “That method is too big” isn’t particularly intimidating, is it?  I mean, you might wonder at what you could do to compact a method, but it’s not some kind of esoteric rule written in gibberish.  You run NDepend on your code and you can see that there is some number of methods that the broader development community considers to be “too big.”

From there, you can start looking at ways to write smaller methods and to refactor some of your current ones to sneak in under the warning number.  This is the essence of gamification — you change the way you write code to get rid of the warnings.  You get better.  And it’s gratifying.

As you do this, another interesting thing starts to happen.  You start noticing that other developers continue to write large methods and when you run NDepend on their code, they light up the console with errors, whereas you do not with your code.  And so, you can have conversations with them that start with, “you know, this static analysis tool I’ve been using wants us to have smaller methods, and I’ve been working a lot on that, if you ever want a hand.”

You gain a reputation as being knowledgeable.  Before you know it, you can cite widely accepted static analysis rules and the design goals they imply.  You know these rules, and, via gamification, you have experience molding code to comply with them.  Even in cases where you might wind up overruled by the local team lead or architect, it’s no longer a simple matter of that person saying, “because I said so,” and just ending the conversation.  They have to engage with you and present cogent counter-arguments to your points.  You’re participating in important discussions in ways that you never have before.

If it sounds like I’m speaking from experience, I am.  Throughout my career, I’ve been relentless about figuring out ways to improve my craft, always trying to be a better programmer.  Early on, I was unsatisfied with a lot of arguments among developers around me that I knew boiled down to nothing more than personal preference, so I went out in search of empirical methods and broader knowledge, and that search brought me to static analysis.  I read about data and science behind particular choices in approaching software, and I schooled myself to adopt the approaches that had brought the best results.

Somewhere along that journey, I discovered NDepend and its effect on my approach to writing code was profound.  My methods shrank and became less complicated.  My architectural and design skills improved as I made it a point to avoid dependency cycles and needless coupling.  I boosted unit test coverage and learned well established language practices.  It was not long before people routinely asked me for design advice and code reviews.  And from there, it wasn’t long before I occupied actual lead and architect roles.

So, if you want to improve your craft and nudge your career along, don’t pass on static analysis, and don’t pass on NDepend.  NDepend is not just a tool for architects; it’s a tool for creating architects from the ranks of developers.  You’ll up your game, improve your craft, and even have some fun doing it.


Your Code Is Data

This is a post that I originally wrote for the NDepend blog. If you haven’t already, go check it out! We’re building out some good content over there around static analysis, with lots more to follow.

A lot of programmers have some idea of what static analysis is, at least superficially.  If I mention the term, what pops into your head?  Automatic enforcement of coding standards?  StyleCop or FxCop?  Cyclomatic complexity and Visual Studio’s “maintainability index?”  Maybe you’re deeply familiar with all of the subtleties and nuances of the technique.

Whatever your level of familiarity, I’d like to throw what might be a bit of a curve ball at you.  Static analysis is the idea of analyzing source code and byte code for various properties and reporting on those properties, but it’s also, philosophically, the idea of treating code as data.  This is deeply weird to us as application developers, since we’re very much used to thinking of source code as instructions, procedures, and algorithms.  But it’s also deeply powerful.


When you think of source code this way, typical static analysis use cases make sense.  FxCop asks questions along the lines of “How many private fields are not prefixed with underscores?” or, perhaps, “SELECT COUNT(class_field) FROM classes WHERE class_field NOT LIKE '\_%' ESCAPE '\'”.  More design-focused source code analysis tools ask questions like “What is the cyclomatic complexity of my methods?” or, perhaps, “SELECT cyclomatic_complexity FROM Methods.”
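To make the "code as data" idea concrete, here is a minimal sketch that runs the underscore question as a LINQ query over reflection metadata. It is not how FxCop or NDepend work internally; the `Customer` type and the `FieldQueries` helper are hypothetical names invented for the illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical sample type to query; not from the original post.
public class Customer
{
    private int _id;       // follows the underscore convention
    private string name;   // violates it

    public Customer(int id, string name)
    {
        _id = id;
        this.name = name;
    }

    public override string ToString()
    {
        return _id + ":" + name;
    }
}

public static class FieldQueries
{
    // Roughly: SELECT field FROM fields WHERE is_private AND name NOT LIKE '\_%'
    public static string[] PrivateFieldsWithoutUnderscore(params Type[] types)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        return types
            .SelectMany(t => t.GetFields(flags))
            .Where(f => f.IsPrivate && !f.Name.StartsWith("_"))
            .Select(f => f.DeclaringType.Name + "." + f.Name)
            .ToArray();
    }
}
```

Calling `FieldQueries.PrivateFieldsWithoutUnderscore(typeof(Customer))` yields the single entry `Customer.name`. The point is only that fields, methods, and types can be filtered and projected like rows in a table.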

But if code is data, and static analysis tools are sets of queries against that data, doesn’t it seem strange that we can’t put together and execute ad-hoc queries the way that you would with a relational (or other) database?  I mean, imagine if you built out some persistence store using SQL Server, and the only queries you were allowed were SELECT * from the various tables and a handful of others.  Anything beyond that, and you would have to inspect the data manually and make notes by hand.  That would seem arbitrarily and even criminally restrictive.  So why doesn’t it seem that way with our source code?  Why are we content not having the ability to execute arbitrary queries?

I say “we” but the reality is that I can’t include myself in that question, since I have that ability and I would consider having it taken away from me to be crippling.  My background is that of a software architect, but beyond that, I’m also a software craftsmanship coach, teacher, and frequent analyzer of codebases in a professional capacity, auditing a wide variety of them for various properties, characteristics, and trends.  If I couldn’t perform ad-hoc, situation-dependent queries against the source code, I would be far less effective in these roles.

My tools of choice for doing this are NDepend and its cousin JArchitect (for Java code bases).  Out of the box, they’re standard static analysis and architecture tools, but they also offer this incredibly powerful concept called CQLinq that is, for all intents and purposes, SQL for the ‘schema’ of source code.  In reality, CQLinq is actually a Linq provider for writing declarative code queries, but anyone who knows SQL (or functional programming or lambda expressions) will feel quite at home creating queries.

Let’s say, for instance, that you’re the architect for a C# code base and you notice a disturbing trend wherein the developers have taken to communicating between classes using global variables.  What course of action would you take to nip this in the bud?  I bet it would be something annoying for both you and them.  Perhaps you’d set a policy for a while where you audited literally every commit and read through to make sure they weren’t doing it.  Maybe you’d be too pressed for time and you’d appoint designated globals cops.  Or, perhaps you’d just send out a lot of angry, threatening emails?

Do you know what I would do?  I’d just write a single CQLinq query and add it to a step in my automated team build that executed static analysis code rules against all commits.  If the count of global variable invocations in the code base was greater after the commit than before it, the build would fail.  No need for anger, emails or time wasted checking over people’s shoulders, metaphorically or literally.

Want to see how easy a query like this would be to write?  Why don’t I show you…
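The CQLinq query itself is not reproduced in this copy of the post. As a rough stand-in, and explicitly not NDepend's API, the same kind of check can be sketched with plain reflection and LINQ. It assumes "global variable" means a public static field that is neither `readonly` nor `const`. `AppGlobals`, `GlobalStateQueries`, and `ShouldFailBuild` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical sample of the pattern being policed.
public static class AppGlobals
{
    public static int CurrentUserId;                 // mutable global: counted
    public static readonly string AppName = "demo";  // readonly: ignored
    public const int MaxRetries = 3;                 // const: ignored
}

public static class GlobalStateQueries
{
    // Counts public static fields that can be reassigned at runtime.
    public static int CountMutableStaticFields(params Type[] types)
    {
        const BindingFlags flags = BindingFlags.Static | BindingFlags.Public |
                                   BindingFlags.DeclaredOnly;

        return types
            .SelectMany(t => t.GetFields(flags))
            .Count(f => !f.IsInitOnly && !f.IsLiteral);
    }

    // The build gate described in the text: fail only if the count went up.
    public static bool ShouldFailBuild(int countBefore, int countAfter)
    {
        return countAfter > countBefore;
    }
}
```

Here `CountMutableStaticFields(typeof(AppGlobals))` counts only `CurrentUserId`. Comparing the count before and after a commit, rather than demanding zero, lets a team stop the trend without first cleaning up every existing global.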

That’s it. I write that query, set the build to run NDepend’s static analysis, and fail if there are warnings. No more sending out emails, pleading, nagging, threatening, wheedling, coaxing, or bottleneck code reviewing. And, most important of all, no more doing all of that and having problems anyway. One simple little piece of code, and you can totally automate preventing badness. And best of all, the developers get quick feedback and learn on their own.

As I’ve said, code is data at its core.  This is especially true if you’re an architect, responsible for the long term health of the code base.  You need to be able to assess characteristics and properties of that code, make decisions about it, and set precedent.  To accomplish this, you need powerful tooling for querying your code, and NDepend, with its CQLinq, provides exactly that.


Let’s Make Better Code Metrics

A few months back, I wrote a post about changes to my site and work.  Today, I have another announcement in that same vein:  I’ve recently partnered with NDepend to help start and create content for their blog.  If you go there now, you can see the maiden post which announces the release of the newest version of NDepend (not written by me, personally, if you were wondering, though some of mine will follow).

What is NDepend?

In the broadest terms, NDepend is a static analysis tool.  More specifically and colloquially, you might think of NDepend as Jiminy Cricket, if Pinocchio were a software developer or architect.  It’s extremely helpful for visualizing the dependencies and properties of your code base (e.g. complexity, coupling, etc), which will give you a leg up on your fellow developers right out of the gate.  It’s also incredibly informative, furnishing you not only with detailed, quantitative metrics about your code, but also indicating where you’re deviating from what is broadly considered to be good programming technique.  And, of course, you can do a great deal of customization, from integrating this feedback into your build to tracking code quality over time to building and defining your own complex, custom rules.

To put it more succinctly, in a world where developers are trying to distinguish themselves in terms of knowledge and chops, NDepend will give you such a huge advantage that it’s probably unfair to everyone that doesn’t have it.  I personally learned a ton about software architecture just from installing, using, and exploring this tool over the course of 5 years or so.  If you want to learn more about NDepend and static analysis in general, check out my Pluralsight course about it that I published in conjunction with the last major version. (If you don’t have a Pluralsight subscription but want to check it out, sign up for my mailing list using the form at the right).



Fun with ILDASM: Extension vs Static Methods

I recently had a comment on a very old blog post that led to a conversation between the commenter and me. During the conversation, I made the comment that extension methods in C# are just syntactic sugar over plain ol’ static methods. He asked me to do a post exploring this in further detail, which is what I’m doing today. However, there isn’t a ton more to explain in this regard, any more than there is in explaining that (2 + 2) and 4 are representations of the same concept. So, I thought I’d “show, don’t tell” and, in doing so, introduce you to the useful concept of using a disassembler to allow you to examine .NET’s IL byte code.

I won’t go into a ton of detail about virtual machines and just in time compilation or anything, but I will describe just enough for the purpose of this post. When you write C# code and perform a build, the compiler turns this into what’s called “IL” (intermediate language). Java works in the same way, and its intermediate product is generally referred to as byte code. When an IL executable is executed, the .NET framework is running, and this intermediate code is compiled, on the fly, into machine code. So what you have with both .NET and Java is a two-stage compilation process: source code to intermediate code, and intermediate code to machine code.

Where things get interesting for our purposes is that the disassembled intermediate code is legible and you can work with it, reading it or even writing it directly (which is how AOP IL-Weaving tools like Postsharp work their magic). What gets even more interesting is that all of the .NET languages (C#, VB, F#, etc) compile to the same IL code and are treated the same by the framework when compiled into machine code. This is what is meant by languages “targeting the .NET framework” — there is a compiler that resolves these languages into .NET IL. If you’ve ever wondered why it’s possible to have things like “Iron Python” that can exist in the .NET ecosystem, this is why. Someone has written code that will parse Python source code and generate .NET IL (you’ll also see this idea referred to as “Common Language Infrastructure” or CLI).

Anyway, what better way to look at the differences, or lack thereof, between static methods and extension methods? Let’s write them and see what the IL looks like! But, in order to do that, we need to do a little prep work first. We’re going to need easy access to a tool that can read .NET exe and dll files and produce the assembly in a readable, text-file form. So, here’s what we’re going to do.

  1. In Visual Studio, go to Tools->External Tools.
  2. Click “Add” and you will be prompted to fill out the text boxes below.
  3. Fill them out (you may have to search for ILDASM.exe, but it should be in Microsoft SDKs under Program Files (x86)), entering “$(TargetDir)” as the initial directory.
  4. Click “Apply.”  ILDASM will now appear as a menu option in the Tools menu.

Now, let’s get to work.  I’m going to create a new project that’s as simple as can be. It’s a class library with one class called “Adder.” Here’s the code:
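The original listing is not reproduced here. A minimal sketch consistent with the description might look like the following. The class name `Adder` and the parameter name `first` come from the post; the method name `Add` and the second parameter are assumptions.

```csharp
// A class library containing one static class with one static method.
public static class Adder
{
    // Method name and second parameter are assumed for illustration.
    public static int Add(int first, int second)
    {
        return first + second;
    }
}
```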

Let no one accuse me of code bloat! That’s it. That’s the only class/method in the solution. So, let’s run ILDASM on it and see what happens. To do that, select “ILDASM” from the Tools menu, and it will launch a window with nothing in it. Go to “File->Open” (or Ctrl-O) and it will launch you in your project’s output directory. (This is why I had you add “$(TargetDir)” in the external tools window.) Click the DLL, and you’ll be treated to a hierarchical makeup of your assembly, as shown here:


So, let’s see what the method looks like in IL Code (just double click it):

Alright… thinking back to my days using assembly language, this looks vaguely familiar. Load the arguments into registers or something, add them, you get the gist.

So, let’s see what happens when we change the source code to use an extension method. Here’s the new code:
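Again as a sketch rather than the original listing (the method name `Add` is an assumption), the extension version and both ways of calling it might look like this. `CallingStyles` is a hypothetical helper added only to show that both call forms compile and agree.

```csharp
public static class Adder
{
    // "this" on the first parameter makes Add an extension method on int.
    public static int Add(this int first, int second)
    {
        return first + second;
    }
}

public static class CallingStyles
{
    public static bool BothStylesAgree()
    {
        int two = 2;
        int viaExtension = two.Add(2);      // extension-method syntax
        int viaStatic = Adder.Add(two, 2);  // ordinary static call, still legal
        return viaExtension == viaStatic;
    }
}
```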

Note, the only difference is the addition of “this” before “int first.” That is what turns this into an extension method and alters the calling semantics (though you can still call extension methods the same way you would normal static methods).

So, let’s see what the IL code looks like for that:

The only difference between this and the plain static version is the presence of a line that applies System.Runtime.CompilerServices.ExtensionAttribute to the method.

The “this” keyword results in the generation of this attribute, and its purpose is to allow the compiler to flag it as an extension method. (For more on this, see this old post from Scott Hanselman: How do Extension Methods work and why was a new CLR not required?). The actual substance of the method is completely identical.
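The same attribute can be observed from C# without opening ILDASM. This sketch uses reflection to ask whether a method carries `ExtensionAttribute`; `PlainAdd` and `ExtensionInspector` are hypothetical names added for contrast.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class Adder
{
    // Extension method: the compiler emits ExtensionAttribute for it.
    public static int Add(this int first, int second)
    {
        return first + second;
    }

    // Plain static method with the same body, for comparison.
    public static int PlainAdd(int first, int second)
    {
        return first + second;
    }
}

public static class ExtensionInspector
{
    public static bool IsExtensionMethod(MethodInfo method)
    {
        return method.IsDefined(typeof(ExtensionAttribute), false);
    }
}
```

`IsExtensionMethod(typeof(Adder).GetMethod("Add"))` is true and the same call for `"PlainAdd"` is false, even though the two method bodies are identical.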

So, there you have it. As far as the compiler is concerned, the difference between static and extension methods is “extension methods are static methods with an extension method attribute.” Now, I could go into my opinion on which should be used, when, and how, but it would be just that: my opinion. And your mileage and opinion may vary. The fact of the matter is that, from the compiler’s perspective, they’re the same, so when and how you use one versus the other is really just a matter of your team’s preferences, comfort levels, and ideas about readability.


How To Put Your Favorite Source Code Goodies on Nuget

A while back, I made a post encouraging people to get fed up every now and then and figure out a better way of doing something. Well, tonight I take my own advice. I am sick and tired of rifling through old projects to find code that I copy and paste into literally every non-trivial .NET solution that I create. There’s a thing for this, and it’s called Nuget. I use it all the time to consume other people’s code, libraries and utilities, but not my own. Nope, for my own, I copy and paste stuff from other projects. Not anymore. This ends now.

My mission tonight is to take a simple bit of code that I add to all my unit test projects and to make it publicly available via Nuget. Below is the code. Pretty straightforward and unremarkable. For about 5 versions of MSTest, I’ve hated the “ExpectedException” attribute for testing that something throws an exception. It’s imprecise. All it tests is that somewhere, anywhere, in the course of execution, an exception of the type in question is thrown. Could be on the first line of the method, could be on the last, could happen in the middle from something nested 8 calls deep in the call stack. Who knows? Well, I want to know and be precise, so here’s what I do instead:
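The original ExtendedAssert source is not included in this copy. The sketch below shows the general technique of asserting that one specific call throws. The `Throws<T>` signature is an assumption and may not match the published package. To stay self-contained it signals failure with `InvalidOperationException`; in an MSTest project, `Assert.Fail` would be the natural choice.

```csharp
using System;

public static class ExtendedAssert
{
    // Runs exactly one action and insists that it throws T.
    // Hypothetical signature; the real package may differ.
    public static T Throws<T>(Action action) where T : Exception
    {
        try
        {
            action();
        }
        catch (T expected)
        {
            // The precise call under test threw the expected type.
            return expected;
        }

        // Reached only when the action completed without throwing.
        throw new InvalidOperationException(
            "Expected " + typeof(T).Name + " but no exception was thrown.");
    }
}
```

A test would wrap only the line it cares about, for example `ExtendedAssert.Throws<ArgumentNullException>(() => target.DoSomething(null))`, so an exception raised during setup can no longer make the test pass by accident. An exception of a different type simply propagates and fails the test.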

Now, let’s put this on Nuget somehow. I found my way to this link, with instructions. Having no idea what I’m doing (though I did play with this once, maybe a year and a half ago), I’m going with the GUI option even though there’s also a command line option. So, I downloaded the installer and installed the Nuget package explorer.

From there, I followed the link’s instructions, more or less. I edited the package meta data to include version info, ID, author info, and a description. Then, I started to play around with the “Framework Assemblies” section, but abandoned that after a moment. Instead, I went up to Content->Add->Existing file and added ExtendedAssert. Once I saw the source code pop up, I was pretty content (sorry about the little Grindstone timer in the screenshot — didn’t notice ’til it was too late):


Next up, I ran Tools->Analyze Package. No issues found. Not too shabby for someone with no idea what he’s doing! Now, to go for the gusto — let’s publish this sucker. File->Publish and, drumroll please…. ruh roh. I need something called a “Publish Key” to publish it to nuget.org.


But, as it turns out, getting an API key is simple. Just sign up at nuget.org and you get one. I used my Microsoft account to sign up. I uploaded my DaedTech logo for the profile picture and tweaked a few settings and got my very own API key (found by clicking on my account name under the “search packages” text box at the top). There was even a little clipboard logo next to it for handy copying, and I copied it into the window shown above, and, voilà! After about 20 seconds, the publish was successful. I’d show you a screenshot, but I’m not sure if I’m supposed to keep the API key a secret. Better safe than sorry. Actually, belay that last thought: you are supposed to keep it a secret. If you click on “More Info” under your API key, it says, and I quote:

Your API key provides you with a token that identifies you to the gallery. Keep this a secret. You can always regenerate your key at any time (invalidating previous keys) if your token is accidentally revealed.

Emphasis mine — turns out my instinct was right. And, sorry for the freewheeling nature of this post, but I’m literally figuring this stuff out as I type, and I thought it might make for an interesting read to see how someone else pokes around at this kind of experimenting.

Okay, now to see if I can actually get that thing. I’m going to create a brand new test project in Visual Studio and see if I can install my beloved ExtendedAssert through Nuget, now.


Holy crap, awesome! I’m famous! (Actually, that was so easy that I kind of feel guilty — I thought it’d be some kind of battle, like publishing a phone app or something). But, the moment of truth was a little less exciting. I installed the package, and it really didn’t do anything. My source code file didn’t appear. Hmmm…

After a bit of googling, I found this stack overflow question. Let’s give that a try, optimistically upvoting the question and accepted answer before I forget. I right clicked in the “package contents” window, added a content folder, and then dragged ExtendedAssert into that folder. In order to re-publish, I had to rev the version number, so I revved the patch decimal, since this is a hot patch to cover an embarrassing release if I’ve ever seen one. No time for testing on my machine or a staging environment — let’s slam this baby right into production!

Woohoo! It worked and compiled! Check it out:


But, there’s still one sort of embarrassing problem — V1.0.1 has the namespace from whichever project I picked rather than the default namespace for the assembly. That’s kind of awkward. Let’s go back to google and see about tidying that up. First hit was promising. I’m going to try replacing the namespace with a “source code transformation” as shown here:


Then, according to the link, I also need to change the filename to ExtendedAssert.cs.pp (this took me another publish to figure out that I won’t bore you with). Let’s rev again and go into production. Jackpot! Don’t believe me? Go grab it yourself.
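As a sketch of what the transformed content file might look like (the class body is elided here, not the author's actual file), NuGet's source code transformation replaces the `$rootnamespace$` token when a `.pp` content file is installed into a project:

```csharp
// ExtendedAssert.cs.pp: at install time NuGet replaces $rootnamespace$
// with the root namespace of the project receiving the file and drops
// the .pp extension, leaving ExtendedAssert.cs in the project.
namespace $rootnamespace$
{
    public static class ExtendedAssert
    {
        // Same members as the original ExtendedAssert.cs go here.
    }
}
```

This file is a template rather than compilable C#; it only becomes valid source after the token has been substituted.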

The Lessons Here

A few things I’ll note at this point. First off, I recall that it’s possible to save these packages locally and for me to try them before I push to Nuget. I should definitely have done that, so there’s a meta-lesson here in that I fell into the classic newbie trap of thinking “oh, this is simple and it’ll just work, so I’ll push it to the server.” I’m three patches in and it’s finally working. Glad I don’t have tens of thousands of users for this thing.

But the biggest thing to take away from this is that Nuget is really easy. I had no idea what I was doing and within an hour I had a package up. For the last 5 years or so, every time I started a new project, I’d shuffle around on the machine to find another ExtendedAssert.cs that I could copy into the new project. If it was a new machine, I’d email it to myself. A new job? Have a coworker at the old one email it to me. Sheesh, barbaric. And I put up with it for years, but not anymore. Given how simple this is, I’m going to start making little Nuget packages for all my miscellaneous source code goodies that I transport with me from project to project. I encourage you to do the same.


Creating a Word Document from Code with Spire

I’d like to tell you a harrowing, cautionary tale of my experience with the MS Office Interop libraries and then turn it into a story of redemption. Just to set the stage, these interop libraries are basically a way of programmatically creating and modifying MS Office files such as Word documents and Excel spreadsheets. The intended usage of these libraries is in a desktop environment from a given user account in the user space. The reason for this is that what they actually do is launch MS Word and start piping commands to it, telling it what to do to the current document. This legacy approach works reasonably well, albeit pretty awkwardly from a user account. But what happens when you want to go from a legacy Winforms app to a legacy Webforms app and do this on a web server?

Microsoft has the following to say:

Microsoft does not currently recommend, and does not support, Automation of Microsoft Office applications from any unattended, non-interactive client application or component (including ASP, ASP.NET, DCOM, and NT Services), because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.

Microsoft says, “yikes, don’t do that, and if you do, caveat emptor.” And, that makes sense. It’s not a great idea to allow service processes to communicate directly with Office documents anyway because of the ability to embed executable code in them.

Sometime back, I inherited an ecosystem of legacy Winforms and Webforms applications and one common thread was the use of these Interop libraries in both places. Presumably, the original author wasn’t aware of Microsoft’s stance on this topic and had gone ahead with using Interop on the web server, getting it working for the moment. I didn’t touch this legacy code since it wasn’t causing any issues, but one day a server update came down the pipeline and *poof* no more functioning Interop. This functionality was fairly important to people, so my team was left to do some scrambling to re-implement the functionality using PDF instead of MS Word. It was all good after a few weeks, but it was a stressful few weeks and I developed battle scars around not only doing things with those Interop libraries and their clunky API (see below) but with automating anything with Office at all. Use SSRS or generate a PDF or something. Anything but Word!


But recently I was contacted by E-iceblue, who makes document management and conversion software in the .NET space. They asked if I’d take a look at their offering and write up my thoughts on it. I agreed, as I do agree to requests like this from time to time, but always with the caveat that I’ll write about my experience in earnest and not serve as a platform for a print-based commercial. Given my Interop horror story, the first thing I asked was whether the libraries could work on a server or not, and I was told that they could (presumably, although I haven’t verified, this is because they use the Open XML format rather than the legacy Interop paradigm). So, that was already a win.

I put this request in my back pocket for a bit because I’m already pretty back-logged with post requests and other assorted work, but I wound up having a great chance to try it out. I have a project up on Github that I’ve been pushing code to in order to help with my Pluralsight course authorship. Gist of it is that I create a directory and file structure for my courses as I work on them, and then another for submission, and I want to automate the busy-work. And one thing that I do for every module of every course is create a power point document and a word document for scripting. So, serendipity — I had a reason to generate word documents and thus to try out E-iceblue’s product, Spire.

Rather than a long post with screenshots and all of that, I did a video capture. I want to stress that what you’re seeing here is me, having no prior knowledge of the product at all, and armed only with a link to tutorials that my contact there sent to me. Take a look at the video and see what it’s like. I spend about 4-5 minutes getting set up and, at the end of it, I’m using a nice, clean API to successfully generate a Word document.

I’ll probably have some more posts in the hopper with this as I start doing more things with it (Power Point, more complex Word interaction, conversion, etc). Early returns on this suggest it’s worth checking out, and, as you can see, the barriers to entry are quite low, and I’ve barely scratched the surface of just one line of offerings.


Introduction to Static Analysis (A Teaser for NDepend)

Rather than the traditional lecture approach of providing an official definition and then discussing the subject in more detail, I’m going to show you what static analysis is and then define it. Take a look at the following code and think for a second about what you see. What’s going to happen when we run this code?
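The listing is missing from this copy. Based on the description that follows (set x to 1, then throw if x is 1), a sketch of such a program might be the one below. The class name, method name, and exception message are assumptions.

```csharp
using System;

public static class Teaser
{
    // Call this from a console application's Main to see the result.
    public static void Run()
    {
        int x = 1;

        if (x == 1)
        {
            // Always reached, because x was just set to 1.
            throw new Exception("x is 1");
        }
    }
}
```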

Well, let’s take a look:


I bet you saw this coming. In a program that does nothing but set x to 1, and then throw an exception if x is 1, it isn’t hard to figure out that the result of running it will be an unhandled exception. What you just did there was static analysis.

Static analysis comes in many shapes and sizes. When you simply inspect your code and reason about what it will do, you are performing static analysis. When you submit your code to a peer to have her review, she does the same thing. Like you and your peer, compilers perform static analysis, though automated analysis instead of manual. They check the code for syntax errors or linking errors that would guarantee failures, and they will also provide warnings about potential problems such as unreachable code or assignment instead of evaluation. Products also exist that will check your source code for certain characteristics and stylistic guideline conformance rather than worrying about what happens at runtime and, in managed languages, products exist that will analyze your compiled IL or byte code and check for certain characteristics. The common thread here is that all of these examples of static analysis involve analyzing your code without actually executing it.

Analysis vs Reactionary Inspection

People’s interactions with their code tend to gravitate away from analysis. Whether it’s unit tests and TDD, integration tests, or simply running the application to see what happens, programmers tend to run experiments with their code and then to see what happens. This is known as a feedback loop, and programmers use the feedback to guide what they’re going to do next. While obviously some thought is given to what impact changes to the code will have, the natural tendency is to adopt an “I’ll believe it when I see it” mentality.

We tend to ask “what happened?” and we tend to orient our code in such ways as to give ourselves answers to that question. In this code sample, if we want to know what happened, we execute the program and see what prints. This is the opposite of static analysis in that nobody is trying to reason about what will happen ahead of time, but rather the goal is to do it, see what the outcome is, and then react as needed to continue.

Reactionary inspection comes in a variety of forms, such as debugging, examining log files, observing the behavior of a GUI, etc.

Static vs Dynamic Analysis

The conclusions and decisions that arise from the reactionary inspection question of “what happened” are known as dynamic analysis. Dynamic analysis is, more formally, inspection of the behavior of a running system. This means that it is an analysis of characteristics of the program that include things like how much memory it consumes, how reliably it runs, how much data it pulls from the database, and generally whether or not it correctly satisfies the requirements.

Assuming that static analysis of a system is taking place at all, dynamic analysis takes over where static analysis is not sufficient. This includes situations where unpredictable externalities such as user inputs or hardware interrupts are involved. It also involves situations where static analysis is simply not computationally feasible, such as in any system of real complexity.

As a result, the interplay between static analysis and dynamic analysis tends to be that static analysis is a first line of defense designed to catch obvious problems early. Besides that, it also functions as a canary in the mine to detect so-called “code smells.” A code smell is a piece of code that is often, but not necessarily, indicative of a problem. Static analysis can thus be used as an early detection system for obvious or likely problems, and dynamic analysis has to be sufficient for the rest.


Source Code Parsing vs. Compile-Time Analysis

As I alluded to in the “static analysis in broad terms” section, not all static analysis is created equal. There are types of static analysis that rely on simple inspection of the source code. These include the manual source code analysis techniques such as reasoning about your own code or doing code review activities. They also include tools such as StyleCop that simply parse the source code and make simple assertions about it to provide feedback. For instance, it might read a code file containing the word “class” and see that the next word after it is not capitalized and return a warning that class names should be capitalized.
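To make the source-parsing idea concrete, here is a minimal sketch of that class-name check. It is written in Python purely for illustration and is not how StyleCop is implemented; the function name, the regular expression, and the sample C#-style text are all assumptions of mine.

```python
import re

# Matches the keyword "class" followed by an identifier, e.g. "class customerRecord".
CLASS_DECLARATION = re.compile(r"\bclass\s+([A-Za-z_]\w*)")


def find_uncapitalized_classes(source: str) -> list[tuple[int, str]]:
    """Return (line number, class name) pairs for names that do not start uppercase."""
    warnings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in CLASS_DECLARATION.finditer(line):
            name = match.group(1)
            if not name[0].isupper():
                warnings.append((line_number, name))
    return warnings


# Hypothetical C#-style input; the checker only ever looks at the raw text.
SAMPLE = """\
public class OrderProcessor { }
internal class customerRecord { }
"""

for line_number, name in find_uncapitalized_classes(SAMPLE):
    print(f"line {line_number}: class name '{name}' should be capitalized")
```

Because this only looks at text, it would also flag the word “class” inside a comment or a string literal. That is the kind of limitation that separates simple source parsing from analysis that understands the language.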

This stands in contrast to what I’ll call compile-time analysis. The difference is that this form of analysis requires an encyclopedic understanding of how the compiler behaves or else the ability to analyze the compiled product. This set of options obviously includes the compiler, which will fail on showstopper problems and generate helpful warning information as well. It also includes enhanced rules engines that understand the rules of the compiler and can use this to infer a larger set of warnings and potential problems than those that come out of the box with the compiler. Beyond that is a set of IDE plugins that perform asynchronous compilation and offer real-time feedback about possible problems. Examples of this in the .NET world include ReSharper and CodeRush. And finally, there are analysis tools that look at the compiled assembly outputs and give feedback based on them. NDepend is an example of this, though it includes other approaches mentioned here as well.

The important compare-contrast point to understand here is that source analysis is easier to understand conceptually and generally faster while compile-time analysis is more resource intensive and generally more thorough.

The Types of Static Analysis

So far I’ve compared static analysis to dynamic and ex post facto analysis and I’ve compared mechanisms for how static analysis is conducted. Let’s now take a look at some different kinds of static analysis from the perspective of their goals. This list is not necessarily exhaustive, but rather a general categorization of the different types of static analysis with which I’ve worked.

  • Style checking is examining source code to see if it conforms to cosmetic code standards
  • Best Practices checking is examining the code to see if it conforms to commonly accepted coding practices. This might include things like not using goto statements or not having empty catch blocks
  • Contract programming is the enforcement of preconditions, invariants and postconditions
  • Issue/Bug alert is static analysis designed to detect likely mistakes or error conditions
  • Verification is an attempt to prove that the program is behaving according to specifications
  • Fact finding is analysis that lets you retrieve statistical information about your application’s code and architecture
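As a rough illustration of the best practices category, the sketch below flags empty catch blocks without ever running the code under inspection. It uses Python’s standard `ast` module on Python source as a stand-in for what a .NET tool would do with C#; the function name and the sample snippet are hypothetical.

```python
import ast


def find_empty_handlers(source: str) -> list[int]:
    """Return the line numbers of except blocks whose body is nothing but `pass`."""
    tree = ast.parse(source)  # Parses the text; nothing in it is executed.
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler)
        and all(isinstance(statement, ast.Pass) for statement in node.body)
    )


# Hypothetical code under inspection; risky() and log() never need to exist.
SAMPLE = """\
try:
    risky()
except ValueError:
    pass
try:
    risky()
except KeyError as error:
    log(error)
"""

print(find_empty_handlers(SAMPLE))  # [3]
```

The first handler swallows the error silently and gets reported; the second does something with the error and does not.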

There are many tools out there that provide functionality for one or more of these, but NDepend provides perhaps the most comprehensive support across the board for different static analysis goals of any .NET tool out there. You will thus get to see in-depth examples of many of these, particularly the fact finding and issue alerting types of analysis.

A Quick Overview of Some Example Metrics

Up to this point, I’ve talked a lot in generalities, so let’s look at some actual examples of things that you might learn from static analysis about your code base. The actual questions you could ask and answer are pretty much endless, so this is intended just to give you a sample of what you can know.

  • Is every class and method in the code base in Pascal case?
  • Are there any potential null dereferences of parameters in the code?
  • Are there instances of copy and paste programming?
  • What is the average number of lines of code per class? Per method?
  • How loosely or tightly coupled is the architecture?
  • What classes would be the most risky to change?

Believe it or not, it is quite possible to answer all of these questions without compiling or manually inspecting your code in time consuming fashion. There are plenty of tools out there that can offer answers to some questions like this that you might have, but in my experience, none can answer as many, in as much depth, and with as much customizability as NDepend.
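Here is a small sketch of the “lines of code per method” question, again using Python’s `ast` module on Python source as a stand-in. It says nothing about how NDepend computes its metrics; the helper names and the limit of 4 lines are arbitrary assumptions for the example.

```python
import ast


def function_lengths(source: str) -> dict[str, int]:
    """Map each function name to the number of source lines it spans."""
    tree = ast.parse(source)
    return {
        node.name: node.end_lineno - node.lineno + 1
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def too_big(lengths: dict[str, int], limit: int) -> list[str]:
    """Return the names of functions longer than `limit` lines, alphabetically."""
    return sorted(name for name, length in lengths.items() if length > limit)


# Hypothetical code base, small enough to count by hand.
SAMPLE = """\
def short():
    return 1

def longer(values):
    total = 0
    for value in values:
        total += value
    return total
"""

lengths = function_lengths(SAMPLE)
print(lengths)                                # {'short': 2, 'longer': 5}
print(sum(lengths.values()) / len(lengths))   # 3.5
print(too_big(lengths, limit=4))              # ['longer']
```

The last line is the “that method is too big” warning in miniature: a threshold applied to a fact about the code.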

Why Do This?

So all that being said, is this worth doing? Why should you watch the subsequent modules if you aren’t convinced that this is something that’s even worth learning? It’s a valid concern, but I assure you that it is most definitely worth doing.

  • The later you find an issue, typically, the more expensive it is to fix. Catching a mistake seconds after you make it, as with a typo, is as cheap as it gets. Having QA catch it a few weeks after the fact means that you have to remember what was going on, find it in the debugger, and then figure out how to fix it, which means more time and cost. Fixing an issue that’s blowing up in production costs time and effort, but also business and reputation. So anything that exposes issues earlier saves the business money, and static analysis is all about helping you find issues, or at least potential issues, as early as possible.
  • But beyond just allowing you to catch mistakes earlier, static analysis actually reduces the number of mistakes that happen in the first place. The reason for this is that static analysis helps developers discover mistakes right after making them, which reinforces cause and effect a lot better. The end result? They learn faster not to make the mistakes they’d been making, causing fewer errors overall.
  • Another important benefit is that maintenance of code becomes easier. By alerting you to the presence of “code smells,” static analysis tools are giving you feedback as to which areas of your code are difficult to maintain, brittle, and generally problematic. With this information laid bare and easily accessible, developers naturally learn to avoid writing code that is hard to maintain.
  • Exploratory static analysis turns out to be a pretty good way to learn about a code base as well. Instead of the typical approach of opening the code base in an IDE and poking around or stepping through it, developers can approach the code base instead by saying “show me the most heavily used classes and which classes use them.” Some tools also provide visual representations of the flow of an application and its dependencies, further reducing the learning curve developers face with a large code base.
  • And a final and important benefit is that static analysis improves developers’ skills and makes them better at their craft. Developers don’t just learn to avoid mistakes, as I mentioned in the mistake reduction bullet point, but they also learn which coding practices are generally considered good ideas by the industry at large and which practices are not. The compiler will tell you that things are illegal and warn you that others are probably errors, but static analysis tools often answer the question “is this a good idea.” Over time, developers start to understand subtle nuances of software engineering.

There are a couple of criticisms of static analysis. The main ones are that the tools can be expensive and that they can create a lot of “noise” or “false positives.” The former is a problem for obvious reasons and the latter can have the effect of counteracting the time savings by forcing developers to weed through non-issues in order to find real ones. However, good static analysis tools mitigate the false positives in various ways, an important one being to allow the shutting off of warnings and the customization of what information you receive. NDepend turns out to mitigate both: it is highly customizable and not very expensive.


The contents of this post were mostly taken from a Pluralsight course I did on static analysis with NDepend. Here is a link to that course. If you’re not a Pluralsight subscriber but are interested in taking a look at the course or at the library in general, send me an email to erik at daedtech and I can give you a 7 day trial subscription.


Merging Done Right: Semantic Merge

There are few things in software development as surprisingly political as merging your code when there are conflicts. Your first reaction to this is probably to think that I’m crazy, but seriously, think about what happens when your diff tool/source control combo tells you there’s a conflict. You peer at the conflict for a moment and then think, “alright, who did this?” Was it Jim? Well, Jim’s kind of annoying and pretty new, so it’s probably fine just to blow his changes away and send him an email telling him to get the latest code and re-do his stuff. Oh, wait, no, it looks like it was Janet. Uh oh. She’s pretty sharp and a Principal so you’ll probably be both wrong and in trouble if you mess this up — better just revert your changes, get hers and rework your stuff. Oh, on third look, it appears that it was Steve, and since Steve is your buddy, you’ll just go grab him and work through the conflict together.

Notice how none of this has anything to do with what the code should actually look like?

Now, I’ll grant that this isn’t always the case; there are times when you can figure out what should be in the master copy of the source control, but it’s pretty likely that you’ve sat staring at merge conflicts and thinking about people and not code. Why is that? Well, frankly because merge tools aren’t very good at telling you the story of what’s happened, and that’s why you need a human to come tell you the story. But which human, which story, and how interested you are in that interaction are all squarely the stuff of group dynamics and internal politics. Hopefully you get on well with your team and they’re happy to tell you the story.

But what if your tools could tell you that story? What if, instead of saying, “Jim made text different on lines 100, 124, 135-198, 220, 222-228,” your tooling said, “Jim moved a method, and deleted a few references to a field whereas you edited the method that he moved?” Holy crap! You wouldn’t need to get Jim at all because you could just say, “oh, okay, I’ll do a merge where we do all of his stuff and then make my changes to that method he moved.”

I’ve been poking around with Roslyn and reading about it lately, and this led me to Semantic Merge. This is a diff tool that uses Roslyn, which means that it’s parsing your code into a syntax tree and actually reasoning about it as code, rather than text (or text with heuristics). As such, it’s no mirage or trickery that it can say things like “oh, Jim moved the method but left it intact whereas you made some changes to it.” It makes perfect sense that it can do this.
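To show why a syntax tree makes “moved” and “changed” distinguishable, here is a toy sketch. It is not how Semantic Merge or Roslyn work internally; it uses Python’s `ast` module on Python source, and the function names and sample code are made up for the example.

```python
import ast


def describe_functions(source: str) -> dict[str, tuple[int, str]]:
    """Map each top-level function name to (position, structural fingerprint)."""
    tree = ast.parse(source)
    functions = [node for node in tree.body if isinstance(node, ast.FunctionDef)]
    # ast.dump leaves out line numbers by default, so the fingerprint depends
    # only on the structure of the code, not on where it sits in the file.
    return {node.name: (index, ast.dump(node)) for index, node in enumerate(functions)}


def semantic_diff(before: str, after: str) -> dict[str, list[str]]:
    """Classify functions present in both versions as moved and/or changed."""
    old, new = describe_functions(before), describe_functions(after)
    report = {}
    for name in sorted(old.keys() & new.keys()):
        (old_index, old_shape), (new_index, new_shape) = old[name], new[name]
        kinds = []
        if old_index != new_index:
            kinds.append("moved")
        if old_shape != new_shape:
            kinds.append("changed")
        if kinds:
            report[name] = kinds
    return report


BEFORE = """\
def print_numbers():
    print(1, 2, 3)

def change_the_word(word):
    return word.upper()
"""

AFTER = """\
def change_the_word(word):
    if not word:
        raise ValueError("word is required")
    return word.upper()

def print_numbers():
    print(1, 2, 3)
"""

print(semantic_diff(BEFORE, AFTER))
```

This reports `print_numbers` as moved only and `change_the_word` as both moved and changed. Comparing positions by index is crude (adding a new function at the top would make everything below it look moved), so treat it as an illustration of the idea rather than a usable diff.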

Let’s take a look at this in action. I’m only showing you the tiniest hint of what’s possible, but I’d like to pick out a very simple example of where a traditional merge tool kind of chokes and Semantic Merge shines. It is, after all, a pay to play (although pretty affordable) tool, so a compelling case should be made.

The Old Way

Before you see how cool Semantic Merge is, let’s take a look at a typical diff scenario. I’ll do this using the Visual Studio compare tool that I use on a day to day basis. And I’m calling this “the old way,” in spite of the fact that I fell in love with this as compared to the way it used to be in VS2010 and earlier. It’s actually pretty nice as far as diff tools go. I’m going to take a class and make a series of changes to it. Here’s the before:

Now, what I’m going to do is swap the positions of PrintNumbers() and ChangeTheWord(), add some error checking to ChangeTheWord() and delete the comments above the constructor. Here’s the after:

If I now want to compare these two files using the diff tool, here’s what I’m looking at:


This is the point where I groan and mutter to myself because it annoys me that the tool is comparing the methods side by side as if I had renamed one and completely altered its contents. I’m sure you can empathize. You’re muttering to yourself too and what you’re saying is, “you idiot tool, it’s obviously a completely different method.” Well, here’s the same thing as summarized by Semantic Merge:


It shows me that there are two types of differences here: moves and changes. I’ve moved the two methods PrintNumbers() and ChangeTheWord() and I’ve changed the constructor of the class (removing comments) and the ChangeTheWord() method. Pretty awesome, huh? Rather than a bunch of screenshots to show you the rest, however, I’ll show you this quick clip of me playing around with it.

Some very cool stuff in there. First of all, I started where the screenshot left off — with a nice, succinct summary of what’s changed. From there you can see that it’s easy to flip back and forth between the methods, even when moved, to see how they’re different. You can view each version of the source as well as a quick diff only of the relevant, apples-to-apples, changes. It’s also nice, in general, that you can observe the changes according to what kind of change they are (modification, move, etc). And finally, at the end, I played a bit with the more traditional diff view that you’re used to — side by side text comparison. But even with that, helpful UI context shows you that things have moved rather than the screenshot of the VS merge tool above where it looks like you’ve just butchered two different methods.

This is only scratching the surface of Semantic Merge. There are more features I haven’t covered at all, including a killer feature that helps auto-resolve conflicts by taking the base version of the code as well as server and local in order to figure out if there are changes only really made by one person. You can check more of it out in this extended video about the tool. As I’ve said, it’s a pay tool, but the cost isn’t much and there’s a 30 day trial, so I’d definitely advise taking it for a spin if you work in a group and find yourself doing any merging at all.


Getting Started on the Roslyn Journey

It’s not as though it’s new; Roslyn CTP was announced in the fall of 2011, and people have been able to play with it since then. Roslyn is a quietly ground-breaking concept — a set of compilers that exposes compiling, code modeling, refactoring, and analysis APIs. Oh, and it was recently announced that the tool would be open source meaning that all of you Monday morning quarterback language authors out there can take a crack at implementing multiple inheritance or whatever other language horrors you have in mind.

I have to say that I, personally, have little interest in modifying any of the language compilers (unless I went to work on a language team, which would actually be a blast, I think), but I’m very interested in the project itself. This strikes me as such an incredible, ground-breaking concept and I think a lot of people are just kind of looking at this as a curiosity for real language nerds and Microsoft fanboys. The essential value in this offering, to me, is the standardizing of code as data. I’ve written about this once before, and I think that gets lost in the shuffle when there’s talk about emitting IL at runtime and infinite loops of code generation and whatnot. Forget the idea of dispatching a service call to turn blobs of text into executables at runtime and let’s agree later to talk instead about the transformative notion of regarding source code as entity collections rather than instruction sheets, scripts, or recipes.

But first, let’s get going with Roslyn. I’m going to assume you’ve never heard of this before and I’m going to take you from that state of affairs to doing something interesting with it in this post. In subsequent/later posts, we’ll dive back into what I’m driving at philosophically in the intro to this post about code as data.

Getting Started

(Note — I have VS2013 on all my machines and that is what I’ve used. I don’t know whether any/all of this would work in Studio 2012 or earlier, so buyer beware)

First things first. In order to use the latest Roslyn bits, you actually need a fairly recent version of Nuget. This caught me off guard, so hopefully I’ll save you some research and digging. Go to the “Tools” menu and choose “Extensions and Updates.” Click on the “Updates” section at the left, and then click on “Visual Studio Gallery.”


If you’re like me, your version was 2.7.something and it needs to be 2.8.1something or higher. This update will get you where you need to be. Once you’ve done that, you can simply install the API libraries via Nuget command line.

With that done, you’re ready to download the necessary installation files from Microsoft. Go to http://aka.ms/roslyn to get started. If you’re not signed in, you’ll be prompted to sign in with your Microsoft ID (you’ll need to create one if you don’t have one) and then fill out a survey. If you get lost along the way, your ultimate destination is to wind up here.

At this point, if you follow the beaten path and click the “Download” button, you’ll get something called download.dlm that, if your environment is like mine, is completely useless. So don’t do that. Click the circled “download” link indicated below to get the actual Roslyn SDK.


Once that downloads, unpack the zip file and run “Roslyn End User Preview” to install Roslyn language features. Now you can access the APIs and try out interesting new language features, like this one:

That’s all well and good for dog-fooding IDE changes and previewing new language features, but if you want access to the coolness from an API perspective, it’s time to fire up Nuget. Open up a project, and then the Nuget command line and type “Install-Package Microsoft.CodeAnalysis -Pre”

Once that finishes up, make your main entry point consist of the following code:

At this point, if you hit F5, what you’re going to see on the screen is a list of the fields contained in the class that you specify as your “sourceCodePath” variable (at least you will with the happy path — I haven’t tested this extensively to see if I can write classes that break it). Now, could you simply write a text parser (or, God forbid, some kind of horrible regex) to do this? Sure. Are there C# language modeling utilities like a Code DOM that would let you do this? Sure. Are any of these things the C# compiler? Nope. Just this.

So think about what this means. You’re not writing a utility that uses a popular C# source code modeling abstraction; you’re writing a utility that says, “hey, compiler, what are the fields in this source code?” And that’s pretty awesome.

My purpose here was to give you a path from “what’s this Roslyn thing anyway” to “wow, look at that, I can write a query against my own code.” Hopefully you’ve gotten that out of this, and hopefully you’ll go forth, tinker, and then you can come back and show me some cool tricks.


NCrunch and Continuous Testing: The Must-Have Setup

Most of this post was taken from the transcript of my Pluralsight course on NCrunch. If you are interested in watching the course but are not a Pluralsight subscriber, feel free to email me or leave a comment requesting a trial, and I’ll get you a 7 day subscription to check it out.

Understanding the Legitimate, Root-Cause Objection to TDD

In my experience, there are three basic “camps” of reactions to the concept of test driven development (TDD) from those not experienced with it: willing students, healthy skeptics and reactionary curmudgeons. The first group is basically looking for a chance to practice and needs no convincing. The last group will have to be dragged along, kicking and screaming, and so there’s no persuading them without the threat of negative consequences. It is the middle group that tends to have rational objections, some of which are well-founded and others of which aren’t so much. A lot of the negative reaction from this group is the result of reacting to the misconceptions that I mentioned in this post about what TDD is and isn’t. But even once they understand how it works, there are still some fairly common and legitimate objections that are not simply straw man arguments.

  1. The most common and prevalent objection is that coding this way means that you’re doing a lot more work. You’re taking more time and writing more code and people don’t necessarily see the benefit, especially in cases where they already know what code they want to write.
  2. Many of the misconception objections and other inexplicable resistance are really the result of people simply not knowing how to write tests or practice TDD, and perhaps at times being reluctant to admit it. Others may freely admit it. Either way, the objection is that TDD, like any other discipline, would take time to learn and require an investment of effort.
  3. There is also more code that is going into a project since you now have an additional test class for each single class you would otherwise have created. More code means more maintenance time and effort.
  4. Many astute observers also realize that a lot of legacy code, particularly that involving large-work constructors, singletons, and static state is very hard to test, making attempts to do so effort-intensive.
  5. And, along the same lines, they also realize that there would be more effort required than simply learning how to do TDD; it would also mean learning different design techniques such as dependency injection, polymorphism, and inversion of control.

When you consider all of these objections, they all have a common thread. At the core of it, they’re really all variants on the theme of not having enough time. Writing the tests, maintaining the test code, learning new ways of doing things, and applying them to new and old code are all things that take time, and for most developers, time is precious. Someone selling TDD is a lot like someone selling you on a 401K: they’re convincing you that sacrificing now is going to be worth it later and asking you to take this, to some degree, on faith.

Could TDD be better?

Justifying the adoption of TDD to a healthy skeptic hinges largely on demonstrating that it provides a net benefit in terms of time, and thus cost. So how can these objections be reconciled and the concerns addressed?

Well first up are the learning curve oriented objections. And the truth is that there’s no way around this one being a time sink. Learning how to do TDD and learning how to write testable code are going to take time, no matter what. If you do not have the time to learn, this is a perfectly valid objection, but only in the shorter term. After all, we work in an industry where change is the only constant and learning new languages, frameworks, and methodologies is pretty much table stakes for staying relevant.

Regarding development time overall, a very common argument made by TDD proponents is that the practice saves time over the long haul. This is reminiscent of the parable of the tortoise and the hare where the TDD practitioner is a tortoise plodding along, getting everything right and the hare is generating reams of code quickly but with mistakes. The hare will declare himself done more quickly, but he’ll spend a lot more time later troubleshooting, reading log files, debugging, and fixing errors. The tortoise may not finish as quickly, but when he does, he truly is done.

But what about in the short term? Is there anything that can be done to make things go more quickly in the short term for TDD practitioners? Could we strap a rocket pack to the tortoise and make him go faster than the hare while preserving his accuracy?


Speeding up the Feedback Loop

What if I told you a story? What if I told you that you could write code and know whether or not it was working nearly instantaneously? In this world of development, you don’t have to wait while your application starts up, and then navigate through various user interface screens to get to the action that will trigger the bit of code you want to verify. There is no more repetitive clicking and typing and waiting for screens to load. In fact, in this world you don’t even need to build your project or compile your code. All you need to do is type and see, as you’re typing, whether or not the changes you’re making are right. And, you can see a visual metric for how much confidence you can have in your changes by virtue of how much your code is covered by the unit tests.

Does that story sound too good to be true? Well, I’ll admit that it does sound pretty good, but I’ll let you in on a little secret: it is true. There is a name for this paradigm, and it’s called “continuous testing.” And there are various tools out there for different platforms that make it a reality, right now as we speak.

To understand the magic of continuous testing, it’s essential to understand one of the most important, but often overlooked, concepts in computer science. I’m talking about the feedback loop. At its core, programming is a series of experiments. Whenever you approach a programming task, you have a code base that does something, and you have a goal to make it do something different or new. To achieve this goal, you identify intermediate behaviors that you’d like to see to mark progress, and then you make changes that you think will result in those behaviors. Then, you run the application to see if what you thought would happen does, in fact, happen.

For example, perhaps you want to have your application display customer information stored in a database to the screen when the user clicks a certain button. You might first say “forget the database – let’s just get the button click to result in some hard-coded value being displayed,” and then set about altering the code to make that happen. When you’d made your changes to the code, you’d run the program and click that button to see what had happened.

Considered closely, this process is actually a lot like the scientific method. For step (1) you read the code. For step (2) you hypothesize what you’ll need to do to the code. For step (3) you predict the outcome of your changes, and for step (4) you make the changes and observe the results. The amount of time that it takes to perform an iteration of your coding version of the scientific method is what I’m calling the “feedback loop.” How long does it take for you to have an idea, implement it, and verify that it had the desired effect?


In the early days of programming when the use of punch cards was common, feedback times were very lengthy. Programmers would reason carefully about everything that they did because feedback times were extremely slow, meaning mistakes were very costly. While many improvements have been made across the board to feedback times, situations persist to this day when the feedback loop is excruciatingly slow. This includes long running or resource-intensive applications and distributed systems with high latency. With such systems, programmers on projects often devise schemes to try to shorten the feedback loop, such as mocking out bottlenecks to allow fast verification of the rest of the system.
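As a sketch of what mocking out a bottleneck can look like, the example below swaps a slow data store for a stub so the surrounding logic can be checked immediately. It is in Python with the standard `unittest.mock` module; `CustomerDatabase`, `format_greeting`, and the five-second delay are all invented for illustration.

```python
import time
from unittest import mock


class CustomerDatabase:
    """Hypothetical stand-in for a slow, remote data store."""

    def fetch_name(self, customer_id: int) -> str:
        time.sleep(5)  # Simulates network and query latency.
        return f"customer-{customer_id}"


def format_greeting(database, customer_id: int) -> str:
    """The logic we actually want fast feedback on."""
    return f"Hello, {database.fetch_name(customer_id)}!"


def check_greeting_without_database() -> str:
    # Replace the slow dependency with a stub that answers immediately.
    fake_database = mock.Mock(spec=CustomerDatabase)
    fake_database.fetch_name.return_value = "Ada"
    greeting = format_greeting(fake_database, 42)
    # Confirm the logic asked the database for the right customer.
    fake_database.fetch_name.assert_called_once_with(42)
    return greeting


print(check_greeting_without_database())  # Hello, Ada!
```

The real `fetch_name` is never called, so the check finishes in a fraction of a second instead of five. That is the whole point of the technique: the feedback loop shrinks to the time it takes to run the logic you care about.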

What they’re really trying to do is shorten the feedback loop to allow themselves to be more productive. When a great deal of time elapses between trying something and seeing what happens, attention tends to wander to distractions like Twitter or Reddit, exacerbating the inefficiency in this already-slow process. Developers innately understand this problem and are frustrated by the long build and run times of behemoth and slow-running applications.

To combat this problem, developers intuitively favor faster schemes. Ask yourself whether you prefer to work on a small project that builds quickly or a large one. How about a slow test suite versus a fast one? By speeding up the feedback loop you trade frustration and wandering attention span for engagement and a feeling of accomplishment. Techniques like relying on fast-running unit tests and keeping modules small and decoupled help a great deal with this, but we can get even faster.

If short feedback is good, immediate is definitely better. Anyone who has done extensive work at a command line or, in general, used a Read-Evaluate-Print Loop (REPL) understands this. Attention does not wander at all during a session like this. Historically, such a thing wasn’t possible in a compiled language, but with the advent of multicore systems and increasingly sophisticated compiler technology, times are changing. It is now possible to have a build running in the background of an IDE like Visual Studio even as you modify the code.


If you’ve been watching my series on building a chess game using TDD you couldn’t help but notice the red and green dots on the left side of the code window, since they catch the eye. What you were seeing was the tool, NCrunch, in action. Now it’s time to get properly acquainted.

NCrunch is a software product by Remco Software and was written by software developer Remco Mulder, who owns the company. It is a tool written specifically to allow developers to practice continuous testing in Visual Studio. NCrunch is a commercial product with a tiered pricing model and full-blown customer support. And it operates as a plugin to Visual Studio so there is no need to integrate or operate any kind of standalone application. It drops right into a tool with which you are already familiar.

For the first several years of its existence, NCrunch was free, since it was in an extended state of Beta release. During the course of these years, it grew a substantial and loyal user base. In the fall of 2012, Remco decided to issue version 1.0 and release NCrunch as a full, commercial product with a licensing model and production support. It is now on version 2.5 and is most certainly an excellent, commercial-grade product that is worth every penny.

As I write my code using this tool, you may notice things that I rarely or never do. I rarely, if ever, run an application. I rarely, if ever, use the unit test runner. I rarely even compile my code, though I do this sometimes simply because I happen to be quite accustomed to looking at compiler feedback in the errors window. Continuous testing tools like NCrunch may have been a novelty when they came out, but I would argue that they’re rapidly becoming table stakes for efficient development these days.

Before NCrunch, the viability of TDD for me was tied up in the idea that investing extra time up front meant that I wouldn’t later be revisiting my code, debugging, tweaking, and fixing, when I was further removed from it and the work was more time consuming. With NCrunch, I don’t even need to make that case. Now, if you took TDD and NCrunch away, my development process would be substantially slower as I sat there waiting for the application to compile or the test runner to do its thing.

If you don’t have this, get it. You won’t be sorry. Forget clean code, unit testing, TDD, all of that stuff (well, not really, but indulge me here for a second). Just get this setup for the tight feedback loop alone. There is nothing like the feeling of productivity you get from typing a line of code and knowing in less than a second, without doing anything else, whether the change is what you want. That incredible power makes it all worth it: the learning curve of the tool, the cost of the tool, adopting TDD, learning to unit test. It’s like getting a car with 500 horsepower and feeling that acceleration; it ruins you for anything less.
