Static Analysis — Spell Check for Code
A lot of people have caught on to certain programming trends: some agility in the process generally makes things better, unit testing a code base tends to make it more reliable, etc. One thing that, in my experience, seems to lag behind in popularity is the use of static checking tools. If these are used at all, it’s usually for something like enforcing capitalization schemes for variables, or some other rule that guarantees code that is uniform in appearance.
I think this non-use or under-use of such tools is a shame. I recently gave a presentation on using some tools in C# and Visual Studio 2010 for static analysis, and thought I’d share my experience with some of the tools and the benefits I perceive here. In this development environment, there are six tools that I use, all to varying degrees and for different purposes. They are MS Analysis (FxCop), StyleCop, CodeRush, Code Contracts, NDepend, and Nitriq.
Before I get into that, I’ll offer a little background on the idea of static analysis. The age-old and time-tested way to write code is that you write some code, you compile it, and then you run it. If it does what you expected when you run it, then you declare yourself victorious and move on. If it doesn’t, then you write some more code and repeat.
This is all fine and good until the program starts getting complicated — interacting with users, performing file I/O, making network requests, etc. At that point, you get a lot of scenarios. In fact, you get more scenarios than you could have anticipated in one sitting. You might run it and everything looks fine, but then you hand it to a user who runs it, unplugs the computer, and wonders why his data wasn’t saved.
At some point, you need to be able to reason about how components of your code would behave in various scenarios, even if you might not easily be able to recreate these scenarios. Unit testing is helpful with this, but unit testing is just an automated way of saying, “run the code.” Static analysis automates the process of reasoning about the code without running it. It’s like you looking at the code, but far more efficient and much less likely to make mistakes.
Doing this static analysis is adding an extra step to your development process. Make no mistake about that. It’s like unit testing in that the largest objection is going to be the ‘extra’ time that it takes. But it’s also like unit testing in that it saves you time downstream because it makes defects less likely to come back to bite you later. These two tasks are also complementary and not stand-ins for one another. Unit testing clarifies and solidifies requirements and forces you to reason about your code. Static analysis lets you know if that clarification and reasoning has caused you to do something that isn’t good.
As I said in the title, it’s like a spell checker for your code. It prevents you from making silly and embarrassing mistakes (and often costly ones). To continue the metaphor, unit testing is more like getting someone bright to read your document. He’ll catch some mistakes and give you important feedback for how to improve the document, but he isn’t a spell checker.
So, that said, I’ll describe briefly each one and why I use and endorse it.
MS Analysis encapsulates FxCop for the higher-end editions of Visual Studio 2010 (Premium and up, I think). It runs a series of checks, such as whether methods validate their parameters, whether you’re throwing Exception instead of SomeSpecificException, and whether your classes have excessive coupling. There are probably a few hundred checks in all. When you do a build with this enabled, it populates the error list with violations in the form of warnings.
On the plus side, this integrates seamlessly with Visual Studio since it’s a Microsoft product, and it catches a lot of stuff. On the down side, it can be noisy, and customizing it isn’t particularly straightforward. You can turn rules on and off, but if you want to tweak existing ones or create your own, things get a little more complicated. It also isn’t especially granular. You configure it per project and can’t get more fine grained feedback than that (i.e. per namespace, class, or method).
My general use of it is to run it periodically to see if my code is violating any of the rules that I care about. I usually turn off a lot of rules, and I have a few different rulesets that I plug in and out so that I can do more directed searches.
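To make the first two checks mentioned above concrete, here is a minimal sketch of a method that would typically draw those warnings, next to a version that should not. The class and method names are made up for illustration, and the rule IDs in the comments (CA1062 and CA2201) are the ones I would expect for these checks; which warnings actually fire depends on the ruleset you have enabled.

```csharp
using System;

// Hypothetical example class; not taken from any real code base.
public static class OrderFormatter
{
    // Before: the kind of public method the analysis tends to complain about.
    public static string DescribeUnchecked(string customerName, int quantity)
    {
        if (quantity < 0)
        {
            // Expected warning: throwing the general Exception type (CA2201).
            throw new Exception("Bad quantity");
        }

        // Expected warning: customerName is dereferenced without validation (CA1062).
        return customerName.Trim() + " ordered " + quantity;
    }

    // After: arguments are validated and specific exception types are thrown.
    public static string Describe(string customerName, int quantity)
    {
        if (customerName == null)
        {
            throw new ArgumentNullException("customerName");
        }

        if (quantity < 0)
        {
            throw new ArgumentOutOfRangeException("quantity", "Quantity cannot be negative.");
        }

        return customerName.Trim() + " ordered " + quantity;
    }
}
```

Both methods behave the same for valid input. The difference only shows up for the inputs you probably never tried by hand, which is exactly where this kind of check earns its keep.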
StyleCop is designed to be run between writing and building. Instead of using the VM/Framework to reflect on your code, it just parses the source code file looking for stylistic concerns (are all of your fields camel cased and documented? are you still using Hungarian notation? if so, stop) and very basic mistakes (such as an empty method). It’s lightning fast, and it runs on a per-class basis, which is cool.
On the downside, it can be a little annoying and invasive, but the designers are obviously aware of this. I recall reading some kind of caveat stating that the nature of these types of rules tends to be arbitrary and get opinionated developers into shouting matches.
I find it useful for letting me know if I’ve forgotten to comment things, if I’ve left fields as anything other than private, and if I have extra parentheses somewhere. I run StyleCop occasionally, but not as often as the other tools. Swapping between the rule sets is a little annoying.
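As a rough illustration of the style issues described above, here is a small before-and-after sketch. Both classes are hypothetical, and the comments describe the kind of complaint I would expect rather than quoting exact rule names, since those depend on the StyleCop version and the rules you have switched on.

```csharp
// Before: compiles fine, but full of the stylistic issues mentioned above.
public class CounterBefore
{
    public int iTotal;      // public field, Hungarian-style prefix, no documentation

    public int Add(int amount)
    {
        iTotal += amount;
        return (iTotal);    // unnecessary parentheses
    }

    public void Reset()
    {
    }                       // empty method
}

/// <summary>Tracks a running total.</summary>
public class Counter
{
    /// <summary>The running total, kept private and camel cased.</summary>
    private int total;

    /// <summary>Adds an amount to the running total.</summary>
    /// <param name="amount">The amount to add.</param>
    /// <returns>The new total.</returns>
    public int Add(int amount)
    {
        this.total += amount;
        return this.total;
    }
}
```

Nothing about the behavior changes between the two versions, which is the point: these are readability and consistency checks, not bug hunts.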
CodeRush is awesome for a lot of things, and its static analysis is really an ancillary benefit. It maintains an “issues list” for each file and highlights these issues in real time, right in the IDE. A few of them are a little bizarre (suggesting that you always use the “var” keyword where possible), but most of them are actually really helpful and not suggested by the MS Tools or anything else I use. It does occasionally flag dead code falsely and get a few things wrong, but it’s fairly easy to configure it to ignore issues on a per-file, per-namespace, or per-solution basis.
The only real downside here is that CodeRush has a per-seat licensing cost. That cost, plus the other overhead of CodeRush, makes it a little overkill-ish if you’re just interested in static analysis. I fully endorse getting CodeRush in general, however, for all of its features.
Like CodeRush, Code Contracts is really intended for something else, and it provides static analysis as a side effect. It is an academically developed tool that facilitates design by contract. Pre- and post-conditions as well as class invariants can be enforced at build time. Probably because of the nature of doing this, it also just so happens to offer a feature wherein you can have warning squigglies pop up anywhere you might be dereferencing null, violating array bounds, or making invalid arithmetic assumptions.
To me, this is awesome, and I don’t know of other tools that do this. The only downside is that, on larger projects, this takes a long time to run. However, getting an automatic check for null dereferences is worth the wait!
I use it explicitly for the three things I mentioned, though, if I get a chance and enough time, I’d like to explore its design by contract properties as well.
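For anyone who hasn’t seen design by contract in C#, here is a minimal sketch of what a precondition, a postcondition, and a class invariant look like using the `System.Diagnostics.Contracts` API. The `Account` class is a made-up example. As far as I know, these calls are only checked when the Code Contracts tooling is installed and enabled for the project; without it, the compiler simply leaves them out of the build.

```csharp
using System.Diagnostics.Contracts;

// Hypothetical example class used only to illustrate the contract calls.
public class Account
{
    private int balance;

    public int Balance
    {
        get { return this.balance; }
    }

    public void Deposit(int amount)
    {
        // Precondition: callers must deposit a positive amount.
        Contract.Requires(amount > 0);

        this.balance += amount;
    }

    public int Withdraw(int amount)
    {
        // Preconditions: a positive amount that the current balance can cover.
        Contract.Requires(amount > 0);
        Contract.Requires(amount <= this.Balance);

        // Postcondition: the returned balance is never negative.
        Contract.Ensures(Contract.Result<int>() >= 0);

        this.balance -= amount;
        return this.balance;
    }

    // Class invariant: must hold after every public method returns.
    [ContractInvariantMethod]
    private void ObjectInvariant()
    {
        Contract.Invariant(this.balance >= 0);
    }
}
```

Note that the preconditions refer to the public `Balance` property rather than the private field, so that callers can see everything the contract talks about.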
NDepend is really kind of an architectural tool. It lets you make assessments about different dependencies in your code, and it provides you with all kinds of neat graphs that report on them. But my favorite feature of NDepend is the static analysis in the form of Code Querying. It exposes SQL-like semantics that let you write queries against your code base, such as “SELECT ALL Methods WHERE CyclomaticComplexity > 25” (paraphrase). You can tweak these things, write your own, or go with the ones out of the box. They’re all commented, too, in ways that are helpful for understanding and modifying.
There is really no downside to NDepend aside from the fact that it costs some money. But if you have the money to spare, I highly recommend it. I use this all the time for querying my code bases in whatever manner strikes my fancy.
I think that Nitriq and NDepend are probably competitors, but I won’t do any kind of comparative evaluation because I only have the free version of Nitriq. Nitriq has the same kind of querying paradigm as NDepend, except that it uses LINQ semantics instead of SQL. That’s probably a little more C# developer friendly, as I suppose not everyone who writes C# code knows SQL (although it strikes me that all programmers ought to have at least passing familiarity with SQL).
In the free version you can only assess one assembly at a time, though I don’t consider that a weakness. I use Nitriq a lot less than NDepend, but when I do fire it up, the interface is a little less cluttered and it’s perhaps a bit more intuitive. Though, for all I know, the paid version may get complicated.
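To give a feel for the querying idea, here is a sketch of the cyclomatic complexity query from the NDepend section written as an ordinary LINQ query. This is not the actual Nitriq or NDepend API. `MethodMetric` and `CodeQueries` are hypothetical stand-ins for whatever per-method data the tool exposes, so treat it as an illustration of the paradigm only.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the per-method data a code query tool collects.
public class MethodMetric
{
    public string Name { get; set; }

    public int CyclomaticComplexity { get; set; }
}

public static class CodeQueries
{
    // Returns the names of methods whose complexity exceeds the threshold,
    // most complex first.
    public static List<string> ComplexMethods(IEnumerable<MethodMetric> methods, int threshold)
    {
        return (from m in methods
                where m.CyclomaticComplexity > threshold
                orderby m.CyclomaticComplexity descending
                select m.Name).ToList();
    }
}
```

Calling `CodeQueries.ComplexMethods(metrics, 25)` on a list of metrics would return just the offenders, which is roughly the experience these tools give you, minus the work of gathering the metrics yourself.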
So, that’s my pitch for static analysis. The tools are out there, and I suspect that they’re only going to become more and more common. If this is new to you, check these tools out and try them! If you’re familiar with static analysis, hopefully there’s something here that’s new and worth investigating.