Stories about Software


The Most Important Code Metrics You’ve Never Heard Of

Editorial Note: I originally wrote this post for the NDepend blog.  Head on over and check out the original.  If software architecture interests you or you aspire to that title, there’s a pretty focused set of topics that will interest you.

Oh how I hope you don’t measure developer productivity by lines of code.  As Bill Gates once ably put it, “measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”  No doubt, you have other, better reasoned metrics that you capture for visible progress and quality barometers.  Automated test coverage is popular (though be careful with that one).  Counts of defects or trends in defect reduction are another one.  And of course, in our modern, agile world, sprint velocity is ubiquitous.


But today, I’d like to venture off the beaten path a bit and take you through some metrics that might be unfamiliar to you, particularly if you’re no longer technical (or weren’t ever).  But don’t leave if that describes you — I’ll help you understand the significance of these metrics, even if you won’t necessarily understand all of the nitty-gritty details.

Perhaps the most significant factor here is that the metrics I’ll go through can be tied, relatively easily, to stakeholder value in projects.  In other words, I won’t just tell you the significance of the metrics in terms of what they say about the code.  I’ll also describe what they mean for people invested in the project’s outcome.

Read More


Using NDepend to Make You a Better Programmer

This is another post that I originally wrote for the NDepend blog. If you haven’t yet, go check out the NDepend blog and sign up for the RSS feed. It’s relatively new, but we’ll have a lot of good content there for you.

If you’re a software developer, particularly of the newly minted variety, the concept of static analysis might not seem approachable.  It sounds academic.  It sounds architect-y.  It sounds complicated.  I’ve seen this reaction from a lot of people in my career and I think that’s too bad.

If you delve into its complex depths, static analysis can be any and all of these things, but with the developers I mentor and coach, I like to introduce it as a game that makes you better at what you do.  You can use static analysis to give yourself feedback about your code that is both fast and anonymous, allowing you to improve via trial and error, rather than by soliciting feedback from people much more tenured than you and sometimes wincing as they lay into you a little.  And, perhaps best of all, you can calibrate the quality of your code with the broader development world, rather than just pleasing the guy who has hung around your company long enough to default his way into the “tech lead” role.

NDepend Rules

Take a look at some of the feedback that NDepend offers about your code.  “That method is too big” isn’t particularly intimidating, is it?  I mean, you might wonder at what you could do to compact a method, but it’s not some kind of esoteric rule written in gibberish.  You run NDepend on your code and you can see that there is some number of methods that the broader development community considers to be “too big.”

From there, you can start looking at ways to write smaller methods and to refactor some of your current ones to sneak in under the warning number.  This is the essence of gamification — you change the way you write code to get rid of the warnings.  You get better.  And it’s gratifying.

As you do this, another interesting thing starts to happen.  You start noticing that other developers continue to write large methods and when you run NDepend on their code, they light up the console with errors, whereas you do not with your code.  And so, you can have conversations with them that start with, “you know, this static analysis tool I’ve been using wants us to have smaller methods, and I’ve been working a lot on that, if you ever want a hand.”

You gain a reputation as being knowledgeable.  Before you know it, you can cite widely accepted static analysis rules and the design goals they imply.  You know these rules, and, via gamification, you have experience molding code to comply with them.  Even in cases where you might wind up overruled by the local team lead or architect, it’s no longer a simple matter of that person saying, “because I said so,” and just ending the conversation.  They have to engage with you and present cogent counter-arguments to your points.  You’re participating in important discussions in ways that you never have before.

If it sounds like I’m speaking from experience, I am.  Throughout my career, I’ve been relentless about figuring out ways to improve my craft, always trying to be a better programmer.  Early on, I was unsatisfied with a lot of arguments among developers around me that I knew boiled down to nothing more than personal preference, so I went out in search of empirical methods and broader knowledge, and that search brought me to static analysis.  I read about data and science behind particular choices in approaching software, and I schooled myself to adopt the approaches that had brought the best results.

Somewhere along that journey, I discovered NDepend and its effect on my approach to writing code was profound.  My methods shrank and became less complicated.  My architectural and design skills improved as I made it a point to avoid dependency cycles and needless coupling.  I boosted unit test coverage and learned well established language practices.  It was not long before people routinely asked me for design advice and code reviews.  And from there, it wasn’t long before I occupied actual lead and architect roles.

So, if you want to improve your craft and nudge your career along, don’t pass on static analysis, and don’t pass on NDepend.  NDepend is not just a tool for architects; it’s a tool for creating architects from the ranks of developers.  You’ll up your game, improve your craft, and even have some fun doing it.


Your Code Is Data

This is a post that I originally wrote for the NDepend blog. If you haven’t already, go check it out! We’re building out some good content over there around static analysis, with lots more to follow.

A lot of programmers have some idea of what static analysis is, as least superficially.  If I mention the term, what pops into your head?  Automatic enforcement of coding standards?  StyleCop or FXCop?  Cyclomatic complexity and Visual Studio’s “maintainability index?”  Maybe you’re deeply familiar with all of the subtleties and nuances of the technique.

Whatever your level of familiarity, I’d like to throw what might be a bit of a curve ball at you.  Static analysis is the idea of analyzing source code and byte code for various properties and reporting on those properties, but it’s also, philosophically, the idea of treating code as data.  This is deeply weird to us as application developers, since we’re very much used to thinking of source code as instructions, procedures, and algorithms.  But it’s also deeply powerful.


When you think of source code this way, typical static analysis use cases make sense.  FXCop asks questions along the lines of “How many private fields not prepended with underscores,” or, perhaps, “SELECT COUNT(class_field) FROM classes WHERE class_field NOT LIKE ‘_*’”  More design-focused source code analysis tools ask questions like “What is the cyclomatic complexity of my methods,” or, perhaps, “SELECT cyclomatic_complexity FROM Methods.”

But if code is data, and static analysis tools are sets of queries against that data, doesn’t it seem strange that we can’t put together and execute ad-hoc queries the way that you would with a relational (or other) database?  I mean, imagine if you built out some persistence store using SQL Server, and the only queries you were allowed were SELECT * from the various tables and a handful of others.  Anything beyond that, and you would have to inspect the data manually and make notes by hand.  That would seem arbitrarily and even criminally restrictive.  So why doesn’t it seem that way with our source code?  Why are we content not having the ability to execute arbitrary queries?

I say “we” but the reality is that I can’t include myself in that question, since I have that ability and I would consider having it taken away from me to be crippling.  My background is that of a software architect, but beyond that, I’m also a software craftsmanship coach, teacher, and frequent analyzer of codebases in a professional capacity, auditing a wide variety of them for various properties, characteristics, and trends.  If I couldn’t perform ad-hoc, situation-dependent queries against the source code, I would be far less effective in these roles.

My tools of choice for doing this are NDepend and its cousin JArchitect (for Java code bases).  Out of the box, they’re standard static analysis and architecture tools, but they also offer this incredibly powerful concept called CQLinq that is, for all intents and purposes, SQL for the ‘schema’ of source code.  In reality, CQLinq is actually a Linq provider for writing declarative code queries, but anyone that knows SQL (or functional programming or lamba expressions) will feel quite at home creating queries.

Let’s say, for instance, that you’re the architect for a C# code base and you notice a disturbing trend wherein the developers have taken to communicating between classes using global variables.  What course of action would you take to nip this in the bud?  I bet it would be something annoying for both you and them.  Perhaps you’d set a policy for a while where you audited literally every commit and read through to make sure they weren’t doing it.  Maybe you’d be too pressed for time and you’d appoint designated globals cops.  Or, perhaps you’d just send out a lot of angry, threatening emails?

Do you know what I would do?  I’d just write a single CQLinq query and add it to a step in my automated team build that executed static analysis code rules against all commits.  If the count of global variable invocations in the code base was greater after the commit than before it, the build would fail.  No need for anger, emails or time wasted checking over people’s shoulders, metaphorically or literally.

Want to see how easy a query like this would be to write?  Why don’t I show you…

That’s it. I write that query, set the build to run NDepend’s static analysis, and fail if there are warnings. No more sending out emails, pleading, nagging, threatening, wheedling, coaxing, or bottleneck code reviewing. And, most important of all, no more doing all of that and having problems anyway. One simple little piece of code, and you can totally automate preventing badness. And best of all, the developers get quick feedback and learn on their own.

As I’ve said, code is data at its core.  This is especially true if you’re an architect, responsible for the long term health of the code base.  You need to be able to assess characteristics and properties of that code, make decisions about it, and set precedent.  To accomplish this, you need powerful tooling for querying your code, and NDepend, with its CQLinq, provides exactly that.


Introduction to Static Analysis (A Teaser for NDepend)

Rather than the traditional lecture approach of providing an official definition and then discussing the subject in more detail, I’m going to show you what static analysis is and then define it. Take a look at the following code and think for a second about what you see. What’s going to happen when we run this code?

Well, let’s take a look:


I bet you saw this coming. In a program that does nothing but set x to 1, and then throw an exception if x is 1, it isn’t hard to figure out that the result of running it will be an unhandled exception. What you just did there was static analysis.

Static analysis comes in many shapes and sizes. When you simply inspect your code and reason about what it will do, you are performing static analysis. When you submit your code to a peer to have her review, she does the same thing. Like you and your peer, compilers perform static analysis, though automated analysis instead of manual. They check the code for syntax errors or linking errors that would guarantee failures, and they will also provide warnings about potential problems such as unreachable code or assignment instead of evaluation. Products also exist that will check your source code for certain characteristics and stylistic guideline conformance rather than worrying about what happens at runtime and, in managed languages, products exist that will analyze your compiled IL or byte code and check for certain characteristics. The common thread here is that all of these examples of static analysis involve analyzing your code without actually executing it.

Analysis vs Reactionary Inspection

People’s interactions with their code tend to gravitate away from analysis. Whether it’s unit tests and TDD, integration tests, or simply running the application to see what happens, programmers tend to run experiments with their code and then to see what happens. This is known as a feedback loop, and programmers use the feedback to guide what they’re going to do next. While obviously some thought is given to what impact changes to the code will have, the natural tendency is to adopt an “I’ll believe it when I see it” mentality.

We tend to ask “what happened?” and we tend to orient our code in such ways as to give ourselves answers to that question. In this code sample, if we want to know what happened, we execute the program and see what prints. This is the opposite of static analysis in that nobody is trying to reason about what will happen ahead of time, but rather the goal is to do it, see what the outcome is, and then react as needed to continue.

Reactionary inspection comes in a variety of forms, such as debugging, examining log files, observing the behavior of a GUI, etc.

Static vs Dynamic Analysis

The conclusions and decisions that arise from the reactionary inspection question of “what happened” are known as dynamic analysis. Dynamic analysis is, more formally, inspection of the behavior of a running system. This means that it is an analysis of characteristics of the program that include things like how much memory it consumes, how reliably it runs, how much data it pulls from the database, and generally whether it correctly satisfies the requirements are not.

Assuming that static analysis of a system is taking place at all, dynamic analysis takes over where static analysis is not sufficient. This includes situations where unpredictable externalities such as user inputs or hardware interrupts are involved. It also involves situations where static analysis is simply not computationally feasible, such as in any system of real complexity.

As a result, the interplay between static analysis and dynamic analysis tends to be that static analysis is a first line of defense designed to catch obvious problems early. Besides that, it also functions as a canary in the mine to detect so-called “code smells.” A code smell is a piece of code that is often, but not necessarily, indicative of a problem. Static analysis can thus be used as an early detection system for obvious or likely problems, and dynamic analysis has to be sufficient for the rest.


Source Code Parsing vs. Compile-Time Analysis

As I alluded to in the “static analysis in broad terms” section, not all static analysis is created equal. There are types of static analysis that rely on simple inspection of the source code. These include the manual source code analysis techniques such as reasoning about your own code or doing code review activities. They also include tools such as StyleCop that simply parse the source code and make simple assertions about it to provide feedback. For instance, it might read a code file containing the word “class” and see that the next word after it is not capitalized and return a warning that class names should be capitalized.

This stands in contrast to what I’ll call compile time analysis. The difference is that this form of analysis requires an encyclopedic understanding of how the compiler behaves or else the ability to analyze the compiled product. This set of options obviously includes the compiler which will fail on show stopper problems and generate helpful warning information as well. It also includes enhanced rules engines that understand the rules of the compiler and can use this to infer a larger set of warnings and potential problems than those that come out of the box with the compiler. Beyond that is a set of IDE plugins that perform asynchronous compilation and offer realtime feedback about possible problems. Examples of this in the .NET world include Resharper and CodeRush. And finally, there are analysis tools that look at the compiled assembly outputs and give feedback based on them. NDepend is an example of this, though it includes other approaches mentioned here as well.

The important compare-contrast point to understand here is that source analysis is easier to understand conceptually and generally faster while compile-time analysis is more resource intensive and generally more thorough.

The Types of Static Analysis

So far I’ve compared static analysis to dynamic and ex post facto analysis and I’ve compared mechanisms for how static analysis is conducted. Let’s now take a look at some different kinds of static analysis from the perspective of their goals. This list is not necessarily exhaustive, but rather a general categorization of the different types of static analysis with which I’ve worked.

  • Style checking is examining source code to see if it conforms to cosmetic code standards
  • Best Practices checking is examining the code to see if it conforms to commonly accepted coding practices. This might include things like not using goto statements or not having empty catch blocks
  • Contract programming is the enforcement of preconditions, invariants and postconditions
  • Issue/Bug alert is static analysis designed to detect likely mistakes or error conditions
  • Verification is an attempt to prove that the program is behaving according to specifications
  • Fact finding is analysis that lets you retrieve statistical information about your application’s code and architecture

There are many tools out there that provide functionality for one or more of these, but NDepend provides perhaps the most comprehensive support across the board for different static analysis goals of any .NET tool out there. You will thus get to see in-depth examples of many of these, particularly the fact finding and issue alerting types of analysis.

A Quick Overview of Some Example Metrics

Up to this point, I’ve talked a lot in generalities, so let’s look at some actual examples of things that you might learn from static analysis about your code base. The actual questions you could ask and answer are pretty much endless, so this is intended just to give you a sample of what you can know.

  • Is every class and method in the code base in Pascal case?
  • Are there any potential null dereferences of parameters in the code?
  • Are there instances of copy and paste programming?
  • What is the average number of lines of code per class? Per method?
  • How loosely or tightly coupled is the architecture?
  • What classes would be the most risky to change?

Believe it or not, it is quite possible to answer all of these questions without compiling or manually inspecting your code in time consuming fashion. There are plenty of tools out there that can offer answers to some questions like this that you might have, but in my experience, none can answer as many, in as much depth, and with as much customizability as NDepend.

Why Do This?

So all that being said, is this worth doing? Why should you watch the subsequent modules if you aren’t convinced that this is something that’s even worth learning. It’s a valid concern, but I assure you that it is most definitely worth doing.

  • The later you find an issue, typically, the more expensive it is to fix. Catching a mistake seconds after you make it, as with a typo, is as cheap as it gets. Having QA catch it a few weeks after the fact means that you have to remember what was going on, find it in the debugger, and then figure out how to fix it, which means more time and cost. Fixing an issue that’s blowing up in production costs time and effort, but also business and reputation. So anything that exposes issues earlier saves the business money, and static analysis is all about helping you find issues, or at least potential issues, as early as possible.
  • But beyond just allowing you to catch mistakes earlier, static analysis actually reduces the number of mistakes that happen in the first place. The reason for this is that static analysis helps developers discover mistakes right after making them, which reinforces cause and effect a lot better. The end result? They learn faster not to make the mistakes they’d been making, causing fewer errors overall.
  • Another important benefit is that maintenance of code becomes easier. By alerting you to the presence of “code smells,” static analysis tools are giving you feedback as to which areas of your code are difficult to maintain, brittle, and generally problematic. With this information laid bare and easily accessible, developers naturally learn to avoid writing code that is hard to maintain.
  • Exploratory static analysis turns out to be a pretty good way to learn about a code base as well. Instead of the typical approach of opening the code base in an IDE and poking around or stepping through it, developers can approach the code base instead by saying “show me the most heavily used classes and which classes use them.” Some tools also provide visual representations of the flow of an application and its dependencies, further reducing the learning curve developers face with a large code base.
  • And a final and important benefit is that static analysis improves developers’ skills and makes them better at their craft. Developers don’t just learn to avoid mistakes, as I mentioned in the mistake reduction bullet point, but they also learn which coding practices are generally considered good ideas by the industry at large and which practices are not. The compiler will tell you that things are illegal and warn you that others are probably errors, but static analysis tools often answer the question “is this a good idea.” Over time, developers start to understand subtle nuances of software engineering.

There are a couple of criticisms of static analysis. The main ones are that the tools can be expensive and that they can create a lot of “noise” or “false positives.” The former is a problem for obvious reasons and the latter can have the effect of counteracting the time savings by forcing developers to weed through non-issues in order to find real ones. However, good static analysis tools mitigate the false positives in various ways, an important one being to allow the shutting off of warnings and the customization of what information you receive. NDepend turns out to mitigate both: it is highly customizable and not very expensive.


The contents of this post were mostly taken from a Pluralsight course I did on static analysis with NDepend. Here is a link to that course. If you’re not a Pluralsight subscriber but are interested in taking a look at the course or at the library in general, send me an email to erik at daedtech and I can give you a 7 day trial subscription.


Static Analysis, NDepend, and a Pluralsight Course

I absolutely love statistics. Not statistics as in the school subject — I don’t particularly love that branch of mathematics with its binomial distributions and standard deviations and whatnot. I once remarked to a friend in college that statistics-the-subject seemed like the ‘science’ of taking a guess and then rigorously figuring out how wrong you were. Flippant as that assessment may have been, statistics-the subject has hardly the elegant smoothness of calculus or the relentlessly logical pursuit of discrete math. Not that it isn’t interesting at all — to a math geek like me, it’s all good — but it just isn’t really tops on my list.

But what is fascinating to me is tabulating outcomes and gamification. I love watching various sporting events on television and keep track of odd things. When watching a basketball game, I always the amount of a “run” the teams are on before the announcers think to say something like “Chicago is on a 15-4 run over the last 6:33 this quarter.” I could have told you that. In football, if the quarterback is approaching a fist half passing record, I’m calculating the tally mentally after every play and keeping track. Heck, I regularly watch poker on television not because of the scintillating personalities at the tables but because I just like seeing what cards come out, what hands win, and whether the game is statistically normal or aberrant. This extends all the way back to my childhood when things like my standardized test scores and my class rank were dramatically altered by me learning that someone was keeping score and ranking them.

I’m not sure what it is that drives this personality quirk of mine, but you can imagine what happened some years back when I discovered static analysis and then NDepend. I was hooked. Before I understood what the Henderson Sellers Lack of Cohesion in Methods score was, I knew that I wanted mine to be lower than other people’s. For those of you not familiar, static analysis is a way to examine your code without actually executing it and seeing what happens retroactively. Static analysis, (over) simplified, is an activity that examines your source code and makes educated guesses about how it will behave at runtime and beyond (i.e. maintenance). NDepend is a tool that performs static analysis at a level and with an amount of detail that makes it, in my opinion, the best game in town.

After overcoming an initial pointless gamification impulse, I learned to harness it instead. I read up on every metric under the sun and started to understand what high and low scores correlated with in code bases. In other words, I studied properties of good code bases and bad code bases, as described by these metrics, and started to rely on my own extreme gamification tendencies in order to drive my work toward better code. It wasn’t just a matter of getting in the habit of limiting my methods to the absolute minimum in size or really thinking through the coupling in my code base. I started to learn when optimizing to improve one metric led to a decline in another — I learned lessons about design tradeoffs.

It was this behavior of seeking to prove myself via objective metrics that got me started, but it was the ability to ask and answer lots of questions about my code base that kept me coming back. I think that this is the real difference maker when it comes NDepend, at least for me. I can ask questions, and then I can visualize, chart and track the answer in just about every conceivable way. I have a “Moneyball” approach to code, and NDepend is like my version of the Jonah Hill character in that movie.

Because of my high opinion of this tool and its importance in the lives of developers, I made a Pluralsight course about it. If you have a subscription and have any interest in this subject at all, I invite you to check it out. If you’re not familiar with the subject, I’d say that if your interest in programming breaks toward architecture — if you’re an architect or an aspiring architect — you should also check it out. Static analysis will give you a huge leg up on your competition for architect roles, and my course will provide an introduction for getting started. If you don’t have a Pluralsight subscription, I highly recommend trying one out and/or getting one. This isn’t just a plug for me to sell a course I’ve made, either. I was a Pluralsight subscriber and fan before I ever became an author.

If you get a chance to check it out, I hope you enjoy.