DaedTech

Stories about Software

By

Static Analysis and The Other Kind of False Positives

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at the NDepend site.  If you’re a fan of static analysis tools, do yourself a favor and take a look at the NDepend offering while you’re there.

A common complaint and source of resistance to the adoption of static analysis is the idea of false positives.  And I can understand this.  It requires only one case of running a tool on your codebase and seeing 27,834 warnings to color your view on such things forever.

There are any number of ways to react to such a state of affairs, though there are two that I commonly see.  These are as follows.

  1. Sheepish, rueful acknowledgement: “yeah, we’re pretty hopeless…”
  2. Defensive, indignant resistance: “look, we’re not perfect, but any tool that says our code is this bad is nitpicking to an insane degree.”

In either case, the idea of false positives carries the day.  With the first reaction, the team assumes that the tool’s results are, by and large, too advanced to be of interest.  In the second case, the team assumes that the results are too fussy.  In both of these, and in the case of other varying reactions as well, the tool is providing more information than the team wants or can use at the moment.  “False positive” becomes less a matter of “the tool detected what it calls a violation but the tool is wrong” and more a matter of “the tool accurately detected a violation, but I don’t care right now.”  (Of course, the former can happen as well, but the latter seems more frequently to serve as a barrier to adoption and what I’m interested in discussing today).

Is this a reason to skip the tool altogether?  Of course not.  And, when put that way, I doubt many people would argue that it is.  But that doesn’t stop people form hesitating or procrastinating when it comes to adoption.  After all, no one is thinking, “I want to add 27,834 things to the team’s to do list.”  Nor should they — that’s clearly not a good outcome.

Fumigation

With that in mind, let’s take a look at some common sources of false positives and the ways to address them.  How can you ratchet up the signal to noise ratio of a static analysis tool so that is valuable, rather than daunting?

Read More

By

Managing Risk via Static Analysis

Editorial Note: I originally wrote this post for the NDepend blog.  Check out the original here, at their site.  While you’re there, take a look at the features of NDepend and download a trial, if you’re so inclined.

When software developer talk about static analysis, it’s often in the context of craft improvement.  Ask most developers in a group about static analysis tools and you’ll get a range of responses, many of which will be fueled by some degree of passion, resulting from past experience.  From here, the conversation will tend to dive into the weeds for any non-technical stakeholder that might be listening; if you’re not a programmer, you probably don’t have much of an opinion as to whether or not cyclomatic complexity of 5 is acceptable for a method.

As a result, static analysis tends to get pegged heavily as a purely a matter of shop.  The topic tends to be pretty opaque to management because developers present it to them in terms of “this will make us better and the code better.”  Management that trusts the developers will tend to agree to the purchase with a sentiment of, “okay, I’ll take your word for it.”  Management that is more skeptical says, “maybe next year if our numbers are good.”

I find this to be a shame because it’s a lost opportunity, even when management agrees.

Static analysis most certainly is a way for developers to improve their craft and their codebases.  But, in the hands of an architect or team lead that truly understands the business and works well with management, static analysis can be an excellent tool for managers, even if the use has to be a management-architect team effort.

How so?  Well, there are a lot of ways, but the one I’d like to mention today is risk management.  As the title would imply, managing risk tends to be the purview of people whose title is manager.  Sure, the developers have responsibility for this, but their primary charter is to build stuff — management exists specifically to engage in planning activities, including the crucial concern of risk management.

How does this work?  Well, I’ll show you, and I’ll do it by explaining the sort of highly technical things that static analysis could catch in highly non-technical and readable ways.  These are all going to be operational risks — static analysis can’t help you if you’re building the wrong product or badly under-staffing your projects.  But it can help you avoid landmines in your software.  If you’re a manager, allow me, for the moment, to serve as your “business-savvy architect.”

Architect

Read More

By

How to Add Static Analysis to Your Process

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’e there, take a look at the other posts and download a trial of NDepend, if you’re so inclined.

As a consultant, one of the more universal things that I’ve observed over the years is managerial hand-waving.  This comes in a lot with the idea of agile processes, for instance.  A middle manager with development teams reporting into him decides that he wants to realize the 50% productivity gains he read about in someone Gartner article, and so commands his direct reports or consultant partners to sprinkle a little agile magic on his team.  It’s up to people in a lower paygrade to worry about the details.

To be fair, managers shouldn’t be worrying about the details of implementations, delegating to and trusting in their teams.  The hand-waving more happens in the assumption that things will be easy.  It’s probably most common with “let’s be agile,” but it also happens with other things.  Static analysis, for example.

Lumbergh

If you’ve landed here, it may be that you follow the blog or it may be that you’ve googled something like “how to get started with static analysis.”  Either way, you’re in luck, at least as long as you want to hear about how to work static analysis into your project.  I’m going to talk today about practical advice for adding this valuable tool to your tool chest.  So, if you’ve been meaning to do this for a while, or if some hand-waving manager staged a drive-by, saying, “we should static some analysis in teh codez,” this should help you get started.

What is Static Analysis (Briefly)?

You can read up in great detail if you want, but I’ll summarize by saying that static analysis is analysis performed on a codebase without actually executing the resultant compiled or interpreted code.  Most commonly, this involves some kind of application (e.g. NDepend) that takes your source code files as input and produces interesting output by running various analyses on the code in question.

Let’s take a dead simple example.  Maybe I write a static analysis tool that simply looks through your code for the literal string “while(true)” and, if it finds it, dumps, “ruh-roh” to the console.  I’m probably not going to have investors banging down my door, but I have, technically, written a static analysis utility.

Read More

By

How to De-Brilliant Your Code

Three weeks, three reader questions.  I daresay that I’m on a roll.  This week’s question asks about what I’ll refer to as “how to de-brilliant your code.”  It was a response to this post, in which I offered the idea of a distinction between maintainable code and common code.  The lead-in premise was that of supposedly “brilliant” code, which was code that was written by an ostensible genius and that no one but said genius could understand.  (My argument was/is that this is usually just bad code written by a self-important expert beginner).

The question is, as follows, verbatim.

In your opinion, what is the best approach to identify que “brilliant” ones with hard code, to later work on turn brilliant to common?

Would be code review the best? Pair programming (seniors could felt challenged…)?

Now, please forgive me if I get this wrong, but because of the use of “que” where an English speaker might say “which”, I’m going to infer that the question submitter speaks Spanish as a first language.  I believe the question is asking after the best way to identify and remediate pockets of ‘brilliant’ code.  But, because of the ambiguity of “ones” it could also be asking about identifying humans that tend to write ‘brilliant’.  Because of this, I’ll just go ahead and address both.

Einstein Thinking Public Static Void

Find the Brilliant Code

First up is identifying brilliant code, which shouldn’t be terribly hard.  You could gather a quorum of your team together and see if there are pockets of code that no one really understands, or else you could remove the anchoring bias of being in a group by having everyone assess the code independently.  In either case, a bunch of “uh, I don’t get it” probably indicates ‘brilliant’ code.  The group aspect of this also serves (probably) to prevent against an individual not understanding simply by virtue of being too much of a language novice (“what’s that ++ after the i variable,” for instance, indicates the problem is with the beholder rather than the original developer).

But, even better, ask people to take turns explaining what methods do.  If people flounder or if they disagree, then they obviously don’t get it, self-reporting notwithstanding.  And having team members not understanding pockets of code is an ipso facto problem.

An interesting side note at this point is whether this illegible code is “brilliant” or “utter spaghetti” is going to depend a lot more on knowledge of who wrote it than anything else.  “Oh, Alice wrote that — it’s probably just too sophisticated for our dull brains.  Oh, wait, you were reading the wrong commit, and it’s actually Bob’s code?  Bob’s an idiot — that’s just bad code.”

De-Brilliant The Codebase

Having identified the target code for de-brillianting, flag it somewhere for refactoring: Jira, TFS, that spreadsheet your team uses, whatever.  Just make a note and queue it as work — don’t just dive in and start mucking around in production code, unless that’s a team norm and you have heavy test coverage.  Absent these things, you’re creating risk without buy in.

Leave these things in the backlog for prioritization on the “eventually” pile, but with one caveat.  If you need to be touching that code for some other reason, employ the boy-scout rule and de-brilliant it, as long as you’re already in there.  First, though, put some characterization tests around it, so that you have a chance to know if you’re breaking anything.  Then, do what you need to and make the code easy to read; when you’re done, the same, “tell me what this does” should be easy to answer for your teammates.

De-brillianting the codebase is something that you’ll have to chip away at over the course of time.

De-Brilliant The Humans

I would include a blurb on how to find the humans, but that should be pretty straightforward — find the brilliant code and look at the commit history.  You might even be able to tell simply from behavior.  People that talk about using 4 design patterns on a feature or cramming 12 statements into a loop condition are prime candidates.

The trick isn’t in finding these folks, but in convincing them to stop it.  And that is both simple to understand and hard to do.

During my undergrad CS major many years ago, I took an intro course in C++.  At one point, we had to do a series of pretty mundane, review exercises that would be graded automatically by a program (easy things like “write a for loop”).  Not exactly the stuff dreams are made of, so some people got creative.  One kid removed literally every piece of white space from his program, and another made some kind of art with indentations.  When people are bored, they seek clever things to do, and the result is ‘brilliant’ code.

The key to de-brillianting thus lies in presenting them with the right challenge, often via constraints or restrictions of some kind.  They do it to themselves otherwise — “I’ll write this feature without using the if keyword anywhere!”

The Right Motivation

Like I said, a simple solution does not necessarily imply an easy solution.  How does one challenge others into writing the kind of straightforward code that is readable and maintainable?

Code review/pairing presents a possible solution.  Given the earlier, “can others articulate what this does” metric, the team can challenge programmer-Einstein to channel that towering intellect toward this purpose.  That may work for some, but other brilliant programmers might not consider that to be a worthwhile or interesting challenge.

In that case, automated feedback through static analysis might do the trick.  FXCop, NDepend, SonarQube, and others can be installed and configured to steer things in the general direction of readability.  Writing code that complies with all warning thresholds of such tools actually presents quite a challenge, since so much of programming is about trade-offs.  Now, a sufficiently determined clever coder could still invent ways to write hard-to-read code, but that would be a much more difficult task when he’d get slapped by the tool for chaining 20 Linq statements onto a single line or whatever.

Of course, probably the best solution is to work with the sort of people who recognize that demonstrating their cleverness takes a backseat to being a professional.  They can do that in their spare time.

If you have a question you’d like to hear my opinion on, please feel free to submit.

By

With Code Metrics, Trends are King

Editorial Note: I originally wrote this post for the NDepend blog.  Head over there to check out the original.  NDepend is a tool that’s absolutely essential to my IT management consulting practice, and it’s a good find for any developer and aspiring architects in particular.  Give it a look.

Here’s a scene that’s familiar to any software developer.  You sit down to work with the source code of a new team or project for the first time, pull the code from source control, build it, and then notice that there are literally thousands of compiler warnings.  You shudder a little and ask someone on the team about it, and he gives a shrug that is equal parts guilty and “whatcha gonna do?”  You shake your head and vow to get the warning situation under control.

Fumigation

If you’re not a software developer, what’s going on here isn’t terribly hard to understand.  The compiler is the thing that turns source code into a program, and the compiler warning is the compiler’s way of saying, “you’ve done something icky here, but not icky enough to be a show-stopping error.”  If the team’s code has thousands of compiler warnings, there’s a strong likelihood that all is not well with the code base.  But getting that figure down to zero warnings is going to be a serious effort.

As I’ve mentioned before on this blog, I consult on different kinds of software projects, many of which are legacy rescue efforts.  So sitting down to a new (to me) code base and seeing thousands of warnings is commonplace for me.  When I point the runaway warnings out to the team, the observation is generally met with apathetic resignation, and when I point it out to management, the observation is generally met with some degree of shock.  “Well, let’s get it fixed, and why is it like this?!”  (Usually, they’re not shocked by the idea that there are warts — they know that based on the software’s performance and defect counts — but by the idea that such a concrete, easily metric exists and is being ignored.)

 

Read More