DaedTech

Stories about Software

By

Why NDepend Uses Google’s Page Rank

Editorial note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, have a look at type rank and all of the other metrics that NDepend will show you about your code.

I remember my early days of blogging as sort of a comedy of errors.  Oh, don’t get me wrong.  I don’t think those early posts were terrible, since I’d always written a lot.  Rather, I knew very little about everything besides the writing.  For example, I initially thought link spammers were just somewhat daft blog commenters.  I stumbled through various mistakes and learned the art of blogging in fits and starts.  This included my discovery of something called page rank.

Page rank had a relatively involved calculation, but that didn’t interest me at the time.  Instead, I found myself dazzled by some gamification.  Sites like this one would take your domain and a captcha as input and spit out a score from 0 to 10 as output.  That simply, they turned my blogging world upside down.  I now had a score to chase and a means of comparing myself against others.  And I vaguely understood that getting more inbound links would increase my page rank score.

Of course, as an introvert, I struggle with outgoing self-promotion.  Cold outreach to people to see if they’d link to me never seriously occurred to me.  Instead, I reasoned that I would play the long game.  Write enough posts, and the shares start to come.  And then when the shares come, so too will the links.  So I watched my page rank inch slowly upward over time.

The Decline of Page Rank

My page rank ticked upward until one day it didn’t anymore.  Turns out, Google slowly killed it over the course of a number of years.  Ten months passed between its penultimate update and its final one.  So there I stood (metaphorically), waiting for a boost to my rank that would never come.

But why did Google kill page rank?  Wouldn’t such an easily digestible construct continue to help people?  Well, sort of.  Unfortunately, it disproportionately helped the wrong sort of people.

The Google founders developed the concept during their time at Stanford.  Conceptually, the page rank algorithm regards a link from site A to site B as a “vote” for site B, by site A.  But not all pages get to “vote” equally.  The higher a rank the page has, the more worthwhile its vote, creating a conceptual feedback loop.

On the surface, this sounds great, and, in many ways, it was.  As you can imagine, a site with a ton of inbound links, like a government study or a news outlet, would accumulate a great deal of rank.  Since employees would carefully curate such sites, you could put a lot of stock in a site to which they linked (and search engines did).  So in theory, you have a democratized system in which the sites best regarded by the public had the best rank.

But in this theory, no link spammers existed.  If you wanted good page rank, you could produce high quality, popular content.  Or you could pay some shady outfit to carpet bomb blog comment sections with links to your site.  Because of this fatal flaw, page rank eventually dwindled to obscurity.

A Useful Reappropriation of Page Rank

For clarity, understand that Google (probably) still uses some incarnation of this scheme.  But they no longer update the easily consumed public version of it.  They now use it as only one of many factors in what they display in response to searches.  The heyday of comparing page rank scores for sites has come and gone.  But that doesn’t mean we can’t use it elsewhere, and to great efficacy.

For instance, consider applying this to codebases.  Instead of a situation where website A links to website B, imagine a situation where type A refers directly to type B.  Now, imagine your codebase as a (hopefully acyclic) directed graph with edges and nodes.  You start to have an interesting vehicle for reasoning about your codebase.

What would a high rank mean in this context?  Well, relatively high rank for a type would mean that other types tended to refer to it at a high rate.  Types with relatively low (or zero) rank would take no dependencies, existing at the edge of your code.  And the types with the highest rank?  These would be types used by other types with high rank.

Read More

By

What Metrics Should the CIO See?

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, give NDepend a try — download it and see if your code falls in the dreaded Zone of Pain.

I’ve worked in the programming industry long enough to remember a less refined time.  During this time, the CIO (or CFO, since IT used to report to the CFO in many orgs) may have counted lines of code to measure the productivity of the development team.  Even then, they probably understood the folly of such an approach.  But, if they lacked better measures, they might use that one.

Today, you rarely, if ever see that happen any longer.  But don’t take that to mean reductionist measures have stopped.  Rather, they have just evolved.

Most commonly today, I see this crop up in the form of automated unit test coverage.  A CIO or high level manager becomes aware of generally quality and cadence problems with the software.  She may consult with someone or read a study and conclude that a robust, automated test suite will cure what ails her.  She then announces the initiative and rolls out.  Then, she does the logical thing and instruments her team’s process so that she can track progress and improvement with the testing initiative.

The problem with this arises from what, specifically, the group measures and improves.  She wants to improve quality and predictability, so she implements a proxy solution.  She then measures people against that proxy.  And, often, they improve… against that proxy.

If you measure your organization’s test coverage and hold them accountable, you can rest assured that they will improve test coverage.  Improved quality, however, remains largely an orthogonal concern.

The CIO’s Leaky Abstraction

The issue here stems from what I might call a leaky organizational abstraction.  If the CIO came from a software development background, this gets even more thorny.

Consider that a CIO or high level manager generally concerns himself with organizational strategy.  He approves and monitors budgets, signs off on major initiatives, decides on the fate of applications in the application portfolio, etc.  The CIO, in other words, makes business decisions that have a technical flavor.  He deals in profits, losses, revenues, expenses, and organizational politics.

Through that lens, he might look at quality problems across the board as hits to the company’s reputation or drags on the bottom line.  “We’re losing subscribers due to these bugs that happen at each roll out.  We estimate that we lose $10,000 more each month in revenue.”  He would then pull the trigger on business solutions: hiring consultants to fix this problem, realigning his org chart, putting off milestones to focus on quality, etc.

But if he dives into the weeds, he’s shedding a business person’s hat for a techie’s.  “Move over architects,” he says, “I know how you can fix this at the line level.  I call it ‘automated test coverage’ and I order you to start doing it.”  In a traditionally organized corporate structure, the CIO begins doing the job of folks in his organization at his peril.

Read More

By

Entering the Zone of Pain

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, download NDepend and see if your code falls into the infamous Zone of Pain.

Years ago, when I first downloaded a trial of NDepend, I chuckled when I saw the “Abstractness vs. Instability” graph.  The concept itself does not amuse, obviously.  Rather, the labels for the corners of the graph provide the levity: “zone of uselessness” and “zone of pain.”

When you run NDepend analysis and reporting on your codebase, it generates this graph.  You can then see whether or not each of your assemblies falls within one of these two dubious zones.  No doubt people with NDepend experience can recall seeing a particularly hairy assembly depicted in the zone of pain and thinking, “I knew it!”

But whether you have experienced this or not, you should stop to consider what it means to enter the zone of pain.  The term amuses, but it also informs.  Yes, these assemblies will tend to annoy developers.  But they also create expensive, risky churn inside of your applications and raise the cost of ownership of the codebase.

Because this presents a real problem, let’s take a look at what, exactly, lands you in the zone of pain and how to recover.

Read More

By

Alternatives to Lines of Code

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, download NDepend and give it a try — see if your code lies in the Zone of Pain.

It amazes me that, in 2016, I still hear the occasional story of some software team manager measuring developer productivity by committed lines of code (LOC) per day.  In fact, this reminds me of hearing about measles outbreaks.  That this still takes place shocks and creates an intense sense of anachronism.

I don’t have an original source, but Bill Gates is reputed to have offered pithy insight on this topic.  “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”  This cuts right to the point that “more and faster” does not equal “fit for purpose.”  You can write an awful lot of code without any of it proving useful.

Before heading too far down the management criticism rabbit hole, let’s pull back a bit.  Let’s take a look at why LOC represents such an attractive nuisance for management.

For many managers, years have passed since their days of slinging code (if those days ever existed in the first place).  So this puts them in the unenviable position of managing something relatively opaque to them.  And opacity runs afoul of the standard management playbook, wherein they take responsibility for evaluating performances, forecasting, and establishing metric-based incentives.

The Attraction of Lines of Code

Let’s consider a study in contrasts.  Imagine that you took a job managing a team of ditch diggers.  Each day you could stand there with your clipboard, evaluating visible progress and performance.  The diggers that moved the most dirt per hour would represent your superstars, and the ones that tired easily and took many breaks would represent the laggards.  You could forecast milestones by observing yards dug per day and then extrapolating that over the course of days, weeks, and months.  Your reports up to your superiors practically write themselves.

But now let’s change the game a bit.  Imagine that all ditches were dug purely underground and that you had to remain on the surface at all times.  Suddenly accounts of progress, metrics, and performance all come indirectly.  You need to rely on anecdotes from your team about one another to understand performance.  And you only know whether or not you’ve hit a milestone on the day that water either starts draining or stays where it is.

If you found yourself in this position suddenly, wouldn’t you cling to any semblance of measurability as if it were a life preserver?  Even if you knew it was reductionist, wouldn’t you cling?  Even if you knew it might mislead you?  Such is the plight of the dev manager.

In their world of opacity, lines of code represents something concrete and tangible.  It offers the promise of making their job substantially more approachable.  And so in order to take it away, we need to offer them something else instead.

Read More

By

How to Get an Edge As a Consultant

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, have a look around at some of the documentation around code metrics and queries.

I’ve made no secret of, and even frequently referred to my consulting practice, including aspects of IT management consulting.  In short, one of my key offerings is to help strategic decision makers (CIOs/CTOs, dev managers, etc) make tough or non-obvious calls about their applications and codebases.  Can we migrate this easily to a new technology, or should we start over?  Are we heading in the right direction with the new code that we’re writing?  We’d like to start getting our codebase under test, but we’re not sure how (un) testable the code is — can you advise?

This is a fairly niche position that’s fairly high on the organizational trust ladder, so it’s good work to be had.  Because of that, I recently got a question along the lines of, “how do you get that sort of work and then succeed with it?”  In thinking about the answer, I realized it would make a good blog post, specifically for the NDepend blog.  I think of this work as true consulting, and NDepend is invaluable to me as I do it.

Before I tell you about how this works for me in detail, let me paint a picture of what I think of as a market differentiator for my specific services.  I’ll do this by offering a tale of two different consulting pitfalls that people seem to fall into if tasked with the sorts of high-trust, advisory consulting engagements.

LikeABoss

Read More