Agile Release Train Metrics

I’ve had an On-again, Off-again love-hate relationship with software metrics throughout my career. The On part is:

  • of course we have to measure ourselves, lest we a) fail to advance the science of software engineering in any credible way, and b) fail to provide any professional, quantitative input to the fiduciaries and key stakeholders who ultimately determine the economic benefit of our work (and indeed decide to pay us, or not)

The Off part is:

  • a) why don’t any of the metrics we’ve traditionally applied actually seem to work? b) can’t all of these be used by management to drive behaviors that may eventually be counterproductive? and/or c) can’t they be manipulated by teams to some similarly ill result?

But the On part of this debate within myself continually informs me that “measure we must,” particularly if we want to be credible with our agile model in the enterprise setting. So, I took my best shot at project, process, and an enterprise “balanced scorecard approach” to metrics in Chapter 22 of Scaling Software Agility. I reviewed it today and it still seems credible (at least to me).

Recently, however, I’ve been repeatedly faced with suggesting metrics that can work well in the context of each PSI/release in an Agile Release Train. That’s a simpler discussion, because there, context is known, and the measurement problem is focused on a specific program, with specific teams, in a specific technical and business context. To that end, I’ve recently been presenting the following set of metrics to the release train programs and Agile Working Groups I’m involved in.

A set of reasonable metrics for an agile release train

At each PSI, each team self-assesses and reports on the above, and the aggregate can be rolled up into a program-level summary, which should establish some sense of health and progress for that train.

Most of these are fairly obvious to agile teams. I described the least obvious one, “percent of business value achieved,” in this earlier post.

Hope this helps at least somebody.


A Lean, Economic Approach to Prioritizing Work

Last year, when I was writing Agile Software Requirements, I was researching various strategies (algorithm is too strong a term) for prioritizing user stories, and perhaps more importantly the features of a solution, and even higher on the economic food chain, the portfolio epics that drive the solution roadmap. (That’s where the real economic leverage comes – from the portfolio level.)

I admit that I was never much impressed by what I had previously read or written on the topic of prioritizing requirements (ok, so delighters are better than satisfiers, but who can say which is which, or which MMF to do first?), so I wanted to take a fresh look.

About that time, I came across Reinertsen’s latest book, Principles of Product Development Flow, and I quickly gravitated to his lean, flow-based model for prioritizing work, based on the Cost of Delay, which he summarizes in a model he calls Weighted Shortest Job First (WSJF). I was also able to attend one of his workshops, so I had a chance to ask a few relevant questions, which improved my understanding. While Reinertsen’s book is not about software per se, the lean economic principles apply extremely well in agile, so prior to writing the book, I started using the WSJF method in various classes and consulting engagements.

So that you don’t have to go through all the laborious history and derivations that I seem to bludgeon readers with, I’ve summarized below the model that evolved over time. (However, if you are serious about software economics, you’ll need to understand the fuller model.) Here’s the synopsis picture from one of my current presentations:

Here’s what we recommend for Cost of Delay for software:

And finally, here’s the spreadsheet I like to use for prioritizing jobs (in our case – epics, features, and stories).

“Simply” you rate each job relative to the other jobs on each of the three parameters, divide by the size of the job (which is equivalent to the length of the job in most cases) (oops?… your backlog is estimated, right?), and come up with a number that rates the job’s priority. (Note: because of the way we’ve set up the relative rating system, higher numbers have higher economic priority.)
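The spreadsheet calculation above can be sketched in a few lines of code. This is a minimal illustration, not the actual spreadsheet: the three Cost of Delay parameter names, the 1–10 rating scale, and the sample jobs are all assumptions for the sake of the example.

```python
# Minimal sketch of the WSJF calculation: Cost of Delay divided by job size.
# Parameter names and the sample jobs below are illustrative assumptions;
# all ratings are relative to the other jobs in the backlog, not absolute.

def wsjf(value, time_criticality, risk_opportunity, size):
    """Weighted Shortest Job First: Cost of Delay / job size."""
    cost_of_delay = value + time_criticality + risk_opportunity
    return cost_of_delay / size

jobs = [
    # (name, business value, time criticality, risk/opportunity, size)
    ("Feature A", 8, 3, 2, 5),
    ("Epic B",    9, 9, 8, 13),
    ("Story C",   3, 8, 1, 2),
]

# Higher WSJF means higher economic priority, so sort descending.
ranked = sorted(jobs, key=lambda j: wsjf(*j[1:]), reverse=True)
for name, *params in ranked:
    print(f"{name}: WSJF = {wsjf(*params):.2f}")
```

With these made-up ratings, the small Story C (WSJF 6.00) outranks the big Epic B (WSJF 2.00), which illustrates the point below about big jobs having to be split to compete.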

And guess what: while the results often surprise people, it seems to work really well. After a bit, people really get it, and then they realize they can use the same model to prioritize work, based on lean economics, at any level of the Big Picture (Scaled Agile Delivery Model). This creates a common basis for economic decision making throughout the enterprise. Potentially, THAT is a pretty big deal.

One advantage of the model is that really-big-important jobs (like architectural and business epics) have to be divided into smaller-pretty-important jobs, in order to make the cut against easier ways of making money (i.e., small, low risk features that your customers are willing to pay for now). Now that’s agile thinking.

Also, a subtlety of the model is that we do NOT have to determine the absolute value of any of these numbers, as that’s a fool’s errand. Rather, we only need to rate the parameters of the job against other jobs from the backlog. After all, what credible agile program or team doesn’t have a robust backlog to choose from?

My colleague and agile ninja, Chad Holdorf, has been putting this same model to work as well, and he recently did a video blog post on the topic. You can find it here.

You can also find a deeper explanation in ASR (but then, of course, you’d likely have to buy a copy).

Note: I have had some questions about the model, particularly as to whether or not it is overkill for user stories at sprint boundaries. After all, there the size of the job is about the same (all stories are sized to fit in an iteration, typically 3–5 points), and the time value of the job may have already been determined by higher-level work prioritization, including dependencies with other teams. If that is the case, the model deprecates to a simpler equation:

* Deprecated WSJF model, assuming story size is about the same and that the stories have reached an iteration boundary based on external factors or other logical sequencing. If that is not the case, use the full model!

which probably looks pretty familiar to many of you.
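In code, the reduced model is simply the full one with the division dropped: when story sizes are roughly equal, dividing by a constant doesn’t change the ordering, so priority reduces to the Cost of Delay alone. The parameter names and sample stories here are illustrative assumptions, as before.

```python
# Sketch of the deprecated (equal-size) model: with story sizes roughly
# equal, WSJF ordering reduces to ranking by Cost of Delay alone.
# Names and ratings below are illustrative, not from the original post.

def cost_of_delay(value, time_criticality, risk_opportunity):
    # Cost of Delay as the sum of the three relative ratings
    return value + time_criticality + risk_opportunity

stories = [
    # (name, business value, time criticality, risk/opportunity)
    ("Story X", 5, 2, 1),
    ("Story Y", 4, 7, 3),
    ("Story Z", 6, 1, 2),
]

# No division by size needed; just sort by Cost of Delay, descending.
ranked = sorted(stories, key=lambda s: cost_of_delay(*s[1:]), reverse=True)
```

Here the time-critical Story Y ranks first, which matches the intuition that, with size out of the picture, urgency and value alone drive the sequence.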