An Agile Illusion: How That Nice Backlog is Actually Decreasing Your Team’s Agility

The Queue at Starbucks

I was sitting at a Starbucks in Munich – a rainy, snowy Saturday morning – watching the queue of people looking for their caffeine kick-start. (Same reason I was there.) For some reason, the queue of people ordering coffee reminded me of the question teams ask about “how big” and “how well formed” (or how well elaborated) their product backlog needs to be. Of course, I always have the consultant’s standard ready answer of “it depends”, hopefully followed by a more meaningful discussion.

I’m writing this post to force myself to think through and document a better answer; one that includes the “what it depends on” part.

The size of the team’s backlog, the rate of backlog processing, and the ultimate rate of value delivery are really a problem of queuing theory, just like at Starbucks. At Starbucks, I was trying to do Little’s Law in my head, but to not very good effect, so I had to write it out when I came back to my hotel. Little’s Law, the general-purpose theory for queuing and processing problems, is one of the fundamental laws of lean. It states:

Wq = Lq / λ

where:

Wq is the average time in the queue for a standard job
Lq is the average number of things in the queue to be processed
λ (Lambda), the denominator, is the average processing rate for jobs in the queue

While I was drinking my latte, I noticed that the queue at the ordering counter varied from zero to as many as 12 people in line. With my iPhone timer, I was methodically trying to time the average time that it took the single barista to serve each customer. However, when the queue got long, as if by magic, another barista appeared from the back somewhere (Starbucks likely understands queuing theory). That confused my timing and I lost track.

However, let’s assume it takes about 45 seconds to serve a customer, or a service rate of about 1.33 customers per minute. We can use Little’s Law to calculate the average wait of someone in the queue on this particular Saturday morning as follows:

Wq = 6 (Lq, the average number of customers in line over the period) / 1.33 (λ, the number of customers served per minute)

So Wq is 4.5 minutes in this case (not too bad, even if you need that quick fix).
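The arithmetic can be checked with a few lines of Python, using the same estimates as above (a 6-person average queue and a 45-second service time):

```python
def littles_law_wait(queue_length, processing_rate):
    """Little's Law: average wait Wq = Lq / lambda."""
    return queue_length / processing_rate

# Starbucks example: 6 customers in line on average,
# one barista serving a customer every 45 seconds.
wq = littles_law_wait(queue_length=6, processing_rate=60 / 45)
print(round(wq, 1))  # average wait in minutes -> 4.5
```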

It’s important to note that this is the average case, and your wait could be shorter (I was the third person in line when I went in, and I only had to wait a minute or two to order) or longer (the person at the back of the 12-person queue had to wait about 9 minutes; yikes, he could easily go somewhere else).

Little’s Law and an Agile Team’s Backlog

Ok, fair enough, but what does the line at Starbucks have to do with agile development and the team’s requirements backlog? It is the same problem. Let’s assume:

  • a single agile/scrum team, working in two-week iterations
  • the team averages about 25–30 story points per iteration, or a story completion rate of about 8 stories per iteration
  • the team is justifiably proud of how well it is maintaining its backlog, which contains about 100 stories

The question is, how agile is this team? In other words, how long does it take, on average, for a new requirement (story) to get to the end of an iteration, where it can start to deliver value (assuming no delay in release, a topic for later)? First, to calculate the processing rate per story, we convert 8 stories per iteration into .125 iterations per story, and plug this data into Little’s Law:

Wq = 100 stories × .125 iterations per story = 12.5 iterations

The answer is 12.5 iterations to get into the sprint, plus two weeks to get out, or 27 weeks on the average. More than half of a year! And, if the backlog varies in size, it could take even longer (remember the guy who had to wait nine minutes).
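The same calculation, spelled out in Python with the hypothetical team’s numbers:

```python
# Hypothetical team: 100 stories queued, completing 8 stories
# per two-week iteration.
backlog = 100          # Lq, stories in the backlog
rate = 8               # lambda, stories completed per iteration
iteration_weeks = 2

wait_iterations = backlog / rate  # Little's Law: 12.5 iterations in the queue
# ...plus one more iteration for the story to get through the sprint itself.
wait_weeks = wait_iterations * iteration_weeks + iteration_weeks
print(wait_weeks)  # -> 27.0 weeks, on average
```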

Wow! If it takes a single team a half a year, on average, to deliver a new requirement to the customer, that sure doesn’t seem very agile to me!

Plus, in the enterprise, there are multiple teams with interdependencies, and the individual results of the teams have to be aggregated, packaged, and validated in some kind of release envelope before distribution, so it can take longer still. Therefore, it’s understandable when we see an enterprise with 20, 50, or even 100 reasonably agile teams that still takes 300-500 days to move a new requirement from customer request to delivery.

So yes, it may be understandable, but it’s not acceptable. Let’s see what we can do about it.

Applying Little’s Law to Increase Agility and Decrease Time-to-Market

The formula is not complicated. If we are going to improve (decrease) time to market, we have to either increase the denominator or decrease the numerator.

And of course, if we can do both we will achieve even better results. Let’s look at each opportunity.

Increasing Lambda, the Rate of Story Completion.

Increasing the rate of story completion, and thereby the overall rate of value delivery, is the legitimate goal of every agile team. Of course, if we could simply add resources, we could probably increase Lambda, but for the purpose of this discussion, let’s assume that is impractical. Besides, while it’s the easy way out of the argument, it increases the cost of the value created, and decreases ROI. So, while you might get there faster, you may not make any money when you do. Let’s work within the constraints of our archetypical team and see what we can do.

The primary mechanism for increasing the rate of story completion is the team’s inspect and adapt process, whereby the teams review the results of each iteration and pick one or two things they can do to improve the velocity of the next. This is the long-term mission; a journey measured in small steps, and there is no easy mathematical substitute for such improvements. These improvements include better coding practices, unit testing and unit testing coverage, functional test automation, continuous integration, and other enhanced agile project management and software engineering practices.

In my experience, however, two primary areas stand out as the places where teams can get the fastest increase in Lambda: first, gaining a better understanding of the story itself before coding begins, and second, decreasing the size of the user stories contained in the backlog.

Gaining a Better Understanding of the Story – Acceptance Test-Driven Development

The fact is that the overall velocity of the team is not typically limited by the team’s ability to write, or even integrate, code. Instead, it is gated by the team’s ability to understand what specific code they need to write, as well as to avoid code they do not need to write. Doing so involves having a better understanding of the requirements of the story, before coding it.

However, this must be done on a just-in-time basis, just prior to the iteration boundary, or else the team’s backlog will get wider and the team will have too much requirements inventory. Some of it will likely decay before they get to it. However, a wider backlog is not nearly as bad as a longer backlog. The worst case is that a few team members have gone too far, too early, in elaborating a few backlog items, but it won’t slow value delivery nearly so badly as would a longer backlog. It’s a bit of waste, but it doesn’t really drive Little’s Law.

Therefore, once a story has reached a priority whereby it will be implemented in the next iteration or two, time spent in elaborating the story will pay dividends. Often, this is described as Acceptance Test-Driven Development (ATDD), and, fortunately, it’s a little easier for teams to intellectualize and adopt than code-level TDD. ATDD involves two things: 1) writing better stories and 2) establishing the acceptance tests for the story before coding begins.

Fortunately, we don’t have to cover all that here, as I described writing better stories in the User Story whitepaper and how to develop Acceptance Criteria in the draft book chapter.

Increasing Lambda with Smaller Stories

If all the people ordering at Starbucks had ordered a tall, black coffee, rather than a Venti, non-fat, double shot, half-caff, no foam, vanilla latte, along with a heated bagel on the side, the length of the queue and the wait from the back of the line would have been much shorter. Small jobs just go through a system faster than large ones.

In the User Story whitepaper, I described the benefits of smaller user stories at length, and I won’t repeat them here. However, it’s worth pointing out that decreasing the size of user stories has both a linear and an exponential effect on Lambda, both of which are positive:

Linear effect. Smaller user stories are just that, smaller. They go through the iteration faster so teams can implement and test more small user stories in an iteration than large ones. And, while the total value of a small story can’t be as big as a large story, the incremental delivery hastens the feedback loop, improving quality and fitness for use.

Exponential Effect. Because they are smaller and less complex, small user stories decrease the coding and testing implementation effort exponentially. The coded functions are smaller and less complex, and the number of new paths that must be tested also decreases exponentially with story size.

So, even if the length of the backlog remains the same, a combination of better defined and smaller user stories has a positive effect on value delivery time.

Now that we’ve seen two ways to decrease time to market (Wq) by increasing the denominator (Lambda) of our equation, let’s look at the other half of our equation and see what opportunities we find there.

Decreasing Lq, the Length of the Queue

Clearly, increasing Lambda, the denominator of Little’s Law, is a prime opportunity for the team to increase agility and decrease time to market. Every truly agile team continuously commits to doing so.

However, there is an even faster way to decrease time-to-market while the team works on continuously improving its development practices: forcing a limit on the length of the queue.

We can see in our formula that decreasing queue size produces a directly proportional decrease in the wait time, Wq. Therefore, if we cut our queue size in half, we halve our time to market without taking any further action. In the ultimate case, if we reduce the queue size to below eight or so, we could deliver every story in the next iteration.
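The proportionality is easy to see numerically. Holding the example team’s completion rate fixed, each cut in backlog size cuts the average wait by the same factor:

```python
rate = 8  # stories completed per iteration (unchanged)

# Little's Law: Wq = Lq / lambda, for ever-shorter backlogs.
for backlog in (100, 50, 25, 8):
    wait = backlog / rate
    print(f"backlog {backlog:3d} -> average wait {wait:6.3f} iterations")
```

At a backlog of about eight stories, the average wait drops to a single iteration: every story in the queue can be delivered in the next sprint.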

As if this weren’t enough motivation, in his new book, The Principles of Product Development Flow, Don Reinertsen points out that there are a number of additional reasons why long queues are fundamentally bad in the product development process:

  1. Increased Risk. While a story is in the queue, there is some probability that the market or customer has changed their mind and the story is no longer valuable. The longer the queue, the higher the probability. When we develop an unneeded story, we waste valuable resources. Worse, the unneeded story has displaced some other story that would have had economic benefit.
  2. Increased variability. With a long queue, there is always way more than enough work to do so the team takes on everything they possibly can. Management supports this by driving teams to high utilizations (95% or better). In turn, high utilization drives high variability, as Figure 1 illustrates. High variability decreases reliability, causes stress in the organization, and perversely, drives even higher utilization due to fire fighting.
  3. Increased costs. Every story in the team’s queue was put in there somehow by someone. That takes labor. Once it’s in there, the team has to continue to account for it, prioritize it, and rearrange it as higher priority items come into the queue.
  4. Reduced Quality. The longer the queue, the longer it is before we get feedback from the customer (or product owner proxy). The longer the feedback, the more other developers may have invested in the nonconforming story, and the more expensive it is to rework.
  5. Reduced motivation and initiative. If it’s going to be a long time before a customer sees a story in the middle of the queue, there is little sense of urgency for the product owner or team. If they are going to see it soon, we better worry about getting it right, right now.
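Reinertsen’s first point, increased risk, can be illustrated with a simple model. Suppose each waiting story has some small, constant chance per week of going stale (the 2% per week figure here is purely an assumption for illustration, not a number from the book):

```python
import math

DECAY_PER_WEEK = 0.02  # assumed chance per week that a waiting story goes stale

def prob_still_valuable(wait_weeks, decay=DECAY_PER_WEEK):
    """Probability a story is still wanted after waiting,
    under a simple exponential-decay assumption."""
    return math.exp(-decay * wait_weeks)

for wait in (2, 12, 27):
    print(f"{wait:2d} weeks in queue -> {prob_still_valuable(wait):.0%} still valuable")
```

Whatever the actual decay rate, the shape is the same: the longer a story sits in the queue, the more likely the team is building something nobody wants anymore.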

For these reasons, many teams have simply decided to place arbitrary limits on the size of the backlog, sized as necessary to create the desired time-to-market or internal feedback loop. Once the queue is full, they quit even thinking about new stories until there is room for more stories in the queue.

Software Kanban Systems

Indeed, the devastating effect of long software queue sizes is the primary economic and philosophical principle that drives the current lean software kanban movement.

As the (awfully cutely named) Limited WIP Society describes:

  • Kanban manages the flow of units of value through the use of work in process (WIP) limits.
  • Kanban manages these units of value through the whole system, from when they enter, until they leave.
  • By limiting WIP, Kanban creates a sustainable pipeline of value flow.
  • Further, limiting WIP provides a mechanism to demonstrate when there is capacity for new work to be added, thereby creating a pull system.
  • Finally, the WIP Limits can be adjusted and their effect measured as the Kanban System is continuously improved.

At a further level of sophistication, David Anderson describes utilizing various classes of service as follows:

  • Expedite: unacceptable cost of delay
  • Fixed delivery date: step-function cost of delay
  • Standard class: linear cost of delay
  • Intangible: intangible cost of delay

Each class of service has its own work in process limits, which can be adjusted based on current context, and associated with different management policies. With kanban, teams have a structurally sound basis for decreasing Lq, the length of the backlog, and can thereby reap the benefits accordingly.
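The mechanics of a WIP-limited pull system are simple enough to sketch in code. In this minimal sketch (the class and limit are illustrative, not taken from Anderson’s book), a new item is admitted only when the in-process count is below the limit, and finishing work is what frees capacity for the next pull:

```python
from collections import deque

class KanbanBoard:
    """Minimal WIP-limited pull system: work is admitted only when
    there is capacity, which creates a pull signal upstream."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.in_process = deque()
        self.done = []

    def try_pull(self, item):
        """Admit the item only if WIP is below the limit."""
        if len(self.in_process) < self.wip_limit:
            self.in_process.append(item)
            return True
        return False  # upstream must wait -- this is the pull signal

    def finish_one(self):
        """Completing work frees capacity for the next pull."""
        if self.in_process:
            self.done.append(self.in_process.popleft())

board = KanbanBoard(wip_limit=3)
for story in ["A", "B", "C", "D"]:
    print(story, "admitted" if board.try_pull(story) else "blocked (WIP limit)")
board.finish_one()                    # "A" completes, freeing one slot
print("D", "admitted" if board.try_pull("D") else "blocked (WIP limit)")
```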

Summary

Prior to reading this post, I suspect that many readers thought that a moderately lengthy, well-articulated, agile product or project backlog was an asset to the agile team. Common sense and intuition may have led us to believe that was the case. Many believed that such a backlog increased, rather than decreased, the team’s agility and the rate of value delivery to the customer. Indeed, I suspect there was a time when my intuition (and perhaps the baggage of the waterfall model) told me that too.

But the economics and math behind queuing theory, lean manufacturing, and lean product development teach us otherwise. Instead, we’ve learned that agile teams need short backlogs of small items, a number of which are quite well articulated and socialized, but only just prior to the iteration boundary in which they will be implemented. Acceptance Test-Driven Development is one method that helps us accomplish that.

Forcibly limiting the length of the backlog helps immensely also, as smaller backlogs have smaller average wait times, and therefore our customers don’t have to wait as long for value delivery. In addition, since we know that the incremental value of a feature decreases with time, small backlogs are a primary mechanism with which to achieve maximum possible ROI. Plus, it’s more fun for the team to deliver fast, too.

Bibliography

Reinertsen, Donald. The Principles of Product Development Flow. Redondo Beach, CA: Celeritas Publishing, 2009.

Note: there are a number of interesting comments below. If you find the thread interesting, follow it to this follow-on post : https://scalingsoftwareagility.wordpress.com/2010/01/10/more-on-lean-backlog-and-littles-law/


7 thoughts on “An Agile Illusion: How That Nice Backlog is Actually Decreasing Your Team’s Agility”

  1. This is all about prioritizing. Either customer or the team – they have to decide what to implement first. Backlog is more like a notepad with all possible ideas waiting for their implementation – lengthy backlog is not a disease. It shows that there’s lots of work waiting to be prioritized and worked on! You can’t force release time with the single aim to make your backlog shorter. Backlog is a tool for work, not the absolute indicator of team’s agility or capability.

  2. All of your suggestions are good. The underlying assumption that the entire backlog must be completed, however, is flawed in my view. People do indeed halve their time to market (or better) by having a reasonably long backlog, and dropping a release after less than half that backlog has been completed. If you want to frame it in traditional project terms, your backlog isn’t a project: it defines what business value is available to put into a project. The project ends when a delivery occurs, not when the backlog is emptied, and if you’ve managed the backlog well, you have delivered the stories with maximal business value first.

    In particular I think your “Reduced Quality” and “Reduced motivation” points are flawed: no agile practice would suggest that you avoid customer feedback until the backlog is empty, and having code ready to deploy at each iteration boundary means you *have* to get it right the first time.

    You are absolutely right that there is coordination overhead for each backlog item, but I have found that with a little guidance, the most valuable stories float to the top. Being able to do that is what makes the Product Owner role in scrum so valuable. Without a good Product Owner, the next best thing is indeed to have a shorter backlog: it’s easy for your customers to get blurred vision if you dangle too many options in front of them. Your point about Increased Variability is also true if you don’t have guided optimization of the backlog. Ironically, the Increased Risk you mention is actually an advantage in a large backlog that is well-managed: transient requests never get implemented, saving both time and money.

  3. Fabulous blog post, Dean, thank you! I agree: managing requirements and increasing throughput and value on agile projects means:

    * a dynamic yet sparsely- or at least reasonably-populated backlog (with items in varying sizes, so having MMF-sized items – minimal marketable features – is fine)
    * analyzing the requirements/stories to arrive at very slim/right sized stories
    * having very sharply and clearly defined “doneness” for those sliced-down stories
    * doing just enough ‘work ahead’ on the items to have what Jeff Sutherland calls ‘ready’ backlog items

    To get to these practices, it has been my experience that the best starting place is story slicing.

    That can be complex – it requires some quick analysis of “options” for slicing, dependency analysis, and layering of nonfunctional requirements. That is then followed by customer/product owner prioritization.

    This slicing is quite useful for learning: it deepens the teams’ understanding of the family of stories and enables them to sequence them in a way that balances value with architectural dependencies.

    And it enables the customer to make smarter choices so they can balance value with technical/requirements/architecture risk.

    Thanks for the great post.

  4. Nice post Dean.

    I wholly agree that the length of a product Backlog is an indication of how agile a team really is. Even if you are able to release every day, an item that is on the backlog for months carries risk and cost. Fortunately, unlike at Starbucks, features can push to the front of the queue. But this is just Class of Service, which should be made explicit, and can sometimes be a smell in itself (e.g. too much expediting).

    Karl

  5. Although I do find Little’s Law to be helpful, there are a couple of issues with the foundation of this article, in my opinion:

    1. The Product Backlog is NOT scope
    Although some organizations and teams treat it as scope, that is not a healthy way to manage a PB. This approach asserts that we can think of the most important items in order of their importance over time. Prioritization is key in PB management and therefore 6 months to deliver an item should not be part of the equation.

    2. PB items can be removed from PB
    Some PB items are just left on the backlog and I do think it is a good idea to manage those right out of the backlog if they stay too long. Obviously they are just there because we can’t see our PB without it or to keep a stakeholder off our back. Either way, this, I would consider, is a waste to manage around.

    3. The Development Team is involved in helping manage PB
    It is important that we do have a forward looking PB so that the development team can help PO design their product’s future state. Sometimes this means that they will give the PO creative ways to implement sets of feature ideas to minimize their cost and reduce the number of items necessary. Sometimes it will mean that there is a need to expand the story outwards to encompass technical infrastructure and interactions with existing features. This also does not happen on a linear time scale so sometimes items will be there for a time before they are elaborated or modified.

    4. All PB items are not elaborated when put into PB
    Some items are just sitting at the bottom and have not been elaborated further than the speck of an idea they describe. This is a placeholder and could represent a bad idea to be thrown out sometime in the future. It could be that the PO describes it to the development team and through their discussions a decision is made to remove it from the PB. Other times it could lead to innovation in the product. Either way, it serves a purpose in the learning stemming from these discussions.

    Thank you for the article because it did make me think further about the PB.

  6. I agree with your suggestions but not your premise. The backlog is a stack-ranked collection of possible work – that is a collection of options, things we might work on but have not yet committed to.

    Not until Sprint Planning should these be converted into a defined set of work (e.g. commitments). Unlike a queue at Starbucks not everything in the backlog should be completed (in fact I’d say most of it should not). And you want the relative order of things in the backlog to be as mutable as possible up to the point of commitment so the business can be as responsive as possible to changing conditions.

    Velocity is useful to the business because it allows for some upfront estimating of the relative cost of a specific implementation. Increasing velocity is of course a valid goal for the development team as well very desirable for the business team, however the accuracy of the estimate is more important in the decision making (e.g. stack ranking) process than the speed because it helps you determine cost.
