In this post I hope to dispel some of the mythology and expose the utility of WSJF ("Weighted Shortest Job First"), that strange acronym that pops up when we look deeply into the program and value stream levels of the SAFe 4.0 architecture. In an enterprise-scale agile transformation, WSJF provides a formulaic method for prioritizing work to maximize the efficiency and predictability of value flow. The term was originally coined by Don Reinertsen, whose seminal work, The Principles of Product Development Flow: Second Generation Lean Product Development, was one of the inspirations for the development of the Scaled Agile Framework.
WSJF is intended as a mechanism to model priority from an economic standpoint: in other words, to take an "economic view" of which work will bring the most value with the least cost and risk. The resulting values are used to model the priority of functional and non-functional requirements at the program level of the SAFe Big Picture, which then filters down to help decide which user stories are injected into teams' backlogs. If you are using the full model of SAFe 4.0, WSJF also comes into play when calculating the priority of capabilities and enablers at the value stream level. For this discussion, we will focus on features for the sake of simplicity.
Generally speaking, WSJF helps answer the question "How do we decide the most valuable work to do first?" There are many factors that enter into this decision, but it boils down to a fairly simple formula:
WSJF = Cost of Delay / Duration
The result of this formula is a relative value. It is used to compare work items against one another in the same way that story point estimates denote relative complexity. The Cost of Delay term is calculated by the following base formula:
Cost of Delay = User-Business Value + Time Criticality + (Risk Reduction or Opportunity Enablement)
In this formula we use the following terms:
- User-Business Value: Is this something users are clamoring for? What is the cost to market share and profitability if this work is NOT done? What is the potential negative impact of not acting on this sooner rather than later?
- Time Criticality: How does this work impact the overall time scale of delivery? Do users expect this by a certain date? Is there risk that the value will diminish quickly?
- Risk Reduction: Will this work reduce potential risk throughout the value stream? Will it positively impact quality in other areas? Will the reduction be immediate or long-term?
- Opportunity Enablement: Is there a probability that the work will open new avenues of value? Will it attract new kinds of customers?
For each term, we plug in values from the modified Fibonacci Scale (1, 2, 3, 5, 8, 13, 20) to arrive at a relative value for each. As with User Stories, we choose the value of one (1) as the smallest increment of value for each term and estimate the remainder as relative to the smallest value.
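The two formulas above can be sketched in a few lines of Python. This is only an illustration, not part of SAFe itself: the function names, the scale check, and the rounding to two decimals are assumptions made for this example, and the sample estimates are made up.

```python
# Modified Fibonacci scale quoted in the text.
SCALE = (1, 2, 3, 5, 8, 13, 20)


def check_scale(name: str, value: int) -> int:
    """Return the value unchanged, or raise if it is not on the scale."""
    if value not in SCALE:
        raise ValueError(f"{name} must be one of {SCALE}, got {value}")
    return value


def cost_of_delay(business_value: int, time_criticality: int, risk_opportunity: int) -> int:
    """Cost of Delay = User-Business Value + Time Criticality
    + (Risk Reduction or Opportunity Enablement)."""
    return (
        check_scale("business_value", business_value)
        + check_scale("time_criticality", time_criticality)
        + check_scale("risk_opportunity", risk_opportunity)
    )


def wsjf(cod: int, duration: int) -> float:
    """WSJF = Cost of Delay / Duration, rounded to two decimals (a choice made here)."""
    return round(cod / check_scale("duration", duration), 2)


# Made-up estimates for a single hypothetical feature.
example_cod = cost_of_delay(5, 3, 2)    # 5 + 3 + 2
example_score = wsjf(example_cod, 3)    # Cost of Delay divided by a duration of 3
print(example_cod, example_score)
```

Rejecting off-scale values is a design choice for this sketch; an organization that adapts the formula might relax or drop that check.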
Many organizations create their own criteria for calculating WSJF. Each organization might reflect on the efficacy of the base formula and come up with additional values that provide a more accurate value given the specific nature of the value stream.
When we calculate WSJF for features, we will use a grid or spreadsheet with columns that match the terms discussed above. We will go through the table one column at a time and add the values that make up the formula.
Here is a quick list of features that we will use to illustrate the WSJF process:
- Authentication: Login, Logoff, Reset Password
- Authorization: Roles and Access Rules
- User Profile Management: User Name, Roles, Indicative Data
- Transaction Management: Post, Amend, Reject, Accept Transactions
- Reporting: Visualize Transactions, Show User Activity
- Auditing: Comply with Standards
For each feature, we will go through each column and estimate a value that represents the feature's contribution to the formula. We will again use the modified Fibonacci scale because all of our estimates are meant to be relative. We start with User-Business Value as it applies to each of our features. As with all Fibonacci estimating, we figure out the smallest relative value, assign that a value of one (1) and complete the rest of the column before moving on to the next. We might say, for the sake of instruction, that Reporting gives the least value in this case. We can justify this by saying that the information in all reports is available by examining the elements of the UI or by directly accessing the backing store.
Now we estimate the User-Business Value for the remaining features. We identify Authentication as the next highest value item and determine it has three times as much value as Reporting, so we assign that a three (3). Authorization is roughly as valuable as Authentication so we assign that a three (3) as well. User Profile Management is nice, but it is not absolutely essential to deliver value, so that gets a two (2). We decide that Transaction Management is definitely the linchpin of the system and assign that an eight (8). Auditing is important, but not nearly as valuable as most other functionality, so that also gets a two (2). Now our table will look like this:
It is vitally important to remember that you must have at least one cell with a value of one, which represents the least increment of value. You can have more than one cell with the value one if they are all of equivalent value. This establishes the "meaning" of the value one relative to the other features.
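The rule about the baseline value of one is easy to check mechanically. The sketch below is an assumption-laden illustration: the helper name `column_problems` and the dictionary layout are invented for this example, while the numbers are the User-Business Value estimates given above.

```python
# Modified Fibonacci scale used for every column.
SCALE = (1, 2, 3, 5, 8, 13, 20)


def column_problems(column: dict[str, int]) -> list[str]:
    """Return a list of problems found in one column of estimates (hypothetical helper)."""
    problems = []
    for feature, value in column.items():
        if value not in SCALE:
            problems.append(f"{feature}: {value} is not on the scale")
    # At least one feature must carry the baseline value of one.
    if column and min(column.values()) != 1:
        problems.append("no feature has the baseline value of 1")
    return problems


# User-Business Value estimates from the walkthrough above.
user_business_value = {
    "Authentication": 3,
    "Authorization": 3,
    "User Profile Management": 2,
    "Transaction Management": 8,
    "Reporting": 1,
    "Auditing": 2,
}

print(column_problems(user_business_value))  # prints [] because Reporting is the baseline
```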
Time Criticality is also a Fibonacci value. Lower values will represent lower criticality. We estimate criticality by considering the least critical feature as the unit value one (1) and giving estimates relative to the criticality of a feature with a value of one. Let's say that we decide that Reporting is the least time critical, so we assign that a value of one (1). Going down the column, we decide that Authentication is a two (2), Authorization a three (3), User Profile Management a one (1), Transaction Management a thirteen (13), Auditing a two (2).
Similarly, we evaluate each feature for its Risk Reduction and Opportunity Enablement using the same sequence. User Profile Management pops out as one that does the least for risk reduction, so that becomes a one (1). We decide that Authentication and Authorization are fairly large risk reductions, so we assign them each a five (5). Transaction Management does not provide much for risk reduction, so that becomes a two (2). Reporting might bring us some new opportunities, so we give that a three (3). Auditing is hugely important for debugging and compliance, so that becomes an eight (8).
Duration is a measure of how long it will take to get a feature through the value flow and is also a relative measure using the Fibonacci values. In most cases it is similar enough to the feature's size estimate for us to use that value as a proxy for the actual duration, using a process very similar to the one for estimating User Stories. We select Auditing as the feature with the smallest size and assign that a one (1). We give estimates for duration relative to Auditing. After we complete the last column, our table now looks like this.
With our values in place, we plug them into the formula and calculate the WSJF. For our purposes, we will keep two decimals in the value. The rank is from highest WSJF to lowest.
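With all three numerator columns estimated, the Cost of Delay for each feature is a plain sum. Here is a minimal sketch using the estimates from the text; the `FEATURES` dictionary and its tuple layout are just one convenient way to hold the grid, not a prescribed format.

```python
# name: (user-business value, time criticality, risk reduction / opportunity enablement)
FEATURES = {
    "Authentication": (3, 2, 5),
    "Authorization": (3, 3, 5),
    "User Profile Management": (2, 1, 1),
    "Transaction Management": (8, 13, 2),
    "Reporting": (1, 1, 3),
    "Auditing": (2, 2, 8),
}

# Cost of Delay is the sum of the three columns for each feature.
cost_of_delay = {name: sum(scores) for name, scores in FEATURES.items()}

for name, cod in cost_of_delay.items():
    print(f"{name:<25}{cod:>3}")
```

On these numbers, Transaction Management has the largest Cost of Delay (23) and User Profile Management the smallest (4). Duration has not entered the picture yet, so this is not the final ranking.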
Were you surprised by the result? Perhaps you thought that the "meat" of the system, Transaction Management, would be the first thing into the team backlogs. You could argue that Authorization is quite important, but it is in the middle of the pack. How do we explain this?
There are two reasons why Auditing wins out. First, it has a much smaller duration than Transaction Management. Since Duration is in the denominator, smaller values will yield higher WSJF values. In other words, we favor shorter jobs over longer ones. Also, Auditing was identified as something that is an important risk mitigator. This will increase the value of the numerator of the formula, and larger values will yield higher WSJF results. We favor higher value jobs over ones with smaller value. Reporting, with its low value and criticality, didn't have a chance, although the product may still identify it as part of minimal viability.
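The effect of the denominator can be made concrete with the estimates given earlier. The sketch below uses only numbers stated in the text (the three numerator terms for Auditing and Transaction Management, and Auditing's duration of one). It does not assume a particular duration for Transaction Management. Instead, it tries every value on the scale; the variable and function names are illustrative.

```python
# Modified Fibonacci scale used for all estimates.
SCALE = (1, 2, 3, 5, 8, 13, 20)

# Numerator terms from the walkthrough: value + criticality + risk/opportunity.
AUDITING_COD = 2 + 2 + 8        # Cost of Delay of 12
AUDITING_DURATION = 1           # Auditing was chosen as the smallest feature
TRANSACTION_COD = 8 + 13 + 2    # Cost of Delay of 23

auditing_wsjf = AUDITING_COD / AUDITING_DURATION


def durations_where_auditing_wins() -> list[int]:
    """Durations at which Transaction Management scores below Auditing."""
    return [d for d in SCALE if TRANSACTION_COD / d < auditing_wsjf]


for d in SCALE:
    print(f"duration {d:>2}: Transaction Management WSJF = {TRANSACTION_COD / d:.2f}")
print(f"Auditing WSJF = {auditing_wsjf:.2f}")
print(durations_where_auditing_wins())
```

Even though Transaction Management has nearly twice the Cost of Delay, it outranks Auditing only if its duration is also estimated at one. At any larger duration on the scale, the shorter job comes first.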
WSJF is a guideline, not an algorithm, for deciding what product management and teams ought to be doing. Use WSJF as part of a conversation about feature priority, not as an absolute rule. It is also OK to be fluid with all of these mathematical terms and adjust them as your product teams see fit.
The ideas behind WSJF can also be used to start a discussion about any backlog: product, team or sprint. Focusing on maximizing value and minimizing duration is the best way to determine how to prioritize work to accelerate agile development.
Some important things to remember when working with WSJF:
- When first getting started, choose the feature with the least relative value for each column, and assign it a one (1). More than one of these is permissible if they are equivalent in value.
- Assign other values relative to the smallest value, and adjust values relative to each other.
- Think one column at a time. Work vertically first, then move to the next column.
- Adjust any value you see fit after all values are entered.
- Calculate WSJF, and rank features by highest WSJF to lowest. Use these as a guideline to feature priority.
- Try to be as accurate as possible with these estimates, but don't worry if estimates are off a little bit. Iterative backlog grooming will allow you to adjust these values more accurately over time.
- WSJF favors features that are of highest value and lowest duration.