
John Hayes
Senior Product Marketing Manager, SquaredUp
DORA Metrics are widely regarded as the gold standard for measuring the performance of software development teams. The metrics themselves, though, are generic, high-level pointers – they are not an instruction manual. Adopting the DORA approach is the first step down the path to continuous improvement. The next steps are deciding how the measures should be defined in the context of your own organisation's processes and then figuring out how to retrieve (and present) the relevant data.
On the face of it, implementing DORA Metrics might sound simple. In practice, the meaning of a particular measure could be open to many different interpretations. The Mean Time To Repair metric is a great example of this problem.
The official DORA definition for this metric is as follows:
"The average time it takes to restore service when a service incident or a defect that impacts users occurs."
This simple statement can lead to a whole slew of questions. At what point does the clock start ticking for measuring this? When the error first occurred? When the error was first reported by a user? When the error was first assigned to a developer?
That is just for starters – there is also plenty of room for discussion of what represents a “service incident” or how we quantify user impact. Ultimately, there is no right or wrong answer to these questions. Your particular interpretation needs to be one that works for your teams, your systems and your operating procedures.
At SquaredUp, we have an out-of-the-box DORA Metrics dashboard that is bundled with our Azure DevOps plugin. Since there is no one-size-fits-all definition for implementation, many customers use this as a template for building their own custom dashboards. In this article we will look at both how you can tweak the Azure DevOps query and how you can combine results from different data sources.
Since this is an Azure DevOps plugin, it assumes that you are tracking bugs as Azure DevOps Work Items. It uses this query to retrieve work items:
SELECT
[System.Id],
[System.WorkItemType],
[Title],
[Created Date],
[Closed Date],
[System.TeamProject]
FROM workitems
WHERE [System.WorkItemType] = 'Bug'
AND [Priority] = 1
AND [Severity] = '1 - Critical'
AND [Created Date] > @Today - 7
AND [System.TeamProject] = @project
Obviously, this query uses a number of criteria that may not match the definition used in your organisation. It contains assumptions about:
· Priority
· Severity
· TimeSpan
Tweaking this is very straightforward. For example, we could use this expression if we wanted to set the timespan to the last 28 days:
AND [Created Date] > @Today - 28
Equally, the DORA metric definition encompasses any issue that has a user impact – not just critical errors. This would include issues such as performance degradation or a non-critical UI issue. We could use this expression to include Bugs with a severity of Medium or higher:
AND [Severity] IN ('1 - Critical', '2 - High', '3 - Medium')
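Putting those two tweaks together, the adjusted query would look something like this. Note that we have also dropped the Priority clause here, on the assumption that severity alone now drives the definition; keep it if your process needs it:
SELECT
[System.Id],
[System.WorkItemType],
[Title],
[Created Date],
[Closed Date],
[System.TeamProject]
FROM workitems
WHERE [System.WorkItemType] = 'Bug'
AND [Severity] IN ('1 - Critical', '2 - High', '3 - Medium')
AND [Created Date] > @Today - 28
AND [System.TeamProject] = @project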
Our out-of-the-box DORA dashboard assumes that you are using Azure DevOps for all of your build pipelines, work items and Git repos. In reality, many teams spread these functions across a mixture of tools. What if your Work Items were in Jira? Not a problem! With SquaredUp it is easy to mix and match multiple data sources in the same dashboard.
Let us look at how you could calculate MTTR if your Work Items were in Jira rather than in Azure DevOps. We are going to do this in two stages. First, we will use Jira Query Language (JQL) to create a data set consisting of bugs that have been resolved in the past 28 days. In the second stage we will run a SQL query over this data set to calculate the average difference (in hours) between an issue being opened and resolved.
Again, this definition could be further refined depending on the ways of working in your operation. Resolved might mean that the bug has been merged into a production branch or it could mean that it has been deployed to production and verified as working.
We will start by adding a new tile to our dashboard, then adding the Jira data source and selecting the JQL Query Data Stream.
In the Configure Parameters step, we will enter a JQL query to return any Medium, High or Critical bugs that have been resolved in the past 28 days.
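The exact JQL will depend on how issues are categorised in your Jira projects, so treat the following as a sketch; the issue type and priority names are assumptions that should be mapped to your own priority scheme:
issuetype = Bug AND priority IN (Medium, High, Critical) AND resolutiondate >= -28d ORDER BY resolutiondate DESC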
By default, the data set will be saved as ‘dataset1’. That is obviously not very meaningful, so we are going to change the name to ResolvedItems – and that is the name we will use to reference it in our queries.
Next, we need to create a SQL query which will give us the average time to resolve the issues in our data set. To do this we will toggle the SQL Analytics option, which will give us a dialog box to enter our SQL query. The underlying query engine uses DuckDB, and the syntax for some functions may differ from flavours of SQL such as T-SQL, which is used by Microsoft SQL Server. This is the kind of query we can use to calculate our average:
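As a minimal sketch, assuming the data set exposes Created and Resolved timestamp columns (adjust the column names to match the fields your JQL data stream actually returns), DuckDB's date_diff function does the arithmetic for us:
-- Average resolution time in hours across the ResolvedItems data set.
-- "Created" and "Resolved" are assumed column names; if the timestamps
-- arrive as text, cast them first, e.g. CAST("Created" AS TIMESTAMP).
SELECT AVG(date_diff('hour', "Created", "Resolved")) AS MTTR_Hours
FROM ResolvedItems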
And we can now render our total as a Scalar visualisation.
As we have said, DORA metrics can be seen as a signpost, but it is up to each organisation to map out its own route to the destination. This will depend on a number of factors, such as the culture of individual teams, the technologies being used and the priorities of engineering managers.
Our out-of-the-box dashboard gives you a great start for visualising your DORA metrics. On top of that foundation, you can draw on our wide range of data sources, as well as our powerful querying tools, to fine-tune your dashboards to the precise needs of your organisation.