At SquaredUp, we have been debating how the senior leadership team can monitor the ‘health’ of our engineering teams. To do this, we decided to create a dashboard that could represent this for a team – but first, we needed to figure out what to measure.
Our goal was to better understand our teams to inform which actions to take to support them and make them a happier and more productive bunch. We wanted feedback from the ‘coal face’ that didn’t rely on second-hand conversations and meetings with team leads. As a remote-first software company, we cannot do Toyota-style Genba walks, so we need to rely on the data our software systems provide.
To summarize this as a user story, we could say that “As a leadership team, we want a dashboard that provides the insight we need into our engineering teams, so that we know when to take action to help them be happier and more productive.”
Our first iteration of the dashboard suffered from letting engineers loose with a tool and a lot of data. Like children in a sweet shop, we grabbed data from many different sources and created a myriad of metrics and visualizations. While a lot of it looked impressive, very little of it got at the core of the problem we were trying to solve. These days the internet is littered with engineering metrics for various aspects of productivity and performance, such as the four DORA metrics, Scrum, WIP and TOIL metrics – all of which we looked at.
As well as ‘metric-sprawl’, we also struggled with what ‘health’ meant. There is a stigma attached to being considered ‘unhealthy’. In our product, health is represented by a RAG (red, amber, green) status. Red is a great color to say something has errored or failed, but it is not such a great color to label a team with. Beyond this, we also struggled to define metrics that could be used across teams and still make sense. What’s more, there was a question around whether, if we started monitoring a set of metrics and the teams were aware of this, the project might end up driving their behavior toward improving the metrics, rather than providing the insight that would enable us to make our teams better.
At this point we changed our approach and decided it would be much better for the teams themselves to define their own status. After all, our teams were already tracking their own work.
We also limited the other metrics we used to ones where there was already a department-wide understanding. These were:
- Escalations - High impact customer issues, triaged and agreed by team leads and PMs
- Production deployment failures - Any release where our deployment failed or a production pipeline required manual intervention
- Technical debt - Missing automated tests, missing documentation and legacy code
Finally, we decided happiness is a pretty good indicator of health, so we started rolling out a regular engineering survey to understand the sentiment amongst our engineers. Our company culture is very important to us and we are proud of being a Great Place to Work.
Our current proposal for a top-level dashboard includes these metrics:
- A team-defined RAG status (for each team)
- A count of escalations (severity 1 issues)
- Monthly production deployment failures
- Product quality RAG status
- A measure of outstanding tech debt tasks
- An engineering happiness score
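To make the shape of that proposal concrete, the six metrics above could be captured in a single snapshot record. This is only a minimal sketch: the class, field and function names are hypothetical, the scale of the happiness score is left open, and treating everything except the team status as a department-level figure is an assumption rather than a description of what we have built.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Rag(Enum):
    """Red / amber / green status."""
    RED = "red"
    AMBER = "amber"
    GREEN = "green"


@dataclass
class DashboardSnapshot:
    """One reading of the proposed top-level dashboard (hypothetical model)."""
    team_status: Dict[str, Rag]          # RAG status defined by each team itself
    escalations: int                     # count of severity 1 issues
    deployment_failures_this_month: int  # monthly production deployment failures
    product_quality: Rag                 # product quality RAG status
    open_tech_debt_tasks: int            # outstanding tech debt tasks
    happiness_score: float               # from the engineering survey; scale is an assumption


def teams_not_green(snapshot: DashboardSnapshot) -> List[str]:
    """Return, in alphabetical order, the teams that reported amber or red."""
    return sorted(
        team
        for team, status in snapshot.team_status.items()
        if status is not Rag.GREEN
    )
```

A helper like `teams_not_green` only surfaces what teams have said about themselves. It deliberately applies no thresholds to the other numbers, since we have not yet decided what should trigger action.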
Although the focus of the work was on understanding engineering team health, we came across an unexpected (but welcome) side effect. It forced us to reflect on how we behave as leaders and managers in the organization. It challenged our perspectives and approaches, because teams don’t operate in a bubble. To be productive and happy they need a leadership team that is supportive and effective.
Our next step is to bring this engineering team health dashboard to life. I know in advance that we will need to continually improve and iterate on our dashboard. It will require experimentation to find the data we need to make decisions, to gather it in a way that both engineers and leaders are OK with, and to present it in a way that is easy to interpret.
We are yet to decide on the specifics of what we will monitor and the notifications we will send out. This could be key to making it part of our management process and stopping it becoming another ‘pretty dashboard’ gathering virtual dust in the corner.
Engineering health dashboard V1
To get the project started, I got together with one of our Directors of Engineering, Josip Dlaka, to see what we could put together. We came up with a V1 that has served as a decent starting point, even though it doesn't quite check all the boxes yet. Check it out here.