Mean time to recovery represents the average time between a failing deployment and a deployment that is marked as a fix.
Summary
Mean time to recovery (MTTR) is one of the DORA metrics, key to understanding your deployment health. This specific metric helps teams understand how quickly they're able to resolve issues.
In Swarmia, a fixing deployment is a deployment that has the "fixesVersion" attribute specified.
Example
If a deployment failed at 12pm on a Tuesday, and then a fix was deployed at 5pm that same day, the time to recover would be 5 hours.
Mean time to recovery looks at all the detected times to recovery (TTRs) in a given time frame and calculates their average.
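The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Swarmia computes the metric internally: the function names and the sample timestamps are hypothetical, and each incident is assumed to be a pair of (failed deployment time, fixing deployment time).

```python
from datetime import datetime, timedelta


def time_to_recovery(failed_at: datetime, fixed_at: datetime) -> timedelta:
    """TTR for one failure: time from the failing deployment to its fix."""
    return fixed_at - failed_at


def mean_time_to_recovery(incidents):
    """Average TTR over (failed_at, fixed_at) pairs; None if there are none."""
    if not incidents:
        return None
    total = sum(
        (time_to_recovery(failed_at, fixed_at) for failed_at, fixed_at in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical sample data for illustration only.
incidents = [
    # Failed at 12pm on a Tuesday, fixed at 5pm the same day: TTR of 5 hours.
    (datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 17, 0)),
    # A second failure in the same time frame, fixed after 1 hour.
    (datetime(2024, 1, 4, 9, 0), datetime(2024, 1, 4, 10, 0)),
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 3:00:00
```

With TTRs of 5 hours and 1 hour, the mean time to recovery for the time frame is 3 hours.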
Why it matters
A low MTTR is key to an excellent customer experience, because it means that downtime and other critical issues in the product are resolved quickly.
How to use it
Time to recovery (TTR) can be determined for each failure as the time between the original deployment and the deployment that fixes the problem. Use TTR to understand the impact of each change failure: how long the problem lasted and how it affected the customer.
Where to find it
You can find mean time to recovery (as well as the other DORA metrics) under Metrics → DORA.