Improve code quality

Build higher-quality products, reduce technical debt, and implement development practices that scale as your team grows

Engineering teams often struggle to maintain high software quality. Poor quality leads to customer dissatisfaction, increased operational overhead, and a constant drain on developer time spent fixing issues rather than building new customer value.

While individual teams may operate effectively, broader systemic weaknesses such as gaps in testing, ineffective code review practices, and unreliable release pipelines can lead to recurring quality problems. Without clear visibility into these quality issues, it can be difficult to take targeted actions that meaningfully improve the situation.

Prerequisites & setup

Swarmia uses data from your issue tracker to understand which projects you’re working on: configure the Jira integration or the Linear integration.

You'll also need to set up investment categories and define how change failure rate is calculated.

Using Swarmia

A comprehensive approach to the quality problem includes analyzing deployment stability, work allocation, bug management, and development practices to identify quality improvement opportunities.

Start by examining your deployment stability metrics to understand your baseline quality performance.

Change failure rate

The change failure rate tells you what percentage of deployments result in failures, degraded service, or immediate fixes. It's an indicator of your deployment process maturity and code quality.

Look for patterns in when failures occur:

  • Are failures clustered around certain teams, projects, or time periods?

  • Do failures correlate with deployment frequency or batch size?

  • Are there specific types of changes that consistently cause problems?
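
If you want to sanity-check the number against your own data, the underlying calculation is simple: the share of deployments that caused a failure, optionally broken down per team to spot clustering. Here's a minimal sketch in Python, assuming you can export deployment records with a flag for whether each one led to an incident or hotfix (the record shape and field names are illustrative, not Swarmia's API):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Deployment:
    team: str
    caused_failure: bool  # e.g. linked to an incident or followed by a hotfix

def change_failure_rate(deployments: list[Deployment]) -> float:
    """Percentage of deployments that resulted in a failure."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d.caused_failure)
    return 100 * failed / len(deployments)

def rate_by_team(deployments: list[Deployment]) -> dict[str, float]:
    """Break the rate down per team to spot clustering."""
    groups: dict[str, list[Deployment]] = defaultdict(list)
    for d in deployments:
        groups[d.team].append(d)
    return {team: change_failure_rate(ds) for team, ds in groups.items()}

deployments = [
    Deployment("checkout", caused_failure=True),
    Deployment("checkout", caused_failure=False),
    Deployment("payments", caused_failure=False),
    Deployment("payments", caused_failure=False),
]
print(change_failure_rate(deployments))  # 25.0
print(rate_by_team(deployments))         # {'checkout': 50.0, 'payments': 0.0}
```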

Mean time to recovery (MTTR)

Mean time to recovery (MTTR) measures how quickly your team can restore service when issues occur. This reflects both your incident response processes and your system's observability. Use this metric to identify:

  • Teams that struggle with incident response

  • Systems that are difficult to debug or fix

  • Process improvements needed in your incident management (for example, issues not being detected quickly enough)
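
The underlying calculation is the average duration from detection to recovery. A minimal sketch, using hypothetical incident records:

```python
from datetime import datetime, timedelta

# Hypothetical incidents: (detected_at, resolved_at)
incidents = [
    (datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 9, 45)),    # 45 min
    (datetime(2024, 5, 3, 14, 0), datetime(2024, 5, 3, 17, 0)),   # 3 h
    (datetime(2024, 5, 7, 22, 30), datetime(2024, 5, 8, 0, 30)),  # 2 h
]

def mean_time_to_recovery(incidents) -> timedelta:
    """Average duration from detection to recovery."""
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)

print(mean_time_to_recovery(incidents))  # 1:55:00
```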

Bugs

Bug cycle times reveal how effectively your team manages defects. Bug throughput also shows which teams are spending the most effort on simply maintaining the product. Filter by priority levels (P0, P1, etc.) to understand how critical issues are handled differently from minor bugs. Key questions to explore:

  • Are high-priority bugs resolved quickly enough?

  • Are there patterns in what types of bugs take the longest to fix?

  • Do bug cycle times vary significantly between teams or projects?
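
To explore these questions against a raw issue-tracker export, you could compute cycle times per priority level along these lines (the record shape is an assumption, not Swarmia's export format):

```python
from datetime import datetime
from statistics import median

# Hypothetical issue export: priority plus created/resolved timestamps
bugs = [
    {"priority": "P0", "created": datetime(2024, 5, 1), "resolved": datetime(2024, 5, 1, 6)},
    {"priority": "P1", "created": datetime(2024, 5, 1), "resolved": datetime(2024, 5, 4)},
    {"priority": "P1", "created": datetime(2024, 5, 2), "resolved": datetime(2024, 5, 3)},
    {"priority": "P2", "created": datetime(2024, 5, 1), "resolved": datetime(2024, 5, 15)},
]

def cycle_times_by_priority(bugs: list[dict]) -> dict[str, float]:
    """Median cycle time in hours for each priority level."""
    by_priority: dict[str, list[float]] = {}
    for bug in bugs:
        hours = (bug["resolved"] - bug["created"]).total_seconds() / 3600
        by_priority.setdefault(bug["priority"], []).append(hours)
    return {p: median(ts) for p, ts in sorted(by_priority.items())}

for priority, hours in cycle_times_by_priority(bugs).items():
    print(f"{priority}: median {hours:.0f} h to resolve")
# P0: 6 h, P1: 48 h, P2: 336 h
```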

The proportion of work going into bug fixes versus feature development indicates whether quality issues are consuming your team's capacity. High proportions of reactive work (KTLO) might signal deeper quality problems.

Investment balance

Look at your investment balance over time:

  • Are you spending more time on bug fixes than planned feature work?

  • Do certain teams struggle more to keep the lights on?

  • What patterns emerge when you drill down into the work? For example, are certain parts of your codebase generating more bugs than others?
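
As a simplified illustration of what an investment balance breakdown computes, the sketch below derives the share of work per category from a list of categorized issues. It weights by raw issue count for simplicity; a real implementation would more likely weight by effort or time spent. The category names are hypothetical:

```python
from collections import Counter

# Hypothetical issues, each tagged with an investment category
issues = [
    {"category": "new_features"}, {"category": "new_features"},
    {"category": "new_features"}, {"category": "bug_fixes"},
    {"category": "bug_fixes"}, {"category": "ktlo"},
]

def investment_balance(issues: list[dict]) -> dict[str, float]:
    """Share of work (by issue count) in each category, as a percentage."""
    counts = Counter(issue["category"] for issue in issues)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

for category, share in investment_balance(issues).items():
    print(f"{category}: {share:.0f}%")
# new_features: 50%, bug_fixes: 33%, ktlo: 17%
```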

Batch size

Smaller pull requests are easier to review thoroughly, leading to better quality. Large batches increase the likelihood that defects slip through code review.

Monitor your pull request batch sizes alongside quality metrics:

  • Do larger pull requests correlate with higher change failure rates?

  • Are teams with smaller batch sizes experiencing fewer quality issues?

  • Is there adequate review time for the changes being made?
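
One way to check the first two questions against your own data is to bucket merged pull requests by size and compare failure rates across buckets. A minimal sketch, assuming you can join each PR's lines changed to the outcome of the deployment that shipped it (all field names are hypothetical):

```python
# Hypothetical merged PRs: lines changed, joined to deployment outcome
prs = [
    {"lines_changed": 40, "deploy_failed": False},
    {"lines_changed": 90, "deploy_failed": False},
    {"lines_changed": 120, "deploy_failed": False},
    {"lines_changed": 450, "deploy_failed": True},
    {"lines_changed": 800, "deploy_failed": True},
]

def failure_rate_by_batch_size(prs: list[dict], threshold: int = 200) -> dict[str, float]:
    """Compare failure rates for small vs. large pull requests."""
    def rate(group: list[dict]) -> float:
        if not group:
            return 0.0
        return 100 * sum(p["deploy_failed"] for p in group) / len(group)

    small = [p for p in prs if p["lines_changed"] <= threshold]
    large = [p for p in prs if p["lines_changed"] > threshold]
    return {"small": rate(small), "large": rate(large)}

print(failure_rate_by_batch_size(prs))  # {'small': 0.0, 'large': 100.0}
```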

Surveys

While metrics tell you part of the story, surveys reveal the underlying practices and team sentiment that drive quality. Developer perceptions of quality practices often predict future metric trends.

Swarmia's developer surveys include the following questions about quality:

  • Our automated tests catch issues reliably.

  • Our code reviews adhere to high standards.

  • Our technical debt is well under control.

  • Our practices steer toward building secure solutions.

  • The on-call load in my team is reasonable.

Survey responses can highlight disconnects between metrics and team experience. For example, a team might have low change failure rates but report poor test reliability, suggesting they're catching issues through manual processes rather than automation.

Taking action

Combine survey insights with metrics to identify improvement opportunities. Good practices include:

  • Require code reviews for all changes and prefer smaller PRs to ensure quality reviews

  • Allocate dedicated time for technical debt reduction

  • Run post-incident reviews to learn from failures and share learnings across teams

  • Implement automated testing that runs on every pull request

  • Set up automated monitoring to detect issues quickly after deployments
