Engineering metrics

Ondrej Kvasnovsky
9 min read · Jul 7, 2022

Why track engineering metrics

If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
— H. James Harrington

Tracking metrics is only useful if your team is committed to continuous improvement. And if you are committed to improving, your organization and teams need a supportive culture where people can expose weaknesses and talk openly about them.

You probably know the quote “Culture eats strategy for breakfast”. No matter what metrics you track, they will make no impact unless your team shares the appropriate culture.

How to monitor engineering metrics

We need to be able to see the delta between values and the trend of where we are heading. We can achieve that by putting the numbers on a timeline. A simple table with a date column and the reported values (over a fixed range) will do the job.
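
As a minimal sketch, assuming we record one value per week, the delta between reporting dates can be computed like this (the metric and its values are made up):

```python
from datetime import date

# Weekly snapshots of one metric, e.g. open production bugs (hypothetical values)
snapshots = [
    (date(2022, 6, 6), 41),
    (date(2022, 6, 13), 38),
    (date(2022, 6, 20), 44),
    (date(2022, 6, 27), 36),
]

# Print the timeline with the delta against the previous reporting date
previous = None
for day, value in snapshots:
    delta = "" if previous is None else f" ({value - previous:+d})"
    print(f"{day}  {value}{delta}")
    previous = value
```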

Sample of engineering metrics

If you aim at nothing, you hit nothing.
— Zig Ziglar

Metric #1: Goal progress

There is a great book, “Escaping the Build Trap” by Melissa Perri, that talks about the importance of organizing teams around goals rather than implementing features one by one.

The idea of measuring goal progress is to define a measurable goal and then monitor how we are doing against it.

The opposite of monitoring progress towards a goal is monitoring how many features we delivered.

To look at it from a different perspective: a delivered feature by itself brings no guaranteed value to the user. But a fulfilled goal, for example getting people to share 3 pictures a day, brings a lot of value to everyone.

Examples of product goals:

  • Increase user adoption by reducing drop-off during onboarding, so that 90% of users complete it.
  • Make people share more content: 5 shares per user per week.
  • Make teachers produce 1 course per month on average.

I am including the product goal as an engineering metric because it is crucial for the success of the whole team to work towards the same goals!

Metric #2: Organized backlog, clear sprint scope

When working on an agile project for a long time, tasks might start spanning multiple sprints. Tasks with a long lifespan become zombie tasks. A zombie task is a task that is important enough to be part of the sprint, but never as important as the other high-priority tasks that need to be delivered first.

To make good use of the agile methodology, a useful metric is the percentage of the committed sprint scope that actually gets delivered, with roughly 80% as the aim.

Aiming to deliver 80% of the committed scope will help with the following (a small sketch of the calculation follows the list).

  • The planning meetings become more deliberate and less vague.
  • The sprint scope will be well defined, which leads to greater focus for individuals and improves their self-management.
  • People will know what is expected of them and will have a clear vision of what needs to be finished in the upcoming sprint.
  • It becomes easier to evaluate what has been done, which increases satisfaction with a job well done.
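
A minimal sketch of the calculation itself, using made-up story-point numbers:

```python
# Hypothetical sprint data: story points committed at planning vs. actually delivered
committed_points = 42
delivered_points = 36

delivery_ratio = delivered_points / committed_points
print(f"Delivered {delivery_ratio:.0%} of the committed scope")

# The 80% aim from above: far below it, the scope was likely too vague or too big;
# consistently at 100%, the team is probably under-committing.
TARGET = 0.80
print("on target" if delivery_ratio >= TARGET else "below target")
```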

Metric #3: Security and license issues

This metric reports the number of critical and high-severity issues present in the code. If critical or high-severity issues are found, we need to fix or mitigate them as soon as possible.

Ideally, all the tools for security and license checks should be part of our CI/CD pipeline. No manual work should be required.
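
As an illustration of such a pipeline gate, the sketch below fails the build when critical or high-severity findings are present. The report file name and its JSON shape are hypothetical, not the format of any specific scanner:

```python
import json
import sys

# Hypothetical JSON report written by a security/license scanner in a previous CI step
with open("scan-report.json") as f:
    issues = json.load(f)["issues"]

blocking = [i for i in issues if i["severity"] in ("critical", "high")]
for issue in blocking:
    print(f"{issue['severity'].upper()}: {issue['title']}")

# A non-zero exit code fails the pipeline, so no manual gate-keeping is needed
sys.exit(1 if blocking else 0)
```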

Penetration tests

Every penetration-testing tool produces a score or security level that we need to agree on. We then monitor that value (level) periodically.

Usually no issues are found, but when they are, they need to be fixed immediately.

Security static code analysis

Scanning tools like SonarQube or Snyk Code analyze the source and try to identify security weaknesses at the code level.

Usually, if we follow good practices for writing secure code, many findings are false positives or issues mitigated by design, and nothing needs to change in the code.

Dependencies

We use snyk.io to scan all our projects for vulnerabilities and license issues in the dependencies we use.

An example of discovered dependency issues (credit: Snyk.io)

There are usually many findings, simply because projects tend to use many libraries. We aim to fix the critical and high-severity ones.

Metric #4: Code quality

There are two views on code quality:

  1. Maintainability — tells us if the code is well structured: code dependencies are clear, there is no spaghetti code, no copy-pasted code, no extra-long functions, and so on.
  2. Code coverage — tells us what parts of the code are exercised by automated tests.

The example of test coverage and maintainability metrics, using CodeClimate

Read more: How to report the code coverage for hundreds of code repositories as GPA
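
The linked article covers the details; as a rough sketch, one way to get a single GPA-style number is to map each repository's coverage percentage onto a 0-4 grade-point scale and average across repositories (the cut-offs below are made up):

```python
# Hypothetical coverage per repository, in percent
coverage = {"billing-service": 91, "web-app": 74, "auth-service": 83}

def grade_points(pct: float) -> float:
    # Made-up cut-offs mapping coverage to grade points, like a school GPA
    if pct >= 90: return 4.0  # A
    if pct >= 80: return 3.0  # B
    if pct >= 70: return 2.0  # C
    if pct >= 60: return 1.0  # D
    return 0.0                # F

gpa = sum(grade_points(p) for p in coverage.values()) / len(coverage)
print(f"Coverage GPA across {len(coverage)} repositories: {gpa:.2f}")
```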

Metric #5: Production bugs

It is good to keep an eye on a list of bugs sorted by feature and priority. We can then compare, week by week for example, whether the bug rate is going up or down.

Usually, the number of production bugs stays about the same, unless something unexpected happens, like the release of a big feature or a refactoring.

Metric #6: Production incidents

When a production incident happens, it needs to be communicated to the product team together with an incident document explaining what happened, how it was fixed, and what we are going to do to prevent the incident in the future.

Usually, there are no production incidents.

Metric #7: Cloud cost

Every team needs to keep the cloud cost under control. It is OK to have a high-level metric to keep an eye on the monthly cost per environment. If the cost goes up, we can dig deeper, find the root cause, and implement a solution.

If you have enough users, requests, or whatever else produces the value, you can calculate the cost per user per month.
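
The arithmetic is trivial; a sketch with made-up numbers:

```python
# Hypothetical monthly figures for one environment
monthly_cloud_cost_usd = 12_400
monthly_active_users = 31_000

cost_per_user = monthly_cloud_cost_usd / monthly_active_users
print(f"Cloud cost per user per month: ${cost_per_user:.2f}")  # $0.40
```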

Metric #8: Uptime of backend services

There should be an alerting system connected to the uptime metrics, and engineers or DevOps should be notified about failing uptime checks.

Even though we have alerting connected to the uptime checks, we still want to publish the uptime numbers for higher visibility across the whole team. If nothing else, seeing all uptime checks return 100% builds confidence and trust in the system we are building.

Uptime checks example (from GCP)

In practice, uptime is never exactly 100%, because there can be network issues, especially when we ping our endpoints from multiple locations around the world.
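
It helps to translate an uptime target into an allowed-downtime budget so that a number like 99.9% becomes tangible; a small sketch (the targets are just examples):

```python
# Minutes in a 30-day month
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

for target in (0.999, 0.9995, 0.9999):
    allowed_downtime = MINUTES_PER_MONTH * (1 - target)
    print(f"{target:.2%} uptime allows ~{allowed_downtime:.1f} min of downtime per month")
```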

Metric #9: Backend latency

We usually monitor the 99.9th percentile, or thereabouts, to target the slowest requests. If latency unexpectedly goes up, we zoom into the requests that fall within the 99.9th percentile and investigate the root cause.
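
A minimal sketch of the percentile computation itself, on made-up request durations (real systems get this from their monitoring stack, not by hand):

```python
import random

# Hypothetical request durations in milliseconds, e.g. collected over the last hour
durations_ms = [random.uniform(20, 200) for _ in range(10_000)]
durations_ms += [500 + 100 * i for i in range(20)]  # a handful of slow outliers
durations_ms.sort()

def percentile(sorted_values, pct):
    # Nearest-rank percentile: the value below which pct% of the requests fall
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]

print(f"p50:   {percentile(durations_ms, 50):.0f} ms")
print(f"p99.9: {percentile(durations_ms, 99.9):.0f} ms")  # dominated by the outliers
```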

Example of backend latencies

Usually, latency stays about the same. When it goes up, it usually signals a bigger issue that requires engineering involvement.

Metric #10: Percentage of client crashes

Crashes represent situations when a client unexpectedly failed to operate for the user. For example, a mobile app crashed completely, or a web app produced an error in the browser console.

There are always some crashes, and getting to zero is practically impossible.
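
A sketch of the percentage itself, assuming we can count sessions and crashed sessions (made-up numbers; crash-reporting tools such as Firebase Crashlytics report this out of the box):

```python
# Hypothetical weekly numbers from a crash-reporting tool
sessions = 184_000
crashed_sessions = 212

crash_rate = crashed_sessions / sessions
print(f"Crash rate: {crash_rate:.3%} (crash-free sessions: {1 - crash_rate:.3%})")
```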

Metric #11: Pull request metrics

We use the CodeClimate Velocity product, which is a great tool for measuring engineering metrics related to code changes. We found the following metrics most useful.

Pull request cycle time

Pull request cycle time is the time from when the first commit is authored to when the pull request is merged.
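
A minimal sketch of the computation, assuming we already have the two timestamps per pull request (the data is made up; in practice a tool such as CodeClimate Velocity pulls them from the Git host):

```python
from datetime import datetime
from statistics import median

# Hypothetical pull requests: (first commit authored, pull request merged)
prs = [
    (datetime(2022, 7, 1, 9, 0), datetime(2022, 7, 1, 16, 30)),
    (datetime(2022, 7, 1, 11, 0), datetime(2022, 7, 4, 10, 0)),
    (datetime(2022, 7, 2, 14, 0), datetime(2022, 7, 8, 9, 0)),
]

cycle_times = [merged - first_commit for first_commit, merged in prs]
median_days = median(ct.total_seconds() for ct in cycle_times) / 86_400
print(f"Median cycle time: {median_days:.1f} days")  # the benchmark below: under 2 days
```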

An example of a goal set in CodeClimate Velocity: decrease the pull request cycle time

CodeClimate has a good explanation for every metric; here is the one for cycle time:

Why it matters
Cycle Time represents your team’s time-to-market, or how quickly software is delivered to customers. Low Cycle Time often equates to higher output and more efficient teams. It is also correlated with higher stability, giving your team the ability to quickly respond to issues with change.

This is a success metric that you can use across individuals, teams, or cohorts to make certain that every process modification is improving engineering speed.

How to use it
You can treat Cycle Time as your speedometer. Use it to understand baseline productivity, and check any major change in processes against this to make sure it had a positive (or perhaps non-negative) effect on productivity.

This metric is not diagnostic, so to identify why Cycle Time is low, you’ll want to look at metrics that make up smaller components of the software development process, such as Time to Open, Time to Review, Time to Merge, or Review Cycles.

Benchmarks
Top-performing engineering organizations achieve a Cycle Time of under 2 days.

— CodeClimate

Pull request cycle time is probably the best metric. We needed to understand which parts of our pull request flow are weak.

  • Are we slow to open pull requests? If yes, why, and how can we make it faster?
  • Are we slow to review and comment on opened pull requests? If yes, why, and how can we make it faster?
  • Does it take a long time to merge approved pull requests? If yes, why, and how can we make it faster?
  • Are most pull requests too large? If yes, why, and can we make smaller changes instead of big-bang changes?
  • Are there very old pull requests? If yes, why, and how should we process them, or should we just reject them?

There is a separate metric for every item mentioned above.

PR throughput

Another metric that makes a lot of sense to monitor is PR throughput: the count of pull requests merged over a period of time. It is a nice complement to the previous metric.

Why it matters
Each pull request represents a unit of work that has been perceived as having some value by the engineer who submitted it (e.g., implementing a feature, fixing a bug, or improving a part of the code base). Thus, a total count of merged pull requests can serve as a proxy for value delivered.

How to use it
Over time, this metric signals whether your engineering organization is getting more or less productive. This long-term understanding of output and the direction it’s trending helps you see whether changes to the team are having the expected impact.

Pull Request Throughput can be used as a success metric that communicates progress both within and without the engineering department.

Note that this metric is not diagnostic, so to identify why throughput is low, you’ll want to look at metrics that make up smaller components of the software development process, such as Time to Open, Time to Review, Time to Merge, or Review Cycles.

Benchmarks
The top quartile of engineering organizations merge at least 5 times each week.

— CodeClimate
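
Continuing the sketch above, throughput is just a count of merges grouped by calendar week (made-up merge dates):

```python
from collections import Counter
from datetime import date

# Hypothetical merge dates of pull requests
merge_dates = [date(2022, 6, 27), date(2022, 6, 29), date(2022, 7, 1),
               date(2022, 7, 4), date(2022, 7, 6), date(2022, 7, 7)]

# Group by ISO calendar week and count merged PRs per week
per_week = Counter(d.isocalendar()[:2] for d in merge_dates)
for (year, week), count in sorted(per_week.items()):
    print(f"{year}-W{week:02d}: {count} merged PRs")
```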

A few notes about the code change metrics

  1. These metrics do not measure the performance of individuals; they measure how well the team can deliver.
  2. The team should decide whether they want to track the code metrics and what they want to focus on.
  3. The metrics can fluctuate a lot; for example, a frontend team has different dynamics than a backend team. Comparing teams against each other is pointless; each team has to use the metrics to improve its own way of working.
  4. There is no easy way to say what is good or bad, but a team that is trying to be the best-performing team compares itself against a benchmark to see if there might be an opportunity for improvement.

Other metrics and their priorities

Just a few final words: every project is different, and that is why the priorities and the set of metrics need to differ from project to project. There is no “one size fits all” set of engineering metrics that should be followed.
