The Translation Problem
I sit in board meetings with CTOs regularly. The pattern is almost universal. The CTO presents a slide with DORA metrics, deployment frequency charts, maybe a burndown or velocity trend. The CFO nods politely. The CEO asks a question about a feature that shipped late. The board member from the investment firm asks how engineering headcount compares to revenue growth. The DORA slide gets skipped.
The problem is not that engineering metrics are unimportant. The problem is that most engineering metrics measure the internal mechanics of software delivery, and boards think in business outcomes. Deployment frequency is an internal process metric. Delivery predictability is a business metric. They are related, but they are not the same thing, and the board only has bandwidth for the second one.
This creates a real organizational risk. When the board cannot read engineering performance through the metrics the CTO presents, they default to proxy signals: shipped features they can see, production incidents they hear about, and headcount growth versus revenue growth. These proxies are crude and often misleading. A team that ships fewer visible features but is investing in platform reliability looks unproductive through the proxy lens. The CTO knows the investment is critical. The board sees stagnation.
The fix is not more metrics. It is fewer metrics that directly answer the questions the board is actually asking. And those questions stay remarkably consistent whether you are at a 20-person startup or a public company.
The Three Questions Every Board Asks
Strip away the specific wording and every board-level conversation about engineering reduces to three questions:
Are we delivering what we committed to?
This is delivery predictability. Not speed, predictability. Boards can work with an engineering team that ships on a 12-week cycle if the cycle is reliable. They cannot work with a team that promises 6-week delivery and actually delivers in 14. The metric that answers this: commitment reliability rate, defined as the percentage of quarterly commitments delivered on time and to spec. Target 75-85%. Below 70% means your estimation process is broken or your scope management is undisciplined. Above 90% means you are sandbagging and could be committing to more.
Measure this at the initiative level, not the ticket level. Boards do not care about individual stories. They care about "Did the payment integration ship in Q2 as planned?" Track 8-12 quarterly commitments and report the hit rate. This single number tells the board more about engineering health than any velocity chart.
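The hit rate itself is simple arithmetic. The snippet below is a minimal sketch rather than a prescribed tool: the `Commitment` record, its field names, and the sample initiatives are hypothetical, and the interpretation bands simply restate the thresholds given above.

```python
from dataclasses import dataclass


@dataclass
class Commitment:
    """One initiative-level quarterly commitment (hypothetical record shape)."""
    name: str
    on_time: bool
    to_spec: bool


def commitment_reliability(commitments):
    """Return the percentage of commitments delivered on time AND to spec."""
    if not commitments:
        raise ValueError("need at least one commitment to compute a rate")
    hits = sum(1 for c in commitments if c.on_time and c.to_spec)
    return 100.0 * hits / len(commitments)


def interpret(rate):
    """Map a rate onto the bands described in the text."""
    if rate < 70:
        return "below 70%: estimation or scope management needs attention"
    if rate > 90:
        return "above 90%: likely sandbagging"
    if 75 <= rate <= 85:
        return "within the 75-85% target band"
    return "between bands: watch the trend"


# Illustrative data: 10 commitments, 8 delivered on time and to spec.
quarter = [Commitment(f"initiative-{i}", on_time=True, to_spec=True) for i in range(8)]
quarter.append(Commitment("payment-integration", on_time=False, to_spec=True))
quarter.append(Commitment("sso-rollout", on_time=True, to_spec=False))

rate = commitment_reliability(quarter)
print(f"{rate:.0f}% -> {interpret(rate)}")
```

Counting a commitment as a hit only when it is both on time and to spec keeps the number honest: a feature that shipped on the date with half its scope cut does not inflate the rate.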
Is production stable?
This is system reliability, and it is table stakes. The board only hears about reliability when something breaks. Your job is to make reliability visible before incidents happen, so the board understands the investment in keeping systems running. Three metrics: uptime percentage (99.9% is the common bar; falling below it triggers conversations about infrastructure investment), customer-impacting incident count (a downward trend is the goal), and mean time to restore, or MTTR (target under 1 hour for P1 incidents).
Present these as a quarterly trend. A single quarter of 99.95% uptime means nothing. Four consecutive quarters of improving uptime with declining incident count tells a story of engineering maturity. The narrative matters as much as the number.
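For readers who want to see how these numbers are derived, here is a rough sketch. It assumes a 90-day quarter and takes downtime and restore durations in minutes; the function names and sample figures are illustrative, not taken from any particular monitoring tool.

```python
QUARTER_MINUTES = 90 * 24 * 60  # assumption: a 90-day quarter


def uptime_percent(period_minutes, downtime_minutes):
    """Share of the period during which the system was available."""
    if period_minutes <= 0:
        raise ValueError("period must be positive")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def allowed_downtime_minutes(period_minutes, target_percent):
    """Downtime budget implied by an uptime target such as 99.9."""
    return period_minutes * (100.0 - target_percent) / 100.0


def mean_time_to_restore(restore_minutes):
    """Average restore time across incidents, or None if there were none."""
    if not restore_minutes:
        return None
    return sum(restore_minutes) / len(restore_minutes)


# Illustrative quarter: 78 minutes of downtime across three P1 incidents.
p1_restore_minutes = [25, 47, 42]

print(f"uptime: {uptime_percent(QUARTER_MINUTES, 78):.2f}%")
print(f"99.9% budget: {allowed_downtime_minutes(QUARTER_MINUTES, 99.9):.1f} min per quarter")
print(f"P1 MTTR: {mean_time_to_restore(p1_restore_minutes):.0f} min")
```

Translating the uptime target into a downtime budget in minutes often lands better with a board than the percentage alone, because it makes the difference between 99.9% and 99.95% concrete.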
Are we spending engineering dollars efficiently?
This is investment efficiency, and it is where most CTOs struggle. The CFO has precise cost data. Revenue is tracked to the penny. But engineering output is measured in abstract units that do not translate to dollars. The bridge metric is engineering cost as a percentage of revenue. For SaaS companies, the benchmark is 15-25% of revenue for the full engineering org (including QA, DevOps, management). For earlier-stage companies, the absolute number matters less than the trend: is the ratio improving as revenue scales?
Complement this with cost per feature delivered, calculated quarterly. Take total engineering cost for the quarter, divide by the number of material features shipped. This is crude, but it gives the board a directional signal. If cost per feature is rising quarter-over-quarter with stable headcount, you have a productivity or complexity problem worth investigating. If it is stable or declining, engineering is scaling efficiently.
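Both efficiency figures reduce to a single division. The sketch below uses made-up dollar amounts and hypothetical function names, and it leaves the hard part, deciding what counts as a "material" feature, to you.

```python
def engineering_cost_ratio(engineering_cost, revenue):
    """Engineering cost as a percentage of revenue for the same period."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return 100.0 * engineering_cost / revenue


def cost_per_feature(engineering_cost, material_features):
    """Total engineering cost divided by material features shipped."""
    if material_features <= 0:
        raise ValueError("need at least one material feature")
    return engineering_cost / material_features


def within_saas_benchmark(ratio_percent):
    """True when the ratio falls in the 15-25% range cited in the text."""
    return 15 <= ratio_percent <= 25


# Illustrative quarter (all figures hypothetical).
quarterly_engineering_cost = 1_200_000
quarterly_revenue = 6_000_000
features_shipped = 24

ratio = engineering_cost_ratio(quarterly_engineering_cost, quarterly_revenue)
per_feature = cost_per_feature(quarterly_engineering_cost, features_shipped)

print(f"engineering cost / revenue: {ratio:.1f}%")
print(f"cost per feature: ${per_feature:,.0f}")
print(f"within 15-25% benchmark: {within_saas_benchmark(ratio)}")
```

Whatever definition of "material feature" you choose, hold it constant across quarters. The value of this metric is in the trend, and a shifting definition destroys it.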
Leading vs Lagging Indicators
The three board questions above are all answered by lagging indicators: outcomes that have already happened. By the time delivery predictability drops below 70%, you have a problem that started two quarters ago. By the time MTTR spikes, technical debt has already accumulated past the tipping point.
Smart CTOs pair every lagging metric with one or two leading indicators that predict future performance. The board may not care about the leading indicators directly, but they care about the CTO demonstrating that engineering is managed proactively rather than reactively.
| Lagging Metric (Board Sees) | Leading Indicator (CTO Tracks) | Why the Link Matters |
|---|---|---|
| Delivery predictability | Scope change rate per initiative | If scope changes on >30% of committed initiatives mid-quarter, predictability will drop next quarter regardless of team speed |
| Uptime / incident count | Technical debt ratio (% of sprint capacity spent on unplanned maintenance) | When maintenance consumes >25% of capacity, incident frequency rises within 2-3 months |
| MTTR | On-call response time + runbook coverage | Teams with runbooks covering 80%+ of alert types restore 2-3x faster |
| Cost per feature | Cycle time (days from start to production) | Cycle time above 30 days correlates with rising cost per feature due to context-switching, re-planning, and coordination overhead |
| Engineering retention | Developer satisfaction score (quarterly survey) | Teams scoring below 6/10 on satisfaction lose 20-30% of engineers within 12 months |
| Innovation velocity | Percentage of time on new features vs maintenance | Below 40% on new work, teams feel stuck and innovation output drops sharply |
The practical application: present lagging metrics to the board with a brief commentary on what the leading indicators predict for next quarter. "Delivery predictability was 82% this quarter. Scope change rate has increased to 35% on current initiatives, which historically predicts a drop to the mid-70s next quarter unless we tighten scope management." This turns a static report into a forward-looking diagnostic.
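The thresholds in the table can be turned into a simple early-warning check. This is a sketch under stated assumptions: the indicator keys are invented for illustration, and the limits are the rules of thumb from the table, not validated benchmarks.

```python
import operator

# (indicator key, breach test, limit, what the breach tends to predict)
# Keys are hypothetical; limits restate the table above.
RULES = [
    ("scope_change_rate_pct", operator.gt, 30, "delivery predictability at risk next quarter"),
    ("maintenance_capacity_pct", operator.gt, 25, "incident frequency likely to rise in 2-3 months"),
    ("runbook_coverage_pct", operator.lt, 80, "slower restores, MTTR at risk"),
    ("cycle_time_days", operator.gt, 30, "cost per feature likely to rise"),
    ("satisfaction_score", operator.lt, 6, "retention at risk over the next 12 months"),
    ("new_feature_time_pct", operator.lt, 40, "innovation output at risk"),
]


def leading_warnings(indicators):
    """Return a warning string for every indicator that breaches its limit."""
    warnings = []
    for key, breached, limit, message in RULES:
        value = indicators.get(key)
        # Indicators the team does not track yet are simply skipped.
        if value is not None and breached(value, limit):
            warnings.append(f"{key}={value}: {message}")
    return warnings


# Illustrative snapshot matching the commentary above: scope change at 35%.
snapshot = {
    "scope_change_rate_pct": 35,
    "maintenance_capacity_pct": 22,
    "cycle_time_days": 18,
}

for warning in leading_warnings(snapshot):
    print(warning)
```

The output of a check like this is raw material for the board commentary, not the commentary itself. The CTO still has to explain what is being done about each warning.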
Vanity Metrics That Mislead the Board
Vanity metrics are not useless metrics. They are metrics that can be gamed without improving actual engineering effectiveness, and that gap between the number and the reality makes them dangerous when presented to an audience that does not understand how the gaming works.
Deployment Frequency
"We deploy 47 times per day." Impressive until you learn that each deploy is a micro-commit that moves one config flag. High deployment frequency is a symptom of good CI/CD, which matters. But it is not a measure of engineering output. A team deploying twice a week that ships complete, tested features is often more effective than a team deploying 50 times a day with a 12% rollback rate. Present deployment frequency to the engineering team as an operational metric. Do not present it to the board as a productivity metric.
Story Points Completed
Story points are a team-internal estimation tool, not a productivity measure. They are relative within a team and meaningless across teams. "Team A completed 87 points, Team B completed 52 points" tells you nothing about relative productivity because Team A's 1-point story might be Team B's 3-point story. Worse, presenting points to the board creates pressure to inflate estimates. I have watched teams double their "velocity" in two sprints by simply re-calibrating their pointing scale. Nothing changed except the numbers.
Lines of Code
An engineer who deletes 500 lines and replaces them with 200 lines that do the same thing faster has produced negative LoC and significant value. An engineer who writes 2,000 lines of boilerplate that should have been 400 lines with a library has produced impressive LoC and technical debt. Lines of code measures typing speed, not engineering effectiveness.
Number of Features Shipped
Without quality and impact weighting, feature count rewards breadth over depth. A team that ships 15 small features nobody uses looks more productive than a team that ships 3 features driving 40% of revenue growth. Count features if you want. But weight them by business impact or do not present the count at all.
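One possible way to weight the count is shown below. It assumes you can attribute a share of revenue growth, in percentage points, to each feature, which is itself a judgment call. The team data is invented to mirror the example above.

```python
def feature_count(features):
    """Raw count: what an unweighted report would show."""
    return len(features)


def impact_weighted_score(features):
    """Sum of revenue-growth percentage points attributed to each feature."""
    return sum(points for _name, points in features)


# Each entry is (feature name, attributed share of revenue growth in points).
# Attribution values are hypothetical.
team_a = [(f"small-feature-{i}", 0) for i in range(15)]
team_b = [("feature-x", 15), ("feature-y", 15), ("feature-z", 10)]

for label, team in (("Team A", team_a), ("Team B", team_b)):
    print(f"{label}: {feature_count(team)} features, "
          f"{impact_weighted_score(team)} points of revenue growth")
```

The raw count ranks Team A first and the weighted score ranks Team B first, which is exactly the reversal the board needs to see.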
Test Count
"We have 12,000 automated tests." Great, but how many of them test meaningful behavior? Test suites accumulate trivial tests over time. A suite with 3,000 well-targeted tests covering critical paths provides more protection than 12,000 tests where 8,000 are boilerplate unit tests on getters and setters. Track test coverage on critical paths, not total test count.
Building Your Board Metrics Package
Here is the practical framework. Pick one metric from each of the three board questions, add one or two leading indicators you track internally, and present a one-page quarterly report. Resist the temptation to add more metrics. Boards absorb three to five numbers per topic area. Beyond that, attention diffuses and the narrative gets lost.
Delivery
- Commitment reliability: 82% of Q1 initiatives delivered on time (target: 75-85%)
- Leading signal: Scope change rate at 28%, within healthy range
Reliability
- Uptime: 99.94%, up from 99.91% last quarter
- P1 incidents: 3 (down from 5), MTTR 38 minutes (target: <60 min)
- Leading signal: Tech debt ratio at 22%, healthy but approaching 25% threshold
Efficiency
- Engineering cost / revenue: 19.2% (down from 21.4% last year)
- Cost per feature: $47K average, stable quarter-over-quarter
- Leading signal: Cycle time at 18 days, stable
Present quarter-over-quarter trends. A single-quarter snapshot is almost meaningless because it lacks context. Four quarters of data reveal patterns: improving reliability, declining cost efficiency, seasonal delivery variation. The trend is the story. The individual numbers are just punctuation.
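A trend label can be computed mechanically from four quarters of data. The helper below is a minimal sketch: the function names are hypothetical, and only the most recent two quarters of each sample series come from the example report above, with the earlier quarters invented for illustration.

```python
def quarter_over_quarter(values):
    """Differences between each quarter and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def trend(values, higher_is_better=True):
    """Classify a series as improving, declining, flat, or mixed."""
    deltas = quarter_over_quarter(values)
    if not deltas:
        return "insufficient data"
    if not higher_is_better:
        # Flip the sign so that "positive" always means "better".
        deltas = [-d for d in deltas]
    if all(d == 0 for d in deltas):
        return "flat"
    if all(d >= 0 for d in deltas):
        return "improving"
    if all(d <= 0 for d in deltas):
        return "declining"
    return "mixed"


uptime_by_quarter = [99.90, 99.91, 99.91, 99.94]  # higher is better
p1_incidents_by_quarter = [7, 6, 5, 3]            # lower is better

print("uptime:", trend(uptime_by_quarter))
print("P1 incidents:", trend(p1_incidents_by_quarter, higher_is_better=False))
```

A "mixed" result is not a failure of the method. It is a prompt to explain the variation, for example seasonal delivery patterns, rather than to report a single direction.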
Tailoring Metrics by Company Stage
The three board questions remain constant, but the specific metrics shift with company stage.
| Stage | Delivery Focus | Reliability Focus | Efficiency Focus |
|---|---|---|---|
| Seed / Series A | Speed to market (weeks from idea to production) | Basic uptime (99.5% is acceptable) | Burn rate vs feature output |
| Series B-C | Commitment reliability (75-85% hit rate) | 99.9% uptime, MTTR under 1 hour | Eng cost as % of revenue, trending down |
| Growth / Pre-IPO | Predictable quarterly planning cadence | 99.95%+, incident postmortem process | Eng cost per revenue dollar vs public comps |
| Public / Enterprise | Cross-team delivery coordination | SLA compliance, third-party audit trail | R&D capitalization, cost per innovation unit |
Early-stage boards tolerate lower reliability in exchange for shipping speed. Late-stage boards expect reliability as a given and focus on efficiency scaling. The metrics you present should reflect what your board is actually optimizing for at your company's current stage.
What Frameworks Exist Beyond DORA
DORA gave engineering its first widely adopted measurement framework. But it measures one dimension (software delivery performance) when engineering leaders need visibility across several. Other frameworks have emerged to fill the gaps. Each has strengths and trade-offs.
| Framework | What It Measures | Strength | Limitation |
|---|---|---|---|
| DORA | Software delivery performance (speed + stability) | Well-researched, widely benchmarked, simple to implement | Ignores developer experience, business impact, and AI/ML workflows |
| SPACE | Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow | Multi-dimensional, includes developer wellbeing | Complex to implement, subjective dimensions hard to benchmark |
| DX (DevEx) | Developer experience across feedback loops, cognitive load, flow state | Directly predicts retention and productivity | Survey-dependent, hard to tie to business outcomes |
| Engineering Effectiveness | Full cycle from planning through delivery and maintenance | Covers strategic alignment, not just execution | No standard definition; every company implements differently |
My recommendation: use DORA as a baseline for delivery health, supplement with SPACE or DX surveys for developer experience, and build your own board-facing metrics package around the three questions above. No single framework answers every question, and trying to adopt one wholesale usually means measuring things nobody acts on.