"We have great metrics for engineering velocity. We have almost no validated metrics for AI-assisted engineering productivity. That's the gap I'm working on."
DORA metrics measure deployment health, not AI contribution
AI tools change HOW code is written, but metrics need to capture WHETHER the outcome changed
The productivity trap: measuring AI tool usage ≠ measuring AI productivity
What matters: does AI help engineers ship better software faster with fewer defects?
Framework
The DORA + AI Framework
4: DORA key metrics + AI layer
27%: avg deploy frequency increase with AI
18%: change fail rate reduction
DORA core: deploy frequency, lead time, change fail rate, MTTR
AI layer: AI-assisted code %, review time, spec quality score
Business layer: feature velocity, customer-reported quality, engineer satisfaction
The correlation: teams with the best DORA scores adopt AI fastest
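The three layers above can be captured in a single per-team record. A minimal sketch; the field names are illustrative, not part of any standard schema:

```python
from dataclasses import dataclass

# Hypothetical schema for the three measurement layers.
# All names and units here are illustrative assumptions.

@dataclass
class DoraCore:
    deploys_per_week: float
    lead_time_hours: float
    change_fail_rate: float       # fraction of deploys causing a failure
    mttr_hours: float             # mean time to restore service

@dataclass
class AILayer:
    ai_assisted_code_pct: float   # share of merged code written with AI help
    review_time_hours: float
    spec_quality_score: float     # e.g. rubric-scored 0..1

@dataclass
class BusinessLayer:
    features_per_quarter: float
    customer_reported_defects: int
    engineer_satisfaction: float  # survey score, e.g. 0..10

@dataclass
class TeamSnapshot:
    """One team's metrics for one reporting period, across all three layers."""
    team: str
    dora: DoraCore
    ai: AILayer
    business: BusinessLayer
```

Keeping the layers in one record is the point: it lets you join AI-layer signals to DORA outcomes per team instead of tracking tool usage in isolation.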
Nicole's research finding: High-performing engineering teams don't just use AI tools; they have eval processes, code review practices, and measurement systems that make AI tool adoption effective.
What The Data Shows
AI + Engineering Performance
Correlation: Teams with strong eng culture adopt AI 3× faster than low-culture teams
Quality: AI-assisted code has 20% fewer defects when engineers review carefully
Speed: Lead time drops 27% on average, more for routine features
Satisfaction: Engineers using AI report 30% higher job satisfaction (less boilerplate)
The culture multiplier
AI tools amplify engineering culture. Strong cultures get better with AI; weak cultures get worse.
The measurement trap
Teams that measure AI adoption without measuring outcomes create activity metrics, not productivity metrics.
Playbook
Measure AI Productivity Right
Baseline your DORA metrics NOW, before AI tools are widely adopted
Add AI usage data to your deployment pipeline — correlate AI usage to DORA outcomes
Survey engineers monthly: does AI make your job better? What would make it more useful?
Look for the unexpected: where does AI HURT metrics? That's your highest-value signal
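The correlation step of the playbook can be sketched with a plain Pearson correlation over per-team data. The numbers below are made up for illustration; any real analysis would use your own pipeline's per-team metrics:

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Illustrative (invented) per-team data: AI-assisted code share vs. lead time.
ai_usage_pct = [5, 15, 30, 45, 60]
lead_time_hours = [40, 36, 30, 26, 22]

r = pearson(ai_usage_pct, lead_time_hours)
print(f"correlation(AI usage, lead time) = {r:.2f}")  # → -0.99 on this toy data
```

A strongly negative r here would mean higher AI usage tracks shorter lead times; a correlation near zero, or one that flips sign for some teams, is exactly the "AI hurts metrics" signal the last step asks you to hunt for.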
The research agenda: Nicole is building validated AI productivity measurement frameworks, the DORA of AI engineering. Watch this space.
Contrarian
Engineering Productivity Myths
✗ Lines of code is a proxy for productivity → ✓ Lines of code is an anti-metric. AI makes this worse: more code, not necessarily more value.
✗ AI tool adoption = productivity improvement → ✓ AI tool adoption is a leading indicator. Outcome improvement is the actual metric.
✗ Productivity is individual → ✓ Engineering productivity is systemic. Team culture, review processes, and deployment systems matter more than individual tools.
✗ Evals are only for AI products → ✓ Evals are for any system where output quality varies. AI just makes the need obvious.