Article · July 2026

When AI Metrics Become Targets

Do companies track the adoption of AI tools within their engineering teams and assess their impact on productivity? The answer is clearly yes. An entirely new category of engineering and management metrics has already emerged around AI adoption and AI-assisted software development.

In 2026, Google stated that 75% of its new code is AI-generated. Airbnb and Meesho have reported approximately 60% and 70%, respectively. Microsoft's GitHub Copilot provides dedicated metrics around usage, acceptance rates, active users, AI-assisted commits and AI-generated code shares, making it increasingly easy for organizations to analyze both the adoption and the perceived impact of AI-supported development.

This immediately raises a number of important questions:

Is “more AI-generated code” actually a meaningful goal?
Will teams start optimizing for quantity instead of quality?
Could teams inadvertently optimize for code volume rather than business outcomes?
How should ownership be measured in AI-assisted development?
What happens to technical debt?
Does this metric encourage the behaviors we actually want?

And these questions are no longer theoretical. The industry is already debating the limitations and risks of AI-related engineering metrics. There is growing criticism of AI-generated lines of code as a vanity metric, as well as concerns about purely output-oriented measurements such as “AI percentage shipped.”

Engineering organizations have traditionally used metrics such as story points, velocity, deployment frequency, and incident counts to understand delivery performance and operational stability. Now, a new layer of AI-related metrics is emerging: percentages of AI-generated code, AI-assisted code acceptance rates, AI-assisted pull requests and agent-generated changes.

What is the underlying motivation behind these metrics in the first place?

Organizations expect AI to increase productivity, reduce development time, lower costs, increase output, reduce bottlenecks, and accelerate innovation. Especially in software development, AI promises faster implementation, shorter feedback cycles and broader access to knowledge and capabilities. As organizations begin to invest heavily in AI-supported development, they also start asking familiar management questions: How much is AI actually being used? Which teams benefit most? Where are measurable productivity gains visible? And how can the impact of these investments be demonstrated?

As a result, companies almost inevitably measure usage, track adoption, and introduce KPIs and reporting structures around AI-supported development.

From an organizational perspective, this development is almost unavoidable. Organizations rarely introduce strategic technologies without simultaneously creating mechanisms to monitor, evaluate and steer their adoption. What begins as a technological innovation therefore quickly becomes part of the organization's broader steering and governance logic.

Why are these metrics so attractive?

From an organizational perspective, AI-related engineering metrics are highly attractive because they are simple, visible and easy to operationalize. Many of these metrics can be introduced quickly without requiring fundamental changes to existing reporting structures or management processes. Publishing AI adoption figures also sends a strong signal to the market. High percentages of AI-generated code communicate technological progress, innovation capability and organizational modernity. AI metrics therefore serve not only internal steering purposes, but also external positioning and signaling.

At the same time, introducing AI-related KPIs can be pragmatically useful during periods of organizational transformation. Many companies are currently trying to change engineering behavior quickly, encourage experimentation and accelerate learning around AI-supported development. Temporarily measuring adoption can help create organizational focus and push teams to actively engage with new tools and workflows.

There is also a more practical reason why these metrics emerge so quickly: measuring outcomes is significantly harder. It is difficult to directly quantify better architectural decisions, improved collaboration, reduced cognitive load, higher software quality or long-term maintainability. By contrast, AI adoption metrics are comparatively easy to introduce. Organizations can quickly measure tool activation, acceptance rates, AI-assisted commits or percentages of AI-generated code. This simplicity makes such metrics highly appealing for management reporting and organizational steering.

What is the underlying challenge?

Organizations often begin by measuring AI usage, even though what they actually want are better outcomes. This is an important distinction: usage is measured because it is the easier quantity to capture, not because it is the one that matters. Whether AI coding assistants are enabled, how many suggestions are accepted or how much code was AI-generated says little on its own about architectural quality, user experience, complexity or organizational learning.

Once organizations start measuring AI-generated output, developers tend to begin optimizing for the metric itself. As soon as a metric becomes visible, comparable and relevant for organizational steering, people naturally start aligning their behavior around it.

Imagine a company introducing the following metric:

AI-generated code per team (%)

The moment such a metric appears, implicit organizational signals emerge:

high numbers appear modern and progressive,
low numbers appear outdated,
management starts paying attention,
teams begin comparing themselves,
leaders start reporting adoption figures upward.

And once these signals emerge, behavior starts to change. Developers may begin to:

accept more AI-generated suggestions,
use AI tools to generate smaller or unnecessary changes,
apply AI even in situations where it adds little real value,
produce more code instead of less code.

What started as a way to measure engineering performance gradually becomes a mechanism that shapes engineering behavior.
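
The mechanism can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not a description of how any vendor computes its figures: the Change record, its fields and the sample numbers are all hypothetical. It computes a line-based AI share for a team and shows that adding AI-generated changes without value raises the share while the number of valuable changes stays the same.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One merged change. All fields are hypothetical illustration data."""
    ai_lines: int         # lines attributed to an AI assistant
    human_lines: int      # lines written by hand
    delivers_value: bool  # whether the change shipped something users needed


def ai_share(changes):
    """Percentage of all changed lines that are attributed to AI."""
    ai = sum(c.ai_lines for c in changes)
    total = sum(c.ai_lines + c.human_lines for c in changes)
    return 100.0 * ai / total if total else 0.0


def valuable_changes(changes):
    """Count of changes that delivered value (a stand-in for outcomes)."""
    return sum(1 for c in changes if c.delivers_value)


# Assumed baseline: two useful changes with moderate AI assistance.
baseline = [
    Change(ai_lines=40, human_lines=60, delivers_value=True),
    Change(ai_lines=10, human_lines=90, delivers_value=True),
]

# Same useful work, plus AI-generated padding that adds no value.
gamed = baseline + [
    Change(ai_lines=150, human_lines=0, delivers_value=False),
    Change(ai_lines=150, human_lines=0, delivers_value=False),
]

for name, changes in (("baseline", baseline), ("gamed", gamed)):
    print(f"{name}: {ai_share(changes):.0f}% AI share, "
          f"{valuable_changes(changes)} valuable changes")
# baseline: 25% AI share, 2 valuable changes
# gamed: 70% AI share, 2 valuable changes
```

In this toy data the reported share rises from 25% to 70% although nothing more of value was delivered. The metric moved because behavior adapted to it, not because outcomes improved.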

From Measurement to Incentives

The emerging discussion around AI-related KPIs is a strong example of how metrics reshape behavior and incentives. AI is not only changing software development itself. It is also changing measurement systems, organizational expectations and even the definition of productivity. Once organizations begin measuring AI adoption and AI-generated output, these metrics inevitably start influencing priorities, decisions and behavior.

At its core, this reflects Goodhart's Law:

“When a measure becomes a target, it ceases to be a good measure.”

A Better Approach to AI Metrics

Organizations should therefore treat AI-related metrics carefully. AI adoption metrics can provide useful signals, but they should never become isolated performance targets for individual developers or teams. Instead of optimizing for AI-generated output itself, organizations should focus on whether AI actually improves meaningful engineering and business outcomes.

This may include:

delivering customer value faster,
reducing unnecessary coordination and operational friction,
improving system reliability and maintainability,
shortening feedback and delivery cycles,
and enabling teams to work more sustainably and effectively over time.

AI metrics should be interpreted as contextual indicators rather than direct measures of engineering performance. A high percentage of AI-generated code does not automatically indicate better software development, just as a low percentage does not necessarily indicate inefficiency. The central question is not how much code AI generated, but whether its use helped the organization achieve better outcomes. Organizations that fail to make this distinction risk optimizing for visible activity instead of meaningful value creation.
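
One way to treat AI share as context rather than as a target can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the TeamQuarter record, the choice of lead time and change failure rate as outcome indicators, and all sample values are hypothetical and only serve the illustration. The verdict is derived from outcome movement alone; the AI share is carried along for interpretation and never enters the judgment.

```python
from dataclasses import dataclass


@dataclass
class TeamQuarter:
    """Hypothetical per-team figures for one quarter."""
    team: str
    ai_share_pct: float        # adoption signal, reported as context only
    lead_time_days: float      # outcome: time from start of work to production
    change_failure_pct: float  # outcome: share of deployments causing incidents


def outcome_trend(before, after):
    """Judge a team by outcome movement; AI share is attached, not scored."""
    deltas = (
        after.lead_time_days - before.lead_time_days,
        after.change_failure_pct - before.change_failure_pct,
    )
    if all(d == 0 for d in deltas):
        verdict = "unchanged"
    elif all(d <= 0 for d in deltas):
        verdict = "improved"   # lower is better for both indicators
    elif all(d >= 0 for d in deltas):
        verdict = "worsened"
    else:
        verdict = "mixed"
    return {
        "team": after.team,
        "outcomes": verdict,
        "context_ai_share_pct": after.ai_share_pct,
    }


# Assumed sample data: team A raised its AI share sharply, team B only slightly.
a = outcome_trend(TeamQuarter("A", 20, 4.0, 10), TeamQuarter("A", 70, 6.0, 15))
b = outcome_trend(TeamQuarter("B", 20, 4.0, 10), TeamQuarter("B", 30, 3.0, 8))

print(a)  # outcomes: worsened, despite a 70% AI share
print(b)  # outcomes: improved, with a 30% AI share
```

In this invented example the team with the higher AI share is the one whose outcomes worsened, which is exactly the situation a ranking by AI percentage would hide.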