This blog post is also available in German

Teams that have adopted the paradigm of agentic software development produce more code in less time. Studies by DORA and Faros AI consistently show that these teams complete more tasks and epics in the same time, create more and larger pull requests, and merge more pull requests as well.

This raises the question: what to do with the freed-up capacity? The obvious response would be to simply build more in the same amount of time. There are companies marketing themselves as AI-native that are already optimizing their entire pipeline in exactly this direction: a Slack message in which a stakeholder requests a feature becomes a pull request, which goes through review and gets deployed — all fully automated by various agents.

This is an understandable reflex, for several reasons. First, it sounds appealing to keep existing developers busy rather than shrinking development teams in response to higher productivity. And second, many companies have enormous backlogs that seem to offer an endless supply of work, and stakeholders — especially in senior leadership — always have new ideas or feature requests anyway.

Sometimes the business side still takes too long to prepare their requirements. It becomes the bottleneck. So requirements engineering must be accelerated too. BMAD and similar approaches promise “hours instead of weeks” for specifications. Spec-driven development and AI-assisted requirements engineering are supposed to ensure that developers and their agents never get any slack — always busy, always shipping features to production at maximum speed. What emerges is a perfect output treadmill.

Two problems

But this approach has two problems. First, the data from DORA and Faros AI show not only that more code is being produced, but also that code reviews take significantly longer and that software quality in agentically operating teams declines noticeably. The Faros AI report shows that developers in AI-assisted teams produce a significantly higher number of bugs. Compared to the period when AI adoption was still low, the number of incidents has roughly tripled. The DORA report shows an increase in incidents compared to previous years, including in teams that DORA classifies as high performers.

The other problem is that features per unit of time is not a meaningful metric for determining whether you are on the right track. The interesting questions are these: do these features create value for users and for the business? Do they satisfy concrete user needs? Do they contribute to a desired strategic outcome? If a feature was simply someone’s idea — whether a manager, an executive, or a customer — the answers to these questions might still be yes, but that would be more coincidence than design.

Product discovery as a better answer

Reducing this uncertainty — whether a feature actually creates value — is the goal of product discovery: through systematic methods like opportunity mapping, regular short interviews with (potential) users, solution ideation, and assumption testing. Ideally, there is a continuous cycle of discovery and delivery, which in turn feeds new discovery.

In many organizations, however, systematic discovery simply does not happen. The number of unused features built at great expense because someone had a seemingly brilliant idea is considerable: according to a 2019 Pendo study, 80% of features in average software products are rarely or never used.

So why not respond to the increased productivity in delivery by treating it as an opportunity to finally take the essential work of product discovery seriously? Instead of shipping more features in the same period, most of which are not used by anyone, we could simply maintain the old pace and use the freed-up time to systematically test whether the assumptions behind a feature even hold. Does the problem we want to solve actually exist? Interviews or observations with potential users provide clarity. Is the feature even a solution to the problem? There too, we want to reduce our uncertainty before moving into implementation.

AI is also changing the economics of product discovery, and it allows the cycle of discovery and delivery to run faster. But some assumptions can still be tested more quickly and cheaply without an agent generating a single line of code, and without deploying anything to production.

How dramatically things are shifting here is also illustrated by IBM WatsonX, which is reportedly discussing a ratio of one product manager to half a developer, compared with a traditional ratio of one product manager to eight developers, as Melissa Perri describes in a LinkedIn post.

Product management is the new bottleneck. And the answer is not to compress the discovery phase into a few hours using BMAD or similar tools, but to use the space created by faster delivery for more and better discovery work. Fewer features in the pipeline also means fewer pull requests, less review pressure, and fewer incidents. A reversal of the ratio of product managers to developers of this magnitude will have consequences — for team structures, for roles, and for the question of what we as developers actually do.