Does AI-Powered Coding Trade Speed for Technical Debt?

Developers report 10x productivity gains from AI coding agents, yet a Carnegie Mellon study of 806 open-source GitHub repositories found that the gains are short-lived and come with a lasting cost to code quality.

Researchers compared Cursor-adopting projects against 1,380 matched control repositories, tracking code output and quality monthly using SonarQube.

Here are the key findings:

1. The velocity boost is real but disappears fast

Projects saw a 281% increase in lines added and a 55% increase in commits during the first month after Cursor adoption. By month three, both metrics dropped back to pre-Cursor levels. The spike looks great on a dashboard. It just doesn't last.

2. Technical debt accumulates and stays

Static analysis warnings rose by 30% and code complexity increased by 41% on average. Unlike the velocity spike, this decline in quality persisted.

3. That debt creates a self-reinforcing slowdown

The researchers found a feedback loop between quality and velocity. A 100% increase in code complexity caused a 64.5% decrease in future development velocity. A 100% increase in static analysis warnings caused a 50.3% drop in lines added. The two-month speed boost generates enough technical debt to drag down productivity for months afterward.
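
To get a feel for what numbers like these imply for increases smaller or larger than 100%, here is a rough Python sketch. It assumes a constant-elasticity relationship calibrated on the "drop at doubling" figure. That functional form is my assumption, not something taken from the paper, and `velocity_change` is a hypothetical helper name.

```python
import math


def velocity_change(metric_increase: float, drop_at_doubling: float) -> float:
    """Fractional velocity change for a fractional increase in a quality metric.

    ASSUMPTION (not from the paper): velocity = k * metric ** elasticity,
    with the elasticity calibrated so that doubling the metric
    (metric_increase = 1.0) yields exactly -drop_at_doubling.
    """
    if metric_increase <= -1.0:
        raise ValueError("metric_increase must be greater than -1.0")
    if not 0.0 <= drop_at_doubling < 1.0:
        raise ValueError("drop_at_doubling must be in [0, 1)")
    # Solve 2 ** elasticity = 1 - drop_at_doubling for the elasticity.
    elasticity = math.log(1.0 - drop_at_doubling) / math.log(2.0)
    return (1.0 + metric_increase) ** elasticity - 1.0


# Doubling complexity reproduces the reported figure by construction.
print(f"{velocity_change(1.0, 0.645):.1%}")
# A hypothetical 50% rise in static analysis warnings, same assumption.
print(f"{velocity_change(0.5, 0.503):.1%}")
```

Treat the output as an illustration of how such effects scale under one simple model, not as an estimate from the study.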

4. AI writes more complex code than humans

Regardless of codebase size, Cursor-adopting projects had 9% higher code complexity than comparable projects producing the same volume of code. That makes them harder to maintain.

QA has to keep up with higher output. Teams that adopt agentic coding tools without upgrading their processes are borrowing speed from the future.

The paper even suggests tools should consider "self-throttling," reducing suggestion volume when project complexity crosses healthy thresholds.
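
One way to picture "self-throttling" is a suggestion budget that shrinks as measured complexity climbs. The sketch below is purely illustrative: the post does not describe a mechanism, so the thresholds, the linear ramp, and the names `ThrottlePolicy` and `suggestion_budget` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrottlePolicy:
    """Hypothetical policy; none of these values come from the paper."""
    soft_limit: float          # complexity at which throttling starts
    hard_limit: float          # complexity at which the budget hits its floor
    max_suggestions: int = 10  # budget while the project is healthy
    min_suggestions: int = 1   # floor once the hard limit is reached


def suggestion_budget(complexity: float, policy: ThrottlePolicy) -> int:
    """Return how many suggestions to offer at the given complexity score."""
    if policy.hard_limit <= policy.soft_limit:
        raise ValueError("hard_limit must be greater than soft_limit")
    if complexity <= policy.soft_limit:
        return policy.max_suggestions
    if complexity >= policy.hard_limit:
        return policy.min_suggestions
    # Linear ramp between the two limits (an assumed shape).
    fraction = (complexity - policy.soft_limit) / (policy.hard_limit - policy.soft_limit)
    span = policy.max_suggestions - policy.min_suggestions
    return policy.max_suggestions - int(fraction * span)


policy = ThrottlePolicy(soft_limit=10.0, hard_limit=30.0)
for score in (5.0, 15.0, 25.0, 40.0):
    print(score, suggestion_budget(score, policy))
```

In practice the complexity score could come from whatever static analysis a team already runs. The point is only that the budget never rises as complexity does.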

๐—Ÿ๐—ถ๐—ป๐—ฒ๐˜€ ๐—ผ๐—ณ ๐—ฐ๐—ผ๐—ฑ๐—ฒ ๐—ฝ๐—ฟ๐—ผ๐—ฑ๐˜‚๐—ฐ๐—ฒ๐—ฑ ๐—ถ๐˜€ ๐—ป๐—ผ๐˜ ๐˜๐—ต๐—ฒ ๐˜€๐—ฎ๐—บ๐—ฒ ๐—ฎ๐˜€ ๐—ฝ๐—ฟ๐—ผ๐—ด๐—ฟ๐—ฒ๐˜€๐˜€ ๐—บ๐—ฎ๐—ฑ๐—ฒ

What processes has your team put in place to manage code quality alongside AI coding tools?

Source: arxiv.org/abs/2511.04427

Apr 6 at 6:51 AM