Study finds that AI coding tools reduce productivity by 19%, with much of the lost time spent evaluating, revising, and regenerating AI output

With the evolution of generative AI, there are more and more cases where human work is replaced by AI. Even major technology company Microsoft has revealed
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
(PDF file) https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf
Not So Fast: AI Coding Tools Can Actually Reduce Productivity
https://secondthoughts.ai/p/ai-coding-slowdown
METR is a non-profit research institute that evaluates the capabilities of AI models, which some critics say pose 'potentially devastating risks to society.'
METR conducted a rigorous study to measure the productivity benefits of AI tools among experienced developers working on mature projects. The study involved 16 developers with moderate experience using AI, all of whom work on major open source projects.
The developers selected coding tasks from to-do lists totaling 246 tasks and predicted how long each task would take to complete. Each of the 246 tasks was randomly assigned to either 'AI enabled' or 'AI disabled,' and the developers carried out each task with or without AI accordingly. The developers recorded their screens while working, so the time taken to complete each task was measured accurately.
METR estimated how much the AI coding tools changed productivity by comparing completion times for tasks where AI was permitted with completion times for tasks where it was not, using the developers' predicted times as a reference for task difficulty. In tasks where the use of AI was permitted, 84% of screen recordings included the use of at least some AI tool.
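The randomized design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the task times are made-up numbers, the function names are hypothetical, and the ratio of geometric means is one simple way to compare the two groups, not necessarily the estimator METR used.

```python
import math
import random


def assign_conditions(task_ids, seed=0):
    """Randomly assign each task to 'ai_allowed' or 'ai_disallowed'."""
    rng = random.Random(seed)
    return {t: rng.choice(["ai_allowed", "ai_disallowed"]) for t in task_ids}


def geometric_mean(values):
    """Geometric mean, a common summary for skewed durations."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def time_ratio(times_by_condition):
    """Typical time with AI divided by typical time without AI.

    A value above 1.0 means tasks took longer when AI was allowed.
    """
    with_ai = geometric_mean(times_by_condition["ai_allowed"])
    without_ai = geometric_mean(times_by_condition["ai_disallowed"])
    return with_ai / without_ai


# Hypothetical completion times in minutes, NOT data from the study.
observed_minutes = {
    "ai_allowed": [120.0, 60.0, 240.0],
    "ai_disallowed": [100.0, 50.0, 200.0],
}

ratio = time_ratio(observed_minutes)
print(f"Change in completion time with AI: {(ratio - 1) * 100:+.0f}%")
```

Because assignment is random, a systematic difference in completion time between the two groups can be attributed to the availability of AI rather than to which tasks developers chose to use it on.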
The graph below compares predictions of 'how productivity changes by using AI' with the result actually observed in the study. The observed result was an overall average productivity decrease of 19%. By contrast, economics experts and machine learning experts each predicted that productivity would increase by about 40%. The developers who participated predicted an increase of about 24% before the study and still estimated an increase of about 20% after it, even though they had actually experienced a decrease.
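To get a feel for the size of the gap between perception and measurement, the figures can be applied to a single task. The sketch below assumes both percentages are read as changes in completion time (20% less time as perceived, 19% more time as measured), and the 60-minute baseline is a hypothetical value chosen for illustration.

```python
def time_with_ai(baseline_minutes, change_in_time):
    """Completion time after applying a fractional change to the baseline."""
    return baseline_minutes * (1.0 + change_in_time)


# Hypothetical task that takes 60 minutes without AI (assumption).
baseline = 60.0

# Developers' after-the-fact estimate: about 20% less time with AI.
perceived = time_with_ai(baseline, -0.20)

# Measured result: about 19% more time with AI.
actual = time_with_ai(baseline, 0.19)

# How much longer the task really took than the developer believed.
gap = actual / perceived

print(f"Perceived time with AI: {perceived:.1f} min")
print(f"Actual time with AI:    {actual:.1f} min")
print(f"Actual / perceived:     {gap:.4f}")
```

Under these assumptions, a task the developer believes took about 48 minutes actually took a little over 71 minutes, so the real time is nearly one and a half times the perceived time.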

Steve Newman, founder of the writing tool Writely, which was acquired by Google, said of the findings, 'Anyone who reports that AI has accelerated their work may be wrong' and 'These results are incredibly dire,' adding, 'This study doesn't expose AI coding tools as frauds, but it does remind us that AI has important limitations, at least for now.'
In some cases, subjects were instructed to 'use AI to the extent that you think will maximize your productivity,' which may have led some subjects to lean too heavily on AI and lose productivity as a result. However, the tasks presented to the subjects were divided among 'use AI as usual' (70 tasks), 'use AI' (119 tasks), and 'use AI as much as possible' (57 tasks), and only a few tasks forced the use of AI.
In addition, although the subjects' experience with AI varied, METR gave them opportunities to learn how to use AI tools, such as a course on how to use Cursor Pro at the start of the study.
The survey was conducted from February to June 2025, and the subjects were using the latest AI coding tools, such as

By analyzing interviews with developers and screen recordings, the study identified several reasons why AI reduces productivity. The biggest problem is that the code generated by AI tools generally does not meet the high standards of open source projects. Developers spend a lot of time reviewing AI output, and in some cases have to repeat the same cycle over and over: giving the AI additional instructions, waiting for code generation, discarding the output if it has a fatal flaw, and instructing the AI again. In fact, only 39% of the code output by Cursor was reportedly used by developers, and even that 39% was not used as is but was reviewed and reworked by developers.
The graph below shows the percentage of time developers spend on each task when they use AI (green) and when they do not (purple). The vertical axis shows the percentage of time spent on each task, and the horizontal axis shows each task. From the left, the tasks are 'checking AI output,' 'instructions to AI,' 'waiting for AI output,' 'writing code,' 'reading/researching,' 'testing/debugging,' 'Git, environment,' and 'miscellaneous tasks.' Using AI creates tasks that do not exist otherwise, namely 'checking AI output,' 'instructions to AI,' and 'waiting for AI output,' and developers end up spending just over 20% of their total work time on them. On the other hand, the time spent actually writing code is reduced by about 10%.

Regarding the 19% drop in productivity, Newman said, 'While this may seem depressing at first glance, it applies to a scenario that is difficult for AI tools (experienced developers working on complex code bases with high quality standards) and may be partially explained by developers choosing a more relaxed pace to conserve energy or working more thoroughly with AI. Of course, results will improve over time.' 'Perhaps most importantly, developers thought they were completing tasks 20% faster, even though using AI meant they were completing them 19% slower. While many assessments of the impact of AI are based on surveys and anecdotal reports, there is hard data in this study showing that such results can be significantly misleading.'