Why AI can't actually build software

Many benchmark results have emerged showing that large language models (LLMs) such as ChatGPT have reached human-level coding ability, fueling a trend toward using AI in software development. However, Conrad Irwin argues on Zed's blog that LLMs still cannot actually build software.
Why LLMs Can't Really Build Software — Zed's Blog
https://zed.dev/blog/why-llms-cant-build-software

Irwin, having observed over the years how software engineers work, says that they constantly loop through the following cycle (sketched in code after the list):
1. Build a mental model of the requirements
2. Write code that meets the requirements
3. Build a mental model of what the code actually does
4. Identify the differences and update the code or the requirements
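
To make the cycle concrete, here is a minimal, purely illustrative Python sketch of the loop. It reduces a "mental model" to a set of expected behaviours; every name in it (build_model, write_code, engineering_loop) is a hypothetical stand-in, not anything from Zed's post or codebase.

```python
# A minimal, purely illustrative sketch of the loop above. Mental models
# are reduced to sets of expected behaviours; every name here is a
# hypothetical stand-in, not anything from Zed's post or codebase.

def build_model(artifact: set[str]) -> set[str]:
    """Stand-in for 'build a mental model': here, just observe behaviours."""
    return set(artifact)

def write_code(model: set[str]) -> set[str]:
    """Stand-in for writing code: implement the model, imperfectly."""
    behaviours = set(model)
    behaviours.discard("handles empty input")  # a plausible first-draft bug
    return behaviours

def engineering_loop(requirements: set[str]) -> set[str]:
    requirements_model = build_model(requirements)   # 1. model the requirements
    code = write_code(requirements_model)            # 2. write code to match it
    while True:
        behaviour_model = build_model(code)          # 3. model what the code does
        gaps = requirements_model - behaviour_model  # 4. identify the differences
        if not gaps:
            return code                              # the models agree: done
        code |= gaps                                 # update the code (a real
                                                     # engineer might instead
                                                     # update the requirements)

if __name__ == "__main__":
    reqs = {"parses dates", "handles empty input", "rejects bad input"}
    print(sorted(engineering_loop(reqs)))
```

The point of the sketch is step 3: the loop only terminates because the engineer keeps an accurate model of what the code actually does, not just of what it was meant to do.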
'The hallmark of a competent software engineer is the ability to build and maintain a clear mental model. LLMs, on the other hand, cannot do this. They are very good at writing code and, to a certain extent, at updating it when they identify problems. They can do the things that real software engineers do: read code, run tests, add logging, and so on. But what they cannot do is maintain a clear mental model,' Irwin said.
Irwin's impression is that LLMs 'get endlessly confused: they assume the code they write actually works; when a test fails, they have to guess whether to fix the code or the test; and when they get frustrated, they just delete the whole thing and start over.' He argues that this is a major difference from humans: LLMs cannot step back, reassess the context, and work out what the problem actually is.
Commenters on Hacker News add that humans 'can take a step back, look at the bigger picture, and identify the root cause of the problem.'

'Human software engineers test their work as they go. When a test fails, they can check the failure against their mental model and decide whether to fix the code, fix the test, or first gather more data. When they get frustrated, they can seek help by talking things through. And sometimes they will just delete everything and start over, with a clearer understanding of the problem,' Irwin said.
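
To make that contrast concrete, here is a small, equally hypothetical sketch of the decision Irwin describes: on a test failure, the human consults the mental model rather than guessing. The argument name and the three-way split are illustrative assumptions, not code from the post.

```python
from enum import Enum, auto
from typing import Optional

class Action(Enum):
    FIX_CODE = auto()
    FIX_TEST = auto()
    GATHER_DATA = auto()

def on_test_failure(model_says_code_is_right: Optional[bool]) -> Action:
    """Consult the mental model instead of guessing when a test fails.

    The argument is a hypothetical stand-in for what the engineer's mental
    model predicts: True if the code should be correct (so the test is
    suspect), False if the code is wrong, None if the model cannot say.
    """
    if model_says_code_is_right is None:
        return Action.GATHER_DATA  # model unclear: add logs, reproduce, read more
    if model_says_code_is_right:
        return Action.FIX_TEST     # code matches the requirements; the test is stale
    return Action.FIX_CODE         # code diverges from the requirements
```

In Irwin's framing, an LLM lacks the reliable mental model that would supply this argument, which is why it ends up guessing between the first two branches.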
On the other hand, LLMs have the following drawbacks:
・Context omission: models are bad at noticing missing context
・Recency bias: the more recently a piece of information appears in the context window, the more likely the model is to treat it as correct
・Hallucination: models assert details they invented as if they were fact

'When the requirements are clear and the problem is simple, an LLM can get it done in one shot,' Irwin concludes. 'But on more complex tasks, LLMs cannot maintain context accurately and cannot iterate their way to a solution.'
On Hacker News, comments included: 'LLMs are certainly immature, but they have the same abilities as human junior engineers,' 'They often fail because they are expected to achieve the same results as humans in a limited context, but they will improve if given all the tools that human software engineers use,' and 'LLMs may not be able to develop software at present, but they have evolved tenfold since ChatGPT was introduced in 2022, so they will likely be able to do so in the future.'