What is 'agentic engineering,' a way of developing software with the help of coding AI agents?



Having AI write code is no longer unusual, but AI tools have recently emerged that can also execute the code they write and correct it based on the results. Developing software with the support of such AI is called 'agentic engineering.' Developer Simon Willison argues that with coding AI agents such as Claude Code, OpenAI Codex, and Gemini CLI, humans will take on a stronger role in 'deciding what to build,' 'preparing the necessary tools,' and 'verifying the results.'

What is agentic engineering? - Agentic Engineering Patterns - Simon Willison's Weblog
https://simonwillison.net/guides/agentic-engineering-patterns/what-is-agentic-engineering/



◆What is agentic engineering?
Willison explains that agentic engineering is an approach to software development in which you work with the support of a 'coding AI agent' that can not only write code but also execute it and verify the results.

In this context, 'agent' refers to a system where, when instructions are given to an AI, that AI proceeds with the task, using external tools as needed. In the case of a coding AI agent, the tools include the functionality to execute code.

According to Willison, the ability to 'actually run the code' is the crucial element that makes agentic engineering possible. If AI could only output code, its uses would be limited. When it can run the code itself, observe the results, and make corrections, the trial and error needed to arrive at usable software becomes much easier.
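The run, observe, correct cycle described above can be sketched in a few lines of Python. This is only an illustration, not how any particular agent is implemented: `run_code`, `agent_loop`, and `fake_model` are hypothetical names, and `fake_model` is a hard-coded stand-in for a real model call so that the example runs on its own.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def run_code(source: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run Python source in a subprocess and return (succeeded, output)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    return proc.returncode == 0, proc.stdout + proc.stderr


def agent_loop(
    propose: Callable[[Optional[str]], str], max_attempts: int = 3
) -> Optional[str]:
    """Ask for code, run it, and feed any failure output back in."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        source = propose(feedback)      # "write the code"
        ok, output = run_code(source)   # "actually run the code"
        if ok:
            return source               # first candidate that runs cleanly
        feedback = output               # observed error drives the next try
    return None


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: buggy first, fixed after feedback."""
    if feedback is None:
        return "print(1 / 0)\n"
    return "print(1 / 1)\n"


if __name__ == "__main__":
    print(agent_loop(fake_model))
```

A clean exit code is a deliberately weak success check. In practice the human still decides what 'working' means, for example by supplying tests for the agent to run.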

Willison also emphasizes that 'writing code' is only one part of a software developer's job, and that what truly matters is deciding which problem to solve and how to solve it. Even once AI can generate large amounts of code, the human role does not disappear. Willison states that the work of preparing the necessary tools, communicating the problem clearly, verifying the results, and refining them into something reliable will remain.



◆Writing code is now cheap.
One of the premises Willison cites for agentic engineering is the recognition that 'the cost of writing code is now low.' Previously, it was not uncommon for a developer to spend a full day or more writing a few hundred lines of clean, tested code. Design, estimation, and prioritization were therefore all built on the assumption that 'the cost of writing code is high.'

However, coding AI agents significantly reduce the effort required for humans to type code on a keyboard. Moreover, by running multiple agents simultaneously, a single developer can handle multiple implementations, code organization, testing, and documentation tasks concurrently.

Even so, Willison points out that while the cost of producing new code may approach zero, the cost of producing good code does not disappear. His criteria for good code include: 'it works correctly,' 'it can be verified that it works correctly,' 'it properly solves the required problem,' 'its behavior in case of errors is reasonable,' 'it is simple and easy to fix in the future,' 'it has proper testing,' and 'the documentation is up-to-date.' AI lowers the cost of generating code, but the quality checks and problem definition needed to produce good code are not automated away.

Willison states that because 'the cost of writing code has decreased,' individuals and organizations also need to change their habits. He suggests that even for tasks previously put off as 'too time-consuming,' it is worth first letting an AI agent try them.



◆Accumulate the things you know you can do.
As a key principle of agentic engineering, Willison advises: 'Accumulate what you know you can do.' In software development, knowing 'what can be done' and 'how to do it' is a powerful asset.

He adds that this knowledge is even more useful when you not only know something is theoretically possible but have also verified it with working code. As places to accumulate such knowledge, Willison cites his blog, notes recording what he has learned, GitHub repositories, and collections of HTML tools created with AI.

These serve not only as personal notes but also as a collection of examples to give to an AI agent when building something new. Willison explains that one of his favorite methods is to 'combine two or more existing, working things to create something new.'



◆AI should be used to create better code.
A common concern is that letting AI write code will lower quality. Willison argues, however, that if introducing a coding AI agent does reduce quality, the cause is not that 'it's unavoidable since we used AI,' but that something in the development process needs to be fixed.

Willison emphasizes the importance of avoiding technical debt from the outset. Technical debt refers to problems in code that, if left unaddressed, become a major burden later. Examples include near-duplicate logic piling up, similar functions multiplying over time, and files growing too large.

Willison states that this kind of code cleanup and improvement is exactly what a coding AI agent is suited for. The cost of improving code has dropped significantly, making it easier to fix even small problems on the spot rather than leaving them unresolved.

Willison also emphasizes the value of using coding AI agents to try multiple approaches. Problems in code arise not only from mistakes during implementation, but also from overlooking a simpler method during the design phase or choosing a technology that does not suit the job.

What Willison particularly emphasizes is the ability to try out many prototypes. For example, to determine whether Redis, a data store designed for high-speed processing, is suitable for a system that receives a large volume of traffic, it is more reliable to build a prototype and test it than to decide through discussion alone. Willison says that a coding AI agent makes it cheaper to create multiple prototypes for this kind of verification, which makes it easier to reduce oversights during the design phase.
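To make the 'prototype it and measure' idea concrete, here is a minimal sketch of a throwaway comparison harness. It is not taken from Willison's guide: the function names are hypothetical, and an in-memory `dict` and an in-memory SQLite database stand in for the candidate stores so that the example needs only the standard library. A real evaluation of Redis would plug an actual client into the same `write` callback and use a realistic workload.

```python
import sqlite3
import time
from typing import Callable, Dict


def time_writes(write: Callable[[str, str], None], n: int) -> float:
    """Return the seconds taken to perform n key/value writes."""
    start = time.perf_counter()
    for i in range(n):
        write(f"key:{i}", "value")
    return time.perf_counter() - start


def compare_backends(n: int = 10_000) -> Dict[str, float]:
    """Time the same write workload against two stand-in backends."""
    memory_store: Dict[str, str] = {}
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def write_dict(key: str, value: str) -> None:
        memory_store[key] = value

    def write_sqlite(key: str, value: str) -> None:
        conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    results = {
        "dict": time_writes(write_dict, n),
        "sqlite": time_writes(write_sqlite, n),
    }
    conn.close()
    return results


if __name__ == "__main__":
    for name, seconds in compare_backends().items():
        print(f"{name}: {seconds:.4f}s")
```

The numbers from a toy like this say nothing about production behavior. The point is that the comparison is cheap enough to run before the design discussion rather than after it.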



◆Anti-patterns to avoid
Willison also points out harmful ways of using AI, the prime example being handing unverified AI-generated code to a fellow developer. He strongly criticizes this, saying that submitting hundreds or thousands of lines of code you have not personally reviewed as a proposed change is essentially pushing your own work onto someone else.

Willison lists the following as characteristics of a good code change proposal: 'being confident that the code works,' 'making small, easily verifiable changes,' 'understanding the purpose of the changes,' and 'reviewing even the explanations written by the AI.' He also suggests including evidence of your own verification, such as notes, screenshots, and videos of your manual testing.

In other words, even if you can generate a large amount of code with an AI agent, the responsibility for reviewing it before showing it to others rests with you. Willison argues that passing along code, together with its AI-generated explanation, without reviewing it wastes the reviewer's time.

in AI, Software, Posted by log1b_ok