Anthropic details how it built its multi-agent Claude Research system, which significantly outperformed single-agent systems in internal evaluations

Anthropic's AI model '
How we built our multi-agent research system \ Anthropic
https://www.anthropic.com/engineering/built-multi-agent-research-system

Anthropic: How we built our multi-agent research system
https://simonwillison.net/2025/Jun/14/multi-agent-research-system/
In April 2025, Anthropic announced that it had introduced a new feature called 'Research' to its chat AI 'Claude,' which performs detailed research and analysis according to user instructions. Claude can now be integrated with Google Workspace, including Gmail and Google Calendar, allowing Claude to search users' emails, check documents, and check calendar events for more personalized searches.
Claude adds 'Research' function, making it possible to infer not only data on the web but also the contents of Gmail and Google Calendar - GIGAZINE

While a typical AI model produces useful results from a single prompt handled by a single agent, Claude's Research feature combines multiple agents that run prompts in parallel, allowing it to handle tasks more complex than a single agent could manage.
Anthropic explains why: 'The essence of search is compression, that is, extracting insights from a huge database. Subagents work in parallel to facilitate compression, searching different aspects of a question simultaneously and condensing the most important tokens.' For example, given the task of 'identifying all directors of an IT company that meet certain criteria,' a single-agent system performs a 'sequential search,' checking data against the criteria one item at a time from the beginning, which was slow and failed to find the answer. In Anthropic's internal research evaluation, however, the multi-agent system found the correct answer by breaking the task down into subtasks for the subagents.
Specifically, a multi-agent system with Claude Opus 4, from the 'Claude 4' family announced in May 2025, as the lead agent and Claude Sonnet 4 as the subagents scored 90.2% higher than single-agent Claude Opus 4 in an internal research evaluation.
Anthropic goes on to detail the prompt engineering required to build a truly effective system. Anthropic's diagram of a typical multi-agent architecture shows how it works: when a user enters a query into Claude, the request is sent to the multi-agent system, where the lead agent and subagents interact to process it and return the results to the user.

Anthropic's system is not a typical multi-agent system: it uses multi-step search to dynamically discover relevant information, adapt to new findings, and analyze the results to generate high-quality answers. Anthropic's diagram of the workflow shows that when a user submits a query, the system creates a 'LeadResearcher agent' that starts an iterative research process and then creates dedicated subagents, each with a specific research task.
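The lead-agent and subagent workflow described above can be illustrated with a minimal Python sketch. This is not Anthropic's implementation: the names `plan_subtasks`, `run_subagent`, `lead_researcher`, and `research` are hypothetical, the subagent is a stub that returns canned text instead of calling a model or search tool, and only a single round of delegation is shown rather than an iterative loop.

```python
import asyncio


def plan_subtasks(query: str) -> list[str]:
    # Hypothetical planning step: a real lead agent would ask a model to
    # decompose the query; here we just create three labeled aspects.
    return [f"{query} (aspect {i})" for i in range(1, 4)]


async def run_subagent(task: str) -> str:
    # Hypothetical subagent stub: a real one would search and condense
    # its findings; this one only echoes the task it was given.
    await asyncio.sleep(0)  # placeholder for model and tool latency
    return f"findings for: {task}"


async def lead_researcher(query: str) -> str:
    subtasks = plan_subtasks(query)
    # Subagents run concurrently, each on a different aspect of the query.
    # asyncio.gather returns results in the same order as the subtasks.
    findings = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    # The lead agent combines the condensed findings into one answer.
    return "\n".join(findings)


def research(query: str) -> str:
    # Synchronous entry point for the sketch.
    return asyncio.run(lead_researcher(query))
```

Calling `research("board members of IT companies")` returns one line of findings per subtask. In a real system the lead agent would inspect those findings and decide whether to spawn further subagents before answering.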

Early versions of the agents made errors such as spawning 50 subagents for a simple query, or agents distracting each other with excessive updates. In Anthropic's system, the lead agent breaks the query down into subtasks and explains each one to the subagents, so that an appropriate combination of agents is assigned to the work.
Multi-agent systems work because they spread enough tokens across the agents to solve the problem, effectively letting them spend tokens on tasks that exceed the limits of a single agent. However, these architectures have the drawback of burning through tokens quickly, even in Anthropic's relatively efficient system. In Anthropic's data, agents use about four times as many tokens as chat interactions, and multi-agent systems use about 15 times as many tokens as chat interactions. Therefore, to be economically viable, multi-agent systems need to be assigned tasks whose value is high enough to justify the added cost.
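To make the token figures above concrete, here is a small arithmetic sketch in Python. The multipliers of 4 and 15 come from the figures Anthropic reports; the baseline of 1,000 tokens per chat interaction is a purely hypothetical number chosen for illustration.

```python
# Hypothetical baseline: tokens used by one ordinary chat interaction.
CHAT_BASELINE_TOKENS = 1_000

# Approximate multipliers relative to chat, as reported by Anthropic.
AGENT_MULTIPLIER = 4
MULTI_AGENT_MULTIPLIER = 15


def estimated_tokens(baseline: int, multiplier: int) -> int:
    # Scale the chat baseline by the reported multiplier.
    return baseline * multiplier


single_agent = estimated_tokens(CHAT_BASELINE_TOKENS, AGENT_MULTIPLIER)
multi_agent = estimated_tokens(CHAT_BASELINE_TOKENS, MULTI_AGENT_MULTIPLIER)

# How many times more tokens a multi-agent run uses than a single agent.
ratio = multi_agent / single_agent
```

Under these assumptions a multi-agent run uses 15,000 tokens against 4,000 for a single agent, or 3.75 times as many, which is why the task needs to be valuable enough to justify the extra spend.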
'Despite the challenges, multi-agent systems have proven valuable in open-ended research tasks. In fact, Claude has helped people discover business opportunities they had never considered before, explore complex medical options, solve pesky technical bugs, and save them days of work by uncovering research connections they would not have found on their own. We are already seeing these systems transform how we solve complex problems,' said Anthropic.
in Software, Posted by log1e_dh