Anthropic suddenly releases Claude 3.5 Sonnet, benchmark results rival GPT-4o



Anthropic, the developer of the chatbot AI 'Claude,' announced its new model, 'Claude 3.5 Sonnet,' on June 21, 2024. This is the first model in the upcoming Claude 3.5 family, and its benchmark results are said to be comparable to those of OpenAI's GPT-4o.

Introducing Claude 3.5 Sonnet \ Anthropic
https://www.anthropic.com/news/claude-3-5-sonnet

According to Anthropic, Claude 3.5 Sonnet sets new marks for graduate-level reasoning, undergraduate-level knowledge, and coding proficiency. Compared to previous Claude models such as Claude 3 Opus, Anthropic claims that Claude 3.5 Sonnet is significantly better at grasping nuance, humor, and complex instructions, and at writing high-quality content in a natural, approachable tone.



Claude 3.5 Sonnet offers better performance and cost efficiency than Claude 3 Opus, and runs twice as fast. In an agentic coding evaluation, Claude 3.5 Sonnet solved 64% of the problems, while Claude 3 Opus solved 38%.

According to the benchmark results published by Anthropic, Claude 3.5 Sonnet achieved results equal to or better than GPT-4o in five of the eight categories: graduate-level reasoning (GPQA), knowledge (MMLU), coding (HumanEval), multilingual math (MGSM), and reasoning over text (DROP).



Anthropic also says that Claude 3.5 Sonnet is its strongest vision model yet, surpassing Claude 3 Opus on standard vision benchmarks. The improvement is most noticeable in tasks that require visual reasoning, such as interpreting charts and graphs, and the model can also accurately transcribe text from imperfect images. Anthropic has released a video in which Claude 3.5 Sonnet actually performs visual tasks.

Claude 3.5 Sonnet for vision - YouTube


Below is a table comparing visual reasoning benchmark results with those of Claude 3 Opus, GPT-4o, and Gemini 1.5 Pro.



In addition, Anthropic announced a new feature in Claude.ai called 'Artifacts.' When you ask Claude to generate content such as code, text documents, or website designs, Artifacts displays that content in a dedicated window next to the conversation rather than inside the answer. For more information on Artifacts, please see the following video.

Claude 3.5 Sonnet for sparking creativity - YouTube


Regarding safety and privacy, Anthropic had the UK Artificial Intelligence Safety Institute (UK AISI) evaluate the safety of Claude 3.5 Sonnet and made repeated improvements before launching it. Furthermore, Anthropic says that it has integrated policy feedback from external experts so that Claude 3.5 Sonnet can better handle various kinds of misuse.

Claude 3.5 Sonnet is available for free on Claude.ai and the Claude iOS app. Paid Claude Pro and Team subscribers get higher rate limits. It is also available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI, priced at $3 per million input tokens and $15 per million output tokens.
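
To make the per-token pricing concrete, the following is a minimal Python sketch that estimates the cost of a single API request from the two prices quoted above. The function name estimate_cost_usd and the token counts in the example are hypothetical and chosen only for illustration. The sketch assumes that input and output tokens are billed linearly at the quoted rates and ignores any other charges or discounts.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MTOK = Decimal("3")
OUTPUT_USD_PER_MTOK = Decimal("15")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear billing at the quoted per-token prices.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MTOK * input_tokens / TOKENS_PER_MTOK
    output_cost = OUTPUT_USD_PER_MTOK * output_tokens / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative token counts, not measurements of any real request.
    print(estimate_cost_usd(10_000, 2_000))
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens comes to $0.03 for input plus $0.03 for output. Because output tokens cost five times as much as input tokens, long responses affect the bill far more than long prompts of the same length.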

Anthropic plans to keep improving the intelligence, speed, and cost of the Claude 3.5 model family, and to release Claude 3.5 Haiku and Claude 3.5 Opus in late 2024.

in AI, Video, Software, Web Service. Posted by log1i_yk