Jan 31, 2025 11:10:00

Mistral AI releases 'Mistral Small 3,' a latency-optimized AI model capable of high-speed inference

Mistral AI, a French AI startup, has released a latency-optimized AI model called 'Mistral Small 3' under an open source license. The new model can also serve as a base for building even more capable models with techniques such as reinforcement learning.

Mistral Small 3 | Mistral AI | Frontier AI in your hands

https://mistral.ai/news/mistral-small-3/

The following chart compares Mistral Small 3 with GPT-4o mini, Gemma-2 27B, and Qwen-2.5 32B, with performance on the vertical axis and latency on the horizontal axis. Mistral Small 3 sits in the upper left of the chart, meaning it produces high-quality answers quickly. By reducing the number of layers, Mistral Small 3 can generate 150 tokens per second on four NVIDIA H100 GPUs.
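To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch in Python. The 150 tokens per second figure comes from the article; the function name and the response lengths are illustrative assumptions, and the estimate ignores prompt processing and network overhead.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float = 150.0) -> float:
    """Estimate decode time for a response of `num_tokens` tokens.

    Hypothetical helper for illustration only: it divides the token count
    by a constant throughput and ignores prompt processing time.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # Response lengths below are assumed examples, not measurements.
    for n in (50, 150, 600):
        print(f"{n} tokens: about {generation_time_seconds(n):.2f} s")
```

At this rate, a 150-token answer takes roughly one second to generate, which is why throughput matters for interactive use.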

In a human evaluation where raters were asked which model's answer they preferred, Mistral Small 3 was rated higher than Gemma-2 27B and Qwen-2.5 32B and was comparable to GPT-4o mini. The results also show that it can compete with larger models such as Llama-3.3 70B.

The figure below shows results on the Massive Multitask Language Understanding (MMLU) benchmark and the GPQA benchmark, which evaluates advanced reasoning ability. These results also show performance on par with Llama-3.3 70B and GPT-4o mini.

Similar results were seen in other benchmarks covering coding, mathematics, general knowledge, and instruction following.

The following compares the pretrained base model, before fine-tuning, with other models; it performs on par with larger models such as Llama 3.1 70B.

Its multilingual capabilities also hold up against the competition.

Once quantized, Mistral Small 3 can run on a single RTX 4090. It is also said to be well suited to tasks that require quick responses.
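A rough calculation shows why quantization matters here. The sketch below assumes the model has about 24 billion parameters (Mistral's published size for Mistral Small 3) and that an RTX 4090 has 24 GB of VRAM; neither figure is stated in this article. It counts only the weights, so real memory use is higher once the KV cache and activations are included.

```python
PARAMS = 24e9    # assumed parameter count (not stated in the article)
VRAM_GB = 24.0   # assumed RTX 4090 memory (not stated in the article)


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits_in_vram(num_params: float, bits_per_param: int, vram_gb: float) -> bool:
    """True only if the weights leave some headroom below the VRAM limit."""
    return weight_memory_gb(num_params, bits_per_param) < vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(PARAMS, bits)
        verdict = "fits" if fits_in_vram(PARAMS, bits, VRAM_GB) else "does not fit"
        print(f"{bits}-bit weights: {gb:.0f} GB -> {verdict} in {VRAM_GB:.0f} GB")
```

Under these assumptions, 16-bit weights need about 48 GB and 8-bit weights already consume the full 24 GB, while 4-bit weights need about 12 GB, leaving room for the rest of the inference workload on a single card.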

Mistral Small 3 is available for download from platforms such as Hugging Face and Ollama. It is distributed under the Apache 2.0 license, so commercial use is permitted. Smaller and larger models with further enhanced reasoning capabilities are planned for release in the future.

Mistral is the only European company with a significant presence in large language models, a type of general-purpose AI system, but its market share as of 2024 is estimated at only 5%, and it has been losing ground to American companies. The Financial Times reports that whether Mistral can bounce back from this position is crucial for Europe to maintain its influence in AI.

Jan 31, 2025 11:10:00 in AI, Software, Web Service, Posted by log1d_ts