The updated DeepSeek-V3, 'DeepSeek-V3-0324', scores higher in all tests, and some are calling it the 'best non-reasoning model.'

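Chinese AI company DeepSeek has released 'DeepSeek-V3-0324', an updated version of its large language model DeepSeek-V3.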
deepseek-ai/DeepSeek-V3-0324
https://simonwillison.net/2025/Mar/24/deepseek/

DeepSeek's New 641GB AI Model Lands Quietly — and Runs Surprisingly Fast on a Mac - WinBuzzer
DeepSeek-V3 was announced in December 2024 as one of the largest large language models to date, with 671 billion parameters.
Chinese AI company DeepSeek releases AI model 'DeepSeek-V3' comparable to GPT-4o, with an astonishing 671 billion parameters - GIGAZINE

DeepSeek-V3 was distributed under a custom license, but the newly released DeepSeek-V3-0324 is available under the MIT license, an open source software license. Its files total 641GB. It is an update rather than an entirely new model, and it has built-in support for FP8 quantization. Although the model has 685 billion parameters, only about 37 billion of them are active during inference, which eases the hardware requirements.
mlx-community/DeepSeek-V3-0324-4bit · Hugging Face
https://huggingface.co/mlx-community/DeepSeek-V3-0324-4bit
Machine learning researcher Awni Hannun ran a 4-bit quantized version on a Mac Studio equipped with an Apple M3 Ultra chip and 512GB of unified memory, and observed inference speeds of over 20 tokens per second.
The new Deep Seek V3 0324 in 4-bit runs at > 20 toks/sec on a 512GB M3 Ultra with mlx-lm! pic.twitter.com/wFVrFCxGS6
— Awni Hannun (@awnihannun) March 24, 2025
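As a rough sanity check on these figures, the weight footprint can be estimated from the parameter count and the number of bits stored per parameter. The sketch below is a back-of-envelope calculation, not a measurement: it uses only the numbers quoted in this article (685 billion parameters, about 37 billion active, 512GB of memory), and it assumes every weight is stored at exactly the stated bit width, ignoring quantization scales, tensors kept at higher precision, the KV cache, and runtime overhead. The names `weight_size_gb`, `TOTAL_PARAMS`, and so on are illustrative only.

```python
# Back-of-envelope estimate only. Assumes every weight is stored at exactly the
# stated bit width; ignores quantization scales, higher-precision tensors,
# the KV cache, and runtime overhead.

TOTAL_PARAMS = 685e9      # total parameters, as quoted in the article
ACTIVE_PARAMS = 37e9      # parameters active per token, as quoted in the article
UNIFIED_MEMORY_GB = 512   # Mac Studio (M3 Ultra) memory, as quoted in the article


def weight_size_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


fp8_gb = weight_size_gb(TOTAL_PARAMS, 8)      # one byte per parameter
int4_gb = weight_size_gb(TOTAL_PARAMS, 4)     # half a byte per parameter
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"FP8 weights:   ~{fp8_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
print(f"Fits in {UNIFIED_MEMORY_GB} GB: {int4_gb < UNIFIED_MEMORY_GB}")
print(f"Active share per token: {active_fraction:.1%}")
```

Under these assumptions, the 4-bit weights come to roughly 342.5 GB, which leaves headroom within 512GB, while the FP8 estimate of roughly 685 GB is in the same range as the 641GB file size reported above. Only about 5.4% of the parameters are active per token, which reduces the compute per token, but all of the weights still have to be held in memory.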
Xeophon, an AI researcher, also ran a benchmark and reported that scores jumped over the previous DeepSeek-V3 in every test. He said the model surpasses Anthropic's Claude 3.5 Sonnet and called it 'the best non-reasoning model.'
Tested the new DeepSeek V3 on my internal bench and it has a huge jump in all metrics on all tests.
It is now the best non-reasoning model, dethroning Sonnet 3.5.
Congrats @deepseek_ai ! pic.twitter.com/efEu2FQSBe
— Xeophon (@TheXeophon) March 24, 2025
News site WinBuzzer notes that because the US restricts chip exports to China, lightweight and efficient architectures like DeepSeek's are especially valuable.
in Software, Posted by logc_nt