DeepSeek releases 'DeepSeek-R1-0528', an open AI model with performance comparable to o4-mini



Chinese AI company DeepSeek announced on the Chinese social media platform WeChat that it has released 'DeepSeek-R1-0528', a minor update to its reasoning AI model DeepSeek-R1. The Hugging Face repository contains no description of the model, only the configuration files and weights, the internal components that define the model's behavior.

deepseek-ai/DeepSeek-R1-0528 · Hugging Face
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
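Because the repository ships only configuration files and weights, loading it follows the standard Hugging Face transformers flow. Below is a minimal sketch, assuming the checkpoint loads through AutoModelForCausalLM with trust_remote_code; the prompt and generation settings are hypothetical, and the full 685-billion-parameter model realistically requires a multi-GPU cluster:

```python
# Minimal sketch: loading DeepSeek-R1-0528 with Hugging Face transformers.
# Note: at ~685B parameters this needs serious multi-GPU hardware; the code
# only illustrates the standard API, not a practical single-machine setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # shard the weights across available GPUs
    trust_remote_code=True,  # the repo provides configs/weights, no model card
)

# Hypothetical prompt, just to show the chat-style interface.
messages = [{"role": "user", "content": "Summarize the Transformer architecture."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```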



The minor update 'DeepSeek-R1-0528' has 685 billion parameters, making it slightly larger than the original R1. The update mainly improves reasoning capabilities, with reported features such as 'deep reasoning comparable to Google's models,' 'improved code generation,' 'a unique reasoning style that is not only fast but also deliberate,' and 'long thinking sessions of up to 30 to 60 minutes per task.'



DeepSeek-R1-0528 has already appeared on LiveCodeBench, a benchmark covering a wide range of coding tasks including code generation and repair, code execution, and output prediction. On the leaderboard covering problems from August 1, 2024 to May 1, 2025, DeepSeek-R1-0528 ranks fourth at the time of writing, with a score showing performance almost on par with OpenAI's o4-mini (medium).



Below is a video of DeepSeek-R1-0528 reading and summarizing the paper 'Attention Is All You Need,' which introduced the Transformer architecture.



DeepSeek-R1-0528 is released under the MIT license, and anyone can obtain the model data for free.
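For reference, fetching the openly licensed weights can be done with the huggingface_hub library. This is a minimal sketch; the local directory name is an arbitrary choice, and the download amounts to hundreds of gigabytes:

```python
# Minimal sketch: downloading the MIT-licensed model files from Hugging Face.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    local_dir="DeepSeek-R1-0528",  # hypothetical target directory
)
print(f"Model files downloaded to {local_dir}")
```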

Continued
Speculations are rife that the Chinese-made high-performance AI model 'DeepSeek-R1-0528' may have been distilled using Google's AI 'Gemini' - GIGAZINE



in Software, Posted by log1i_yk