Alibaba releases new open-source AI model 'Qwen2.5-VL-32B,' with improved image analysis and mathematics capabilities

Qwen, Alibaba Cloud's AI research team, has released 'Qwen2.5-VL-32B', a new visual language model based on the 'Qwen2.5 VL' series released in January 2025. It analyzes images and recognizes their content more accurately, which improves the quality of its answers.
Qwen2.5-VL-32B: Smarter and Lighter | Qwen
https://qwenlm.github.io/blog/qwen2.5-vl-32b/

The Qwen2.5 VL series, released in January 2025, comes in three parameter sizes: 3B, 7B, and 72B. The largest, the 72B model, is reported to outperform GPT-4o and Gemini 2.0 Flash.
Alibaba's AI research team releases 'Qwen2.5 VL', a visual language model that can recognize and automatically operate the UI of PCs and smartphones, and can automatically perform airline ticket reservations with performance exceeding GPT-4o - GIGAZINE

For this release, the Qwen team built Qwen2.5-VL-32B on top of the Qwen2.5 VL model and optimized it with reinforcement learning to strengthen a range of capabilities. Below is a comparison of benchmarks measuring multimodal performance, such as image understanding. The Qwen2.5-VL-32B model, shown in red, scores better than models of comparable size, such as Mistral Small 3.1-24B and Gemma 3-27B-IT, and beats the Qwen2-VL-72B model, which has more than twice as many parameters, on many metrics.

The benchmark results for pure text are as follows. Here too, it outperformed models of comparable size on many benchmarks.

The Qwen team's blog also shows examples of Qwen2.5-VL-32B actually solving problems. Below is an image showing speed limits, given to the model along with the prompt, 'You are driving a large truck and it is 12 o'clock now. Can you arrive 110 km away by 13:00?'

The Qwen2.5-VL-32B was able to read the truck's speed limit from the image and correctly answer 'No.'
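The reasoning behind that answer is simple arithmetic: covering 110 km in the one hour between 12:00 and 13:00 requires an average speed of 110 km/h, so the trip is only possible if the truck's speed limit is at least that high. The following is a minimal Python sketch of that check. The article does not state the limit shown in the image, so the 80 km/h value and the function name are hypothetical and used only for illustration.

```python
def can_arrive_in_time(distance_km, hours_available, speed_limit_kmh):
    """Return True if the distance can be covered without exceeding the limit."""
    # Average speed needed to cover the distance in the time available.
    required_speed_kmh = distance_km / hours_available
    return required_speed_kmh <= speed_limit_kmh


# Hypothetical truck limit of 80 km/h; the real value is read from the image.
print(can_arrive_in_time(distance_km=110, hours_available=1, speed_limit_kmh=80))
```

Any truck limit below 110 km/h leads to the same conclusion, which is why reading the correct limit for trucks from the image is the key step.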

Below is an example of finding the general formula for the area of the square formed by connecting the trisection points on each side of a square.

Qwen2.5-VL-32B answered this correctly as well. Its math skills have improved compared with the earlier Qwen2.5 VL series.
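The geometry behind the square example can be checked numerically. The article does not reproduce the exact problem statement, so the sketch below assumes one interpretation: each vertex of the new square sits one third of the way along a side of the original square, taken in order around it. Under that assumption the new square has 5/9 of the original area. The function name is hypothetical.

```python
from fractions import Fraction


def inner_square_area(side):
    """Area of the square whose vertices lie one third along each side."""
    a = Fraction(side)
    third = a / 3
    # One vertex per side of the outer square, listed counterclockwise.
    pts = [
        (third, Fraction(0)),
        (a, third),
        (a - third, a),
        (Fraction(0), a - third),
    ]
    # Shoelace formula: sum the cross products of consecutive vertices.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1])
    )
    return abs(twice_area) / 2


# Ratio of the inner area to the outer area for a unit square.
print(inner_square_area(1))
```

If the construction is repeated on each new square, the area shrinks by the same factor each time, so under this interpretation the n-th square would have area (5/9)^n times the original.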

Qwen2.5-VL-32B is released under the open-source Apache License 2.0 and is free to use, including for commercial purposes. The code for using the Qwen2.5 series is also released under the Apache License 2.0, so check it out if you're interested.
in Software, Posted by log1d_ts