Gemini 3 Pro delivers cutting-edge performance in document, spatial, screen, and video understanding



Google DeepMind has published a post stating that Gemini 3 Pro, which was released in November 2025, delivers cutting-edge performance in understanding documents, space, screens, and video.

Gemini 3 Pro: the frontier of vision AI

https://blog.google/technology/developers/gemini-3-pro-vision/

Rohan Doshi, product manager at Google DeepMind, described Gemini 3 Pro as 'our most powerful multimodal model to date, delivering cutting-edge performance across document understanding, spatial understanding, screen understanding, and video understanding.'

First, in the area of 'document understanding,' Gemini 3 Pro is shown reconstructing hard-to-read handwritten text, nested table structures, complex mathematical notation, and non-linear layouts as structured code in HTML, LaTeX, and Markdown.

A reconstruction of a handbook left behind by an 18th century merchant.



Reading handwritten mathematical formulas.



This is the polar area chart drawn by Florence Nightingale.



In the area of 'spatial understanding,' Gemini 3 Pro is said to be able to identify objects and the intent associated with them.



A demo video has been released for the 'screen understanding' section, showing that Gemini 3 Pro understands the UI on the PC screen.

Gemini 3 Pro: Screen Understanding Demo - YouTube


Doshi said Gemini 3 Pro is a particular 'leap forward' in 'video understanding.' Processing video at 10 FPS allows it to analyze fast motion such as the mechanics of golf and tennis swings. And with video reasoning in 'thinking' mode, it can identify not only what is happening but also 'why it's happening.'
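To give a rough sense of why the sampling rate matters for fast motion, here is a minimal sketch that counts how many frames fall within a short clip at a given rate. The function name `sample_timestamps`, the 1.5-second swing duration, and the 1 FPS comparison rate are assumptions chosen for illustration. They do not come from Google's post, and the sketch does not describe how Gemini 3 Pro handles video internally.

```python
import math


def sample_timestamps(duration_s: float, fps: float) -> list[float]:
    """Return the timestamps (in seconds) of evenly spaced frames in a clip.

    Hypothetical helper for illustration only.
    """
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be >= 0 and fps must be > 0")
    # Small epsilon guards against floating-point error in duration_s * fps.
    frame_count = math.floor(duration_s * fps + 1e-9)
    return [round(i / fps, 3) for i in range(frame_count)]


# Assumed example: a swing lasting about 1.5 seconds.
swing_duration_s = 1.5

frames_at_10_fps = sample_timestamps(swing_duration_s, 10)
frames_at_1_fps = sample_timestamps(swing_duration_s, 1)

print(len(frames_at_10_fps), "frames at 10 FPS")
print(len(frames_at_1_fps), "frame(s) at 1 FPS")
```

Under these assumptions, the higher rate yields many more frames spread across the swing, while the lower rate captures almost none of the motion.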


Gemini 3 Pro reportedly achieved a score of 54% on the ARC-AGI-2 benchmark, which measures the abstract reasoning capabilities of AI models. Its cost per task is $31 (approximately ¥4,800), which is higher than that of other AI models, but it delivers overwhelmingly high performance. By comparison, OpenAI's GPT-5, which has a large market share among AI models, comes in at a low cost of just under $1 (approximately ¥156) per task.



in AI, Video, Posted by logc_nt