Dec 18, 2025 11:13:00

Google releases Gemini 3 Flash, which maintains the advanced inference capabilities of Gemini 3 Pro while being faster and about a quarter the price

Google released the latest model in the Gemini 3 model family, Gemini 3 Flash , on December 18, 2025, which emphasizes speed and efficiency. The greatest feature of this model is that it maintains the advanced inference capabilities of

the Gemini 3 Pro while achieving the low latency and low cost that are characteristic of the Flash series.

Introducing Gemini 3 Flash: Benchmarks, global availability
https://blog.google/products/gemini/gemini-3-flash/

AI Mode update: Gemini 3 Flash, Nano Banana Pro
https://blog.google/products/search/google-ai-mode-update-gemini-3-flash/

Build with Gemini 3 Flash: frontier intelligence that scales with you
https://blog.google/technology/developers/build-with-gemini-3-flash/

Google DeepMind has released a movie comparing SVG image generation and web application coding using Gemini 3 Flash (left) and the previous generation model, Gemini 2.5 Flash (right). Looking at this, we can see that Gemini 3 Flash has clearly improved speed and accuracy from the previous generation.

Watch how Gemini 3 Flash compares to 2.5 Pro when generating SVG code and simple web apps. ↓ pic.twitter.com/4C2Nino7Eg
— Google DeepMind (@GoogleDeepMind) December 17, 2025

Below, we give the same prompts to Gemini 3 Flash (left) and Gemini 3 Pro (right) to code animations, 3D models, and weather apps. In each case, Gemini 3 Flash completes the tasks faster.

Only 29 days after Pro: Gemini 3 Flash is just as smart (1477 LMSYS Elo), 4x cheaper, and sooo much faster! ⚡ pic.twitter.com/lOTpMGIomn
— Ali Eslami (@arkitus) December 17, 2025

Gemini 3 Flash features a dynamic thinking function that adjusts the thinking time according to the complexity of the processing, allowing users to control the balance between quality, cost, and latency in four stages using a parameter called 'thinking_level.' In addition, even in multimodal processing, users can select the media resolution for image and video analysis, supporting tasks that require reading fine text or detailed identification. While Gemini 3 Flash does not have image generation capabilities on its own, image generation is possible by using Gemini 3 Pro Images (Nano Banana Pro).

Gemini 3 Flash also demonstrated superior performance in benchmark results, scoring 90.4% on

GPQA Diamond , which measures PhD-level reasoning ability, and 81.2% on MMMU Pro, which measures multimodal understanding, on par with Gemini 3 Pro and surpassing GPT-5.2 's 79.5%.

Gemini 3 Flash is rolling out to developers now ⚡

3 Flash is our latest model with frontier intelligence built for speed. With strong multimodal, coding and agentic features, it not only enables everyday tasks with improved reasoning, but also is our most impressive model for… pic.twitter.com/woWX0ZFFys
— Google (@Google) December 17, 2025

In particular, on the Humanity's Last Exam (HLE), which tests highly specialized knowledge, it achieved a score of 33.7% without using any tools, nearly three times higher than the previous model and approaching the 34.5% achieved by GPT-5.2. On the SWE-bench Verified test, which evaluates coding ability, it achieved a score of 78.0%, surpassing the 77.2% achieved by the Gemini 2.5 series and Claude 4.5, and even the 76.2% achieved by Gemini 3 Pro. Furthermore, on the SimpleQA Verified test , which measures general knowledge accuracy, it achieved a score of 68.7%, significantly surpassing the 38.0% achieved by GPT-5.2 and the 29.3% achieved by Claude 4.5, demonstrating a dramatic improvement in answer accuracy.

In terms of safety and efficiency, the platform is trained using

JAX and ML Pathways on Google's proprietary TPUs, ensuring sustainable operation. Compared to Gemini 2.5 Flash, the platform boasts improved objectivity in safety assessment and tone, and a 10.4% reduction in unwarranted prompt rejections.

Gemini 3 Flash supports an input context window of 1 million tokens and an output of up to 64,000 tokens, and API pricing is set at $0.50 (approximately 77.7 yen) per million input tokens and $3.00 (approximately 466 yen) per million output tokens. This is an increase compared to Gemini 2.5 Flash's $0.30 (approximately 46.6 yen) input/$2.50 (approximately 389 yen) output, but it is significantly cheaper than Gemini 3 Pro's $2.00 (approximately 311 yen) input/$12.00 (approximately 1870 yen) output.

The graph below visualizes the balance between performance (horizontal axis) and cost (vertical axis), showing that Gemini 3 Flash maintains performance exceeding that of Gemini 2.5 Pro while simultaneously achieving three times the response speed and overwhelmingly lower cost. Google claims that by adding the intelligence of Gemini 3 Pro to the low latency strength of the previous Flash series, it has dramatically improved the trade-off between quality and efficiency, establishing a new industry standard.

At the time of writing, Gemini 3 Flash is available in preview through Google AI Studio, the AI mode in Google Search, and the Gemini app. Regular users can use it, including the free plan. Paid subscription users of the Pro and Ultra plans are subject to significantly higher usage limits than free users.

Developers can also preview Gemini through Google AI Studio's Gemini API, Gemini CLI, Android Studio, and the new agent development platform, Google Antigravity.

Related Posts:

Dec 18, 2025 11:13:00 in AI, Video, Software, Web Service, Web Application, Posted by log1i_yk