NVIDIA's Blackwell Ultra (GB300 NVL72) is up to 50 times faster than the H200 for AI processing, at 1/35 the cost

NVIDIA's high-performance GPUs are essential for AI development and product deployment. A recent article on the official NVIDIA blog highlighted the GB300 NVL72, NVIDIA's most advanced GPU system at the time of writing, claiming it delivers up to 50x the performance of the H200 at 1/35 the cost.
New Data Shows NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Costs for Agentic AI | NVIDIA Blog
The GB300 NVL72 is an AI processing system sold as a rack for AI factories, equipped with 72 NVIDIA Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs. It has already been adopted by AI cloud companies such as Microsoft and CoreWeave, and is being used for workloads that handle large amounts of context, such as agentic coding.

The GB300 NVL72 builds on the kernel optimizations and other work accumulated for its predecessor, the GB200 NVL72, to get the most out of its computing power. It combines high performance with low cost, and can run workloads more cheaply than the GB200 NVL72.

The GB300 NVL72 can also run models quantized to FP4, making it significantly faster than Hopper-generation GPUs, which do not support FP4. In a comparison between the FP8 version of DeepSeek-R1 running on an H200 and the NVFP4 version of DeepSeek-R1 running on a GB300 NVL72, the GB300 NVL72 processed up to 50 times as many tokens.

What's more, the cost per token is 1/35 that of the H200.
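The two headline figures can be related with simple arithmetic. The following is a minimal sketch, not NVIDIA's methodology: it assumes the cost figure is a cost per token, that both ratios are measured on the same basis, and that cost per token is simply operating cost divided by throughput. The function names and the baseline numbers (10.0 per hour, 1,000 tokens per second) are hypothetical and chosen only for illustration.

```python
def cost_per_million_tokens(cost_per_hour: float, tokens_per_second: float) -> float:
    """Cost to generate one million tokens at a steady throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return cost_per_hour / tokens_per_hour * 1_000_000


def implied_cost_rate_ratio(throughput_gain: float, cost_reduction: float) -> float:
    """Hourly cost ratio implied by a throughput gain and a cost-per-token reduction."""
    return throughput_gain / cost_reduction


# Hypothetical baseline, NOT NVIDIA data: 10.0 per hour at 1,000 tokens/s.
baseline = cost_per_million_tokens(cost_per_hour=10.0, tokens_per_second=1_000)

# Headline ratios from the article: 50x the tokens, 1/35 the cost.
ratio = implied_cost_rate_ratio(throughput_gain=50, cost_reduction=35)
newer = cost_per_million_tokens(
    cost_per_hour=10.0 * ratio,        # newer system assumed to cost more per hour
    tokens_per_second=1_000 * 50,      # and to process 50x the tokens
)

print(f"baseline: {baseline:.4f} per million tokens")
print(f"newer:    {newer:.4f} per million tokens")
print(f"implied hourly cost ratio: {ratio:.2f}x")
print(f"cost-per-token reduction:  {baseline / newer:.1f}x")
```

Under these assumptions, a 50x throughput gain paired with a 35x cost reduction would mean the newer system costs roughly 1.4 times as much to run per hour, which is one possible reading of why the two figures differ.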

NVIDIA plans to begin shipping GPUs based on its next-generation architecture, Rubin, in the second half of 2026. The company says Rubin-generation GPUs will deliver up to 10 times the throughput per 100 megawatts of Blackwell-generation GPUs, cutting the cost per million tokens to one-tenth. It also claims that the number of GPUs required to train MoE models will fall to one-quarter of what Blackwell-generation GPUs need.