NVIDIA's 'Vera' CPU for AI delivers powerful performance that surpasses AMD's EPYC and Intel's Xeon.

Phoronix, a Linux-focused media outlet, was invited by NVIDIA to conduct benchmark tests on 'Vera' and has published the results.
NVIDIA Vera CPU Benchmarks: Olympus Cores Delivering The Best Performance Ever Seen On ARM Review - Phoronix
https://www.phoronix.com/review/nvidia-vera-benchmarks
Phoronix was invited by NVIDIA to their Santa Clara headquarters and conducted benchmark tests with the understanding that the results would be made public.
Vera is a CPU designed for AI and HPC (High-Performance Computing). Unlike the previous-generation NVIDIA Grace, which used Arm's Neoverse V2 core, Vera employs NVIDIA's proprietary 'Olympus' core design. Vera features 88 Olympus cores, runs a total of 176 threads through spatial multithreading, supports FP8 precision, and, combined with LPDDR5X memory, provides memory bandwidth of up to 1.2TB/s. Compared to Grace, Vera also has twice the L2 cache at 2MB per core, a larger integrated L3 cache of 164MB, and support for PCIe Gen 6 and CXL 3.1 connectivity.
Vera is available through the 'Vera Rubin NVL72' data center rack, which features 72 Rubin GPUs and 36 Vera CPUs, as well as in a standalone CPU rack.
For this benchmark test, Phoronix used a fully configured Vera with 88 cores/176 threads and eight 96GB LPDDR5X-9600MT/s memory modules. The peak TDP (Thermal Design Power) was 450W, and memory power consumption was approximately 50W or less.
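The figures quoted above imply a few totals that the article does not spell out. The following is a minimal sketch of that arithmetic, using only the numbers stated in the article; the constant and function names are illustrative, and the bandwidth-per-core value assumes 1TB = 1000GB and an even split of peak bandwidth across cores, which is a simplification rather than a measured result.

```python
# Figures as quoted in the article (names here are illustrative only).
CORES = 88
THREADS = 176
L2_PER_CORE_MB = 2
MEM_MODULES = 8
MEM_MODULE_GB = 96
PEAK_BANDWIDTH_TBPS = 1.2


def derived_figures() -> dict:
    """Return totals implied by the quoted specifications."""
    return {
        # 176 threads over 88 cores
        "threads_per_core": THREADS // CORES,
        # 2MB of L2 per core across all cores
        "total_l2_mb": CORES * L2_PER_CORE_MB,
        # eight 96GB modules in the tested system
        "total_memory_gb": MEM_MODULES * MEM_MODULE_GB,
        # Assumption: 1TB = 1000GB, bandwidth divided evenly per core
        "bandwidth_per_core_gbps": PEAK_BANDWIDTH_TBPS * 1000 / CORES,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Under these assumptions, the tested system works out to two threads per core, 176MB of L2 cache in total, and 768GB of memory.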
In code compilation benchmark tests, Grace was the slowest among the tested processors, while Vera delivered performance nearly equivalent to a dual-socket configuration of AMD's flagship 5.0GHz EPYC 9575F processor. Among the single-socket CPUs tested, Vera was the fastest.

When comparing performance per core during Gem5 compilation, Vera was positioned between the EPYC 9575F and the EPYC 9475F.

The Node.js compilation result was the most impressive: Vera compiled the large codebase in less than half the time it took Grace.

Vera achieves the highest compilation performance per core in Node.js, tied with the 5.0GHz EPYC 9575F.

In a test compiling the Linux kernel, Vera completed the compilation in just 20 seconds, the fastest time among all the processors tested.

In an allmodconfig x86_64 kernel build with all modules enabled, Vera lagged slightly behind the dual-socket AMD EPYC 9575F and 9755 processors, which have higher core/thread counts, but it was the fastest among the single-socket solutions tested.

Looking at build performance per core, Vera's Olympus core achieved the fastest build times.
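The article does not say how the per-core figures were calculated. One plausible way to normalize a build time by core count is to multiply the two, giving "core-seconds", where lower is better. The sketch below illustrates that idea only; the function names are hypothetical, and this is not presented as the method Phoronix used.

```python
def core_seconds(build_seconds: float, cores: int) -> float:
    """Build time scaled by core count; lower means more work done per core.

    Assumption: this normalization is illustrative and may differ from
    the per-core metric used in the original review.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return build_seconds * cores


def rank_by_core_seconds(results: dict) -> list:
    """Sort CPU names from best to worst per-core result.

    `results` maps a CPU name to a (build_seconds, cores) tuple.
    """
    return sorted(results, key=lambda name: core_seconds(*results[name]))


if __name__ == "__main__":
    # Vera's figures as quoted in the article: a 20-second kernel build on 88 cores.
    print(core_seconds(20, 88))
```

With the figures quoted in the article, an approximately 20-second kernel build on 88 cores corresponds to about 1760 core-seconds. Comparing other processors this way would require their build times and core counts from the original review.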

Other results are summarized across the remaining pages of the Phoronix review linked above.

NVIDIA stated, 'Tests of the single-socket Vera with Phoronix demonstrated that this CPU delivers outstanding performance under conditions of a 450W TDP and less than 30W of memory power. We also confirmed generational performance improvements across a wide range of workloads, from code compilation, file compression, and video transcoding to Python, Java, and database management—tasks that are typically CPU-intensive and run by agents and AI factories.'
in Hardware, Posted by log1p_kr







