Here are some initial benchmarks of the Grace CPU's performance; the Hopper GPU benchmarks will come in a follow-up article.

NVIDIA’s GH200 combines the 72-core Grace CPU with an H100 Tensor Core GPU and supports up to 480GB of LPDDR5X memory alongside either 96GB of HBM3 or 144GB of HBM3e memory. The Grace CPU employs Arm Neoverse-V2 cores with 1MB of L2 cache per core and 117MB of L3 cache.

On a geometric mean basis across all the benchmarks conducted, the GH200 Grace CPU's performance nearly matched that of the Intel Xeon Platinum 8592+ Emerald Rapids processor. The Arm Neoverse-V2-based Grace CPU also tended to be much faster than the 128-core Ampere Altra Max AArch64 server.
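For readers unfamiliar with the geometric mean, here is a minimal Python sketch of how such a summary figure can be computed. The function name `geometric_mean` and the sample ratios are purely illustrative assumptions, not data or tooling from this testing, and the sketch assumes every result has already been normalized so that higher is better.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    # Averaging the logarithms avoids overflow when multiplying many results.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark ratios of system A to system B (made-up numbers).
# A is twice as fast in one test and half as fast in the other.
ratios = [2.0, 0.5]
print(geometric_mean(ratios))
```

Unlike an arithmetic mean, which would report 1.25 for these two ratios, the geometric mean treats a 2x win and a 2x loss as cancelling out, which is why it is commonly used to summarize mixed benchmark suites. Python 3.8 and newer also ship `statistics.geometric_mean` for the same purpose.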

Overall, benchmarking the NVIDIA GH200's Grace CPU was quite fascinating and showed its early potential. Some workloads are still not well optimized for AArch64, and in some cases the higher core counts and dual-socket configurations available with Intel Xeon Emerald Rapids and AMD EPYC Genoa(X) / Bergamo could drive those platforms' results much higher.