SUT ID: NVDA250610b
STAC-AI

STAC-AI™ LANG6 on NVIDIA GH200 Grace Hopper Superchip

Type: Unaudited

Specs: STAC-AI™ LANG6

NVIDIA recently performed two STAC-AI™ LANG6 (Inference-Only) benchmark runs on a QuantaGrid S74G-2U server equipped with the GH200 Grace Hopper Superchip.

Stack under test:

Llama-3.1-70B-Instruct
STAC-AI Pack for NVIDIA TensorRT-LLM
TensorRT-LLM release v0.17.0
Hardware stack – NVIDIA GH200 Grace Hopper Superchip

This particular report is for the Llama-3.1-70B-Instruct model.

The companion report for Llama-3.1-8B-Instruct can be found here: https://www.STACresearch.com/NVDA250610a

Note: None of the results have been audited by STAC.

Premium subscribers have access to extensive visualizations of all test results, the detailed configuration information for the solutions tested, the code used in this testing, and the ability to run these same benchmarks, as is or with other models and data sets, in the privacy of their own labs. To learn about subscription options, please contact us.

