Vault Report: STAC-AI™ LANG6 on NVIDIA GB200 Grace Blackwell

NVIDIA publishes unaudited STAC-AI™ inference benchmark results for GB200

30 July 2025

NVIDIA recently performed two STAC-AI™ LANG6 (Inference-Only) benchmark runs on a GB200 Grace Blackwell system.

The Stack Under Test (SUT) was a Nebius 4xGB200 VM server based on the NVIDIA GB200 Grace Blackwell Superchip. The VM occupied a single node of a GB200 NVL72 system, with 4 Blackwell GPUs and 2 Grace CPUs. Separate tests were performed with the Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct models.

Note that STAC has not audited these reports; NVIDIA is solely responsible for these results.

The EDGAR4a/b Data Sets used in the benchmarks model a Retrieval-Augmented Generation (RAG) workload based on EDGAR securities filings, with a median initial context size of approximately 1,200 words. The EDGAR5a Data Set represents question answering against an entire EDGAR 10-K filing, with a median initial context size of 44,000 words.

https://www.STACresearch.com/NVDA250714a (Llama-3.1-8B-Instruct)
https://www.STACresearch.com/NVDA250714b (Llama-3.1-70B-Instruct)

Premium subscribers have access to extensive visualizations of all test results, the detailed configuration information for the solutions tested, the code used in this testing, and the ability to run these same benchmarks, as is or with other models and data sets, in the privacy of their own labs. To learn about subscription options, please contact us.

