STAC-ML™ Markets (Inference) Naive Implementation with ONNX on an Azure D64plds v5 VM (64 Ampere® Altra® vCPUs, 128 GiB memory) Throughput-Optimized Configuration

STAC-ML™ Markets (Inference) Benchmarks (Sumaco suite)

  • STAC-ML Markets (Inference) Naive Implementation (Compatibility Rev B)
  • Driver and Inference Engine
    • Python 3.8.10
    • ONNX Runtime 1.12.1
    • NumPy 1.23.3
  • Ubuntu Linux 20.04.5 LTS
    • Based on a standard image provided by Microsoft® Azure
    • No OS tuning performed
  • A Microsoft® Azure Standard D64plds v5 VM
    • 64 Ampere® Altra® vCPUs @ 3.0 GHz
    • 128 GiB of memory
    • 256 GiB Premium SSD LRS

Though no vendors had a hand in optimizing the system's performance, one vendor did help make the project happen: Microsoft provided credits in Azure so that this research could be completed. We are grateful for their help.

This report is just one in a series that explores latency and throughput optimization of ML inference workloads across different processor architectures in Microsoft Azure, all under similar software stacks. Together, these STAC Reports illustrate the kinds of insights STAC-ML benchmarks can provide while underscoring the sensitivity of performance results to the objectives of the solution architect.

The full set of reports in this series also includes:

A research note that compares the systems under test (SUTs), details their performance differences, and explores the latency-throughput-cost trade-offs is also available.

The use of machine learning (ML) to develop models is now commonplace in trading and investment. Whether the business imperative is reducing time to market for new algorithms, improving model quality, or reducing costs, financial firms have to offload major aspects of model development to machines in order to continue competing in the markets.