In the second quarter of 2014, Intel submitted several systems for testing with the STAC-A2 Benchmarks. This page provides a guide to those results.
(At the STAC Summit in New York, Intel will explain how they re-designed their algorithm implementation to achieve the results below.)
STAC-A2 is the user-developed benchmark standard based on financial market risk analysis. Developed by quants and technologists from some of the world's largest banks, STAC-A2 reports the performance, scaling, quality, and resource-efficiency of any technology stack that is able to handle the workload (Monte Carlo estimation of Heston-based Greeks for a path-dependent, multi-asset option with early exercise).
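For readers new to this kind of workload, the following is a minimal sketch of the underlying idea: simulate Heston paths by Monte Carlo, price an option, and estimate a Greek by bump-and-revalue. It is deliberately much simpler than the STAC-A2 workload (one asset, a European call, no path dependence or early exercise), and it is not taken from the STAC-A2 specification or from any vendor implementation. The function names, parameters, and the full-truncation Euler scheme are all assumptions chosen for illustration.

```python
import math
import random


def heston_call_price(s0, strike, v0, kappa, theta, xi, rho, r, t,
                      steps=50, paths=2000, seed=0):
    """Monte Carlo price of a European call under Heston dynamics.

    Illustrative only: uses a full-truncation Euler scheme on one asset.
    """
    rng = random.Random(seed)  # fixed seed: bumped runs reuse the same draws
    dt = t / steps
    sqrt_dt = math.sqrt(dt)
    rho_c = math.sqrt(1.0 - rho * rho)
    payoff_sum = 0.0
    for _ in range(paths):
        log_s, v = math.log(s0), v0
        for _ in range(steps):
            z1 = rng.gauss(0.0, 1.0)
            # Second shock, correlated with the first by rho
            z2 = rho * z1 + rho_c * rng.gauss(0.0, 1.0)
            v_pos = max(v, 0.0)  # truncate negative variance at zero
            log_s += (r - 0.5 * v_pos) * dt + math.sqrt(v_pos) * sqrt_dt * z1
            v += kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos) * sqrt_dt * z2
        payoff_sum += max(math.exp(log_s) - strike, 0.0)
    return math.exp(-r * t) * payoff_sum / paths


def heston_call_delta(s0, bump=0.01, **params):
    """Central finite-difference delta; both bumps share the same seed."""
    h = s0 * bump
    up = heston_call_price(s0 + h, **params)
    down = heston_call_price(s0 - h, **params)
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the sketch
    params = dict(strike=100.0, v0=0.04, kappa=1.5, theta=0.04, xi=0.3,
                  rho=-0.7, r=0.01, t=1.0, steps=20, paths=500)
    print("price:", heston_call_price(100.0, **params))
    print("delta:", heston_call_delta(100.0, **params))
```

Each Greek costs at least one extra full revaluation, and the cost grows with the number of paths and assets, which is why paths and assets are the natural dimensions along which a benchmark of this kind measures capacity.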
Those of you who follow STAC-A2 will know that over the previous several quarters, we have used it to test several configurations involving CPUs, GPUs, and Intel's Xeon Phi co-processor (some results are public, and some are in the STAC Vault). In Q2 of this year, Intel re-optimized its benchmark implementation a couple of times (qualified STAC members can access these code revisions). This quarter we tested those revised implementations on several systems (listed in reverse chronological order of testing):
1. Xeon Phi with 2 Ivy Bridge. An Intel white box with two Ivy Bridge processors complemented by one Xeon Phi co-processor card. This is the second stack we have tested where the implementation code spreads the workload across both CPUs and accelerator cards. The first used 2 CPUs and 2 GPUs. Compared to that stack, this one with a single Xeon Phi exhibited faster execution of the end-to-end Greeks benchmark in warm runs (STAC-A2.beta2.GREEKS.TIME.WARM) and 27% higher max assets (STAC-A2.beta2.GREEKS.MAX_ASSETS). Access the report here:
2. 4-socket "Ivy Bridge EX". Another white box, containing four Ivy Bridge processors. These were the first STAC-A2 tests of Ivy Bridge EX, and the system set new records. It was the fastest in the end-to-end Greeks benchmark (STAC-A2.beta2.GREEKS.TIME) of any system published to date, beating the average speed of the next fastest system (which used GPUs) by 34%. In the capacity tests, this system also handled 63% more assets and 58% more paths than the GPU-based system, which held the previous record. (Assets and paths are key dimensions of the problem size.) Access the STAC Report here:
3. 2-socket "Ivy Bridge EP". A white box server with just two Ivy Bridge processors. The results were significantly better than those reported on a similar (but not identical) 2-socket system we tested last year: over 4x speedup for the average end-to-end Greeks benchmark, over 2x the paths capacity, and over 62% more assets capacity.
Qualified STAC members may access the detailed STAC Configuration Disclosures and micro-detailed SOS reports for the systems above at the following links*:
* A subscriber from a firm with a premium membership in the STAC Benchmark Council should automatically have access to the STAC Configuration Disclosure (make sure you are logged in). If you get an access-denied message and believe you are entitled to these documents, or if you'd like to ask about premium subscription options for your firm, please contact us. An Observer Member firm that has not already received a complimentary report may request access to the STAC Configuration Disclosure for one of the three systems: the Phi-based system, the 4-socket system, or the 2-socket system.