Global STAC Live, Fall 2020

STAC Summits bring together CTOs and other industry leaders responsible for solution architecture, infrastructure engineering, application development, machine learning/deep learning engineering, data engineering, and operational intelligence to discuss important technical challenges in trading and investment.

In fall of 2020 we again combined our New York, Chicago, and London events into a massive virtual event for the worldwide STAC Community.

Agenda

Click on the session titles to view the recording and slides.

Technology for time series analytics: The view from 2020

For decades, the finance industry has excelled at applying the latest computing advances to time series data analysis. The technical challenges to be mastered include capturing realtime data, serving it up as query responses or as datastreams, running increasingly advanced analytics on it, and providing a variety of users with displays and APIs that facilitate insight. What do today’s emerging technologies mean for time series analysis? What influence can or should we expect from the ML ecosystem, the latest visualization tools, the open source community, public clouds, and the lower end of the stack (memory, storage, processors, networks)? What business forces are driving a need for new technology? Could any of these innovations expand what business users think is possible? Sessions throughout Global STAC Live will present opinions and evidence on these questions. Our panel of thought leaders will kick it all off by providing their perspectives. Come to hear their thoughts and get your own questions into the mix.

Realtime data visualization

Businesses around the world spend billions of dollars each year on analytic tools, and many of these tools have sophisticated visualization capabilities. But most of them do not cater to the needs of trading and other fast-moving businesses. Realtime decision making requires realtime visualization, and most tools simply aren't designed for that. Yet there are some vendors answering the call, as well as components that promise to make it easier to build solutions. What are the challenges inherent in supporting users who need to visualize live streaming data or grok the state (right now!) of millions of datapoints? What technical capabilities are required in order to keep the data from overwhelming users? What role can AI play in creating useful realtime visualizations? What's possible today that wasn't before, and why? What should a firm consider when deciding whether to build or buy a visualization solution? Come to hear points of view from our panel of experts and put your own questions to them. To kick off, some of the panelists will provide a short presentation:

"Speed of Thought Analytics"
Larry Wienszczak, Business Development Director, Brytlyt ltd.

"Streaming analytics & ingest considerations for realtime visualizations"
Ferenc Bodon, Cloud Solutions Architect, Kx

"Building modern data systems with Deephaven"
Peter Goddard, CEO, Deephaven Data Labs

STAC Update: Historical time series stacks

Michel Debiche, Director of Analytics Research, STAC

Michel will present the latest benchmark results from software/hardware stacks for historical time series analytics.

Innovation Roundup

"All Flash To Make A.I. Fly and To Make Streaming Screaming: A Quick VAST Data Update"
Jeff Denworth, CMO and Co-Founder, VAST Data

"DDN: Data Acceleration Without Compromise"
James Coomer, Sr. Vice President Products, DDN Storage

"Computing at the Hot Edge"
Keith Manthey, CTO, Dell EMC

Rigorous benchmarks for streaming time series stacks.

Peter Nabicht, Head of Strategy, STAC

For over a decade, financial firms and tech vendors have used three STAC-M3 Benchmark suites to assess software and hardware stacks used to analyze historical market data. This quarter, STAC will release an additional STAC-M3 suite that assesses such stacks against streaming market data. In this talk, Peter will use preliminary results from two open source time series databases to discuss the motivations for these benchmarks, how they work, and the insights they can provide. He will also touch on how the benchmark framework and test harness software provide a basis for benchmarks of other streaming use cases, such as data center IT telemetry and industrial IOT.

Beam, streams, and finance

Neil Stevenson, Principal Solutions Architect, Hazelcast

Apache Beam is an open source, multi-language SDK for workflows that need to combine streaming and non-streaming content, with plugins (“runners”) for numerous data-processing systems. It offers a single abstraction that, in theory, protects applications from underlying API changes and enables an organization to swap out data-processing systems without requiring application rewrites. In this talk, Neil will review how Beam works and offer an assessment of its strengths and weaknesses from the perspective of a solution architect. He will also provide opinions on what to consider when choosing a Beam runner, with particular attention to finance use cases that require a high degree of parallelism and can benefit from a streaming approach, such as CVA.

How Uber tackled massive telemetry

Martin Mao, CEO, Chronosphere (former Technical Lead at Uber)

One of the most common time series workloads today is telemetry analytics for IT operations, and the workload is becoming more challenging every day. Systems built to monitor applications, servers, and network devices now have to contend with data from an increasing number of VMs, containers, microservices, and remote devices. At the same time, analytic demands are also increasing, as firms strive for better automation and "AIOps". Uber has faced these challenges at massive scale, needing realtime visibility into billions of production metrics, with hundreds of millions of metric updates per second and thousands of queries fetching billions of datapoints each second. Martin helped lead the team that created the telemetry system Uber built to meet this challenge. In this talk, he will describe how Uber's telemetry stack evolved over time, and how key aspects of the current architecture are designed to respond to business requirements for the foreseeable future.

An FPGA primer for software developers

Dr Matthew Grosvenor, Principal Engineer, Cisco

As FPGAs become part of the standard kit for an expanding range of low-latency, high-throughput applications, more and more software developers find themselves wanting (or needing) to develop FPGA logic to stay relevant. But circuit-based hardware is a whole new universe to someone with a career spent in an instruction-based world. Can a software developer understand how to program FPGA? Matthew thinks so. He has spent over a decade working in both worlds and thinks the key to FPGA is understanding some basic concepts. In this talk, Matt will offer software developers a jumping-off point by explaining how FPGAs work, how to make sense of FPGA products, and how FPGA programming languages reflect the underlying hardware concepts.

Innovation Roundup

"Kickstart your journey to trading in hardware"
Alastair Richardson, Director, Global Business Development, Xilinx

"How to Optimize an OpenCL Kernel Using Silexica's SLX FPGA"
Jordon Inkeles, VP Product, Silexica GmbH

"Instant Documentation of your FPGA design"
Bart Brosens, Application Engineer, Sigasi

STAC Fast Data Update

Peter Nabicht, Head of Strategy, STAC

Peter will announce the industry’s first FPGA Special Interest Group, a group of trading firms that will focus on common challenges in FPGA development, testing, and deployment.

Innovation Roundup

"AccuCore™ HCF (Hollow-Core Fiber) Gives High-Frequency Traders an Edge"
Daryl Inniss, Director, OFS Fitel, LLC

"Time at your service"
Francisco Girela López, Americas Tech Responsible, Seven Solutions

Agile FPGA?

Dr David Snowdon, Director of Engineering, Arista

In the 20 years since the Agile Manifesto was signed, the way we’ve developed and deployed software code has improved by leaps and bounds. Features can get out the door quickly while remaining reliable. But what about FPGA code? Even though FPGA is a popular platform choice in finance, deployment and management of FPGA logic is mired in the past. Can we achieve the rapid iteration, fast delivery times, and increased stability of software, while maintaining the benefits of hardware acceleration? Dave thinks we can. In his view, we should start thinking of FPGAs as just another place to deploy software. Using real-world examples, Dave will discuss how we can adapt the FPGA development process to take advantage of modern software engineering techniques and dev ops processes.

Innovation Roundup

"Enyx Product Update: Continuing innovation & lowering latency"
Laurent de Barry, Co-founder & Managing Director, Trading Solutions, Enyx

"Quantitative Results for Real-Time Predictive Signals"
David Taylor, Chief Technology Officer, Exegy, Inc.

"Transforming the costs and potential for market access"
Alex Stein, Global Head Business Development, Liquid-Markets-Holdings Inc.

Modern realtime data distribution

When middleware for realtime data distribution first appeared, the main mission was to get event-driven data from a small number of servers to a large number of trader desktops or gateways in "real time". All the machines were on site and had one or two processors. All the middleware was proprietary. "Real time" meant sub-second. Since then, a lot has changed. Humans consume data across a wide variety of devices and networks, and algorithms consume far more data than humans. "Real time" now ranges from sub-second to sub-microsecond. And producers or consumers of data may be in a public cloud. Our panel of experts will help us make sense of today's middleware situation and where it's headed. Given the latest advancements, how should a CTO, architect, or app developer think about what middleware is best for a given realtime problem? Where do requirements between use cases overlap, and what does that mean for integration of messaging systems? What role do open source products have to play? How does cloud change the picture, as a deployment context or delivery vehicle? What is the interplay of middleware with innovations in compute, networking, and memory? To kick off, some of the panelists will provide a short presentation:

"Last mile distribution: Real-time market data on Google Cloud"
Salvatore Sferrazza, Solutions Architect, Financial Services, Google

"CLLM 4.0 – Making reliable low latency multicast messaging even more predictable, deterministic and platform-agnostic"
Stefan Ott, Managing Director, CEO, Confinity Solutions GmbH

"Big Memory for Financial Services"
Andrew Degnan, VP Sales, MemVerge

"Elastic MDS: Bringing modern computing paradigms to market data"
Andrew MacGaffey, President, MetaFluent

Innovation Roundup

"Automated Service delivery in a Financial Ecosystem"
Peter Jeanmonod, Business Engineer, Options Information Technology

"AMD EPYC: Enabling Solutions for the Modern Datacenter"
Michael Detwiler, Senior Manager, Server Business Unit, AMD

Markets in the cloud: A game changer for both markets and clouds

Nikolai Larbalestier, SVP, Enterprise Architecture and Performance Engineering, NASDAQ

Last month, the Wall Street Journal cited Brad Peterson, CIO/CTO of Nasdaq, as saying that the tech and exchange company's North American and European markets will be hosted in the public cloud in about five years, and the rest will follow within ten. For those of us who have noticed that none of the major exchanges has put its matching engines in a public cloud so far, this raises an immediate question: Can a public cloud really meet the requirements of one of the world's premier exchanges? Few people care more about the answer to that question than Nikolai Larbalestier, who is responsible for Nasdaq's enterprise architecture, performance engineering, and capacity planning. In this talk, Nikolai will discuss the reasons for an exchange to move to the cloud, the significant technical gaps that today's cloud providers must fill in order to win exchange business, and the reason he thinks they will fill those gaps. He will also provide a vision of how cloud infrastructure will rewrite the ways that markets operate and will change the competitive dynamics of financial markets themselves. This talk promises to be thought-provoking for anyone interested in the future of the capital markets.

CloudEx: Experiments with building an exchange in the cloud

Dr Balaji Prabhakar, VMware Founders Professor of Computer Science, Stanford University & Co-Founder, Tick Tock Networks

A few months ago, a research group at Stanford University completed development of a prototype exchange that runs in a public cloud. Called CloudEx, the exchange acts as a testbed for new technological and algorithmic ideas. The matching engine and gateways are deployed on commodity cloud infrastructure and networks but utilize a high-accuracy “time perimeter” (see the Fall 2019 STAC Summit talks for background). In this talk, Balaji will quickly summarize the CloudEx architecture, then detail the findings on fairness and latency of both order handling and market data distribution in this initial prototype.

A sharper Arrow

Wes McKinney, Director, Ursa Labs

Apache Arrow is a cross-language development platform for in-memory columnar data processing. Among other things, it contains a language-independent columnar data format designed for fast transport and processing on modern hardware. Now in the top 15 most active Apache projects, Arrow has over 500 contributors and over two million weekly downloads. With the Python and R communities using it to facilitate CPU and GPU computations, as well as uptake by major cloud data warehouses, Arrow has a shot (pun intended) at becoming a universal “native” format for data-intensive computing. As a co-creator of Arrow and a key member of its Project Management Committee, Wes will provide a progress update with particular focus on enhancements that can benefit high-throughput data services, data exchange in distributed systems, and C++ analytics on time series, both in-memory and out-of-core. Come to get up to date and ask Wes your questions.

DPDK: A multi-purpose tool for realtime business

Pawel Szopinski, VP, Specialised Infrastructure Engineering, Barclays

The Linux Foundation's Data Plane Development Kit (DPDK) is a set of libraries and network interface drivers to accelerate packet processing workloads running with a wide variety of network cards and CPU architectures. In Pawel's opinion, once DPDK is well understood, it can become indispensable tool. In this talk, he will walk us through the core components and architecture of DPDK, point out some helpful open-source implementations, and discuss potential use cases. Whether you are new to DPDK or are looking to take better advantage of it, Pawel's talk is for you.

Innovation Roundup

"Move Data Faster: Intel Ethernet 800 Series Flexibility & Programmability"
Gary Gumanow, Technical Sales Specialist – NA Channel Ethernet Adapters, Intel

"Need super-fast math? Meet the HPE Apollo 80 with Fujitsu A64FX processor."
Dr. Tom Bradicich, VP and Hewlett Packard Fellow, Hewlett Packard Enterprise

STAC Update for simulation stacks

Michel Debiche, Director of Analytics Research, STAC

Michel will present the latest benchmark results from technology stacks for compute-intensive simulations.

The trader of the future: Smarter, faster, and even more Pythonic

Dr John Ashley, General Manager, Financial Services and Technology, NVIDIA

William Gibson famously said that "the future is already here--it's just not evenly distributed". In John's view, that maxim applies to trading desks as much as any other part of life. Some desks continue doing what has worked for the last decade, hand-crafting algorithms on a relatively narrow slice of market data in, say, Excel or C++. Another group is expanding their data sources and exploiting the benefits of the Python ecosystem, including ML/AI tools. John will argue that the future belongs to the latter. He will explain how Python makes it easy for traders to create smarter strategies, and he'll offer tips on how to get smarter faster. By smarter, he means taking in more contextual data and exploiting the latest ML techniques, which he'll demonstrate by showing how a BERT-class model can turn natural language into insights. By faster, John means both faster to market and faster in the market. To accelerate strategy development and backtesting, he will share some best practices and Python community resources. To get ML models to react more quickly to markets, John will discuss the importance of optimizing inference and how to do it. Come to see this view of the future and ask your questions about it.

New approaches for enabling machine learning innovation

Alexander Tsyplikhin, Senior AI Engineer, Graphcore

AI is an increasingly important competitive differentiator for financial firms. But according to Alex, the traditional constraints of current hardware architectures often force firms to oversimplify models in order to reduce latency and fight noise. These compromises are especially limiting as model sizes continue to grow. In Alex's view, if hardware constraints can be relaxed sufficiently, then AI practitioners will be able to use new techniques in order to develop more advanced models with higher accuracy. In this talk, Alex will detail some of these new approaches to machine intelligence and explain how more complex financial models can enable deeper insights faster.

Benchmarking realtime LSTM inference on time series

Michel Debiche, Director of Analytics Research, STAC

Many of the performance challenges associated with machine learning (ML) are in the training phase: that is, model development. But for certain use cases like time-sensitive trading, the inference phase (the application of a model to new data) also has its challenges. For example, if a model provides realtime input to trading algorithms--or perhaps even does the trading itself--inference latency may be a prime consideration. And if the model is deployed in a resource-constrained environment like a colocation center, throughput per unit of resource is probably also a concern. In this talk, Michel will present proposed STAC benchmarks for realtime inference, using an example with long short-term memory (LSTM) networks. He will argue that benchmarks of inference latency will not only inform sizing decisions for well-established ML techniques but will also be a key factor in deciding whether less proven ML techniques can or cannot be applied effectively to various financial use cases.

How to choose AI platforms

Imagine the scenario: A group within your firm wants to explore the use of a particular machine learning technique for a given problem, with an intent to take anything successful into production. Your job is to supply them with the platform(s) they should use for both training and inference. How should you go about this? The number of AI platforms is growing steadily (CPUs, GPUs, FPGAs, vector engines, specialized processors, bare metal/cloud…). Given that different platforms are probably good for different tasks, what is the right way to frame the problem? Are we getting to a point where every new modeling challenge requires a customized hardware evaluation? What role can industry benchmarks play? Beyond speeds and feeds, what are the key business factors to keep in mind? Do small firms have an advantage or can large firms act nimbly as well? To what extent do cloud offerings make the problem easier or harder? Our panel of experts will discuss.

Must serverless = stateless?

Anurag Khandelwal, Assistant Professor of Computer Science, Yale University

"Serverless" has become a sexy term that vendors use in many different ways. However, the original innovation that inspired the label is functions as a service (FaaS). FaaS breaks from the paradigm of the constantly-running application on a known virtual or physical server. Instead, developers design applications as discrete functions, each of which is implemented in a cloud framework that invokes a function in response to events and only charges for the time that the function executes. All the major cloud providers offer FaaS today, and at previous STAC summits some user firms have said they cut their spend by 90% on major applications by moving to FaaS. In this talk, Anurag will review the kinds of workloads currently suited to FaaS, as well as the limitations that make it unusable for other workloads--in particular, stateful apps. To do this, he will discuss the backend services required to make FaaS work today and argue that new backend services now in development will make FaaS viable for stateful apps and even streaming analytics. If you need to think about your application architecture a couple of years out, this talk should give you plenty to consider.

About STAC Events & Meetings

STAC events bring together CTOs and other industry leaders responsible for solution architecture, infrastructure engineering, application development, machine learning/deep learning engineering, data engineering, and operational intelligence to discuss important technical challenges in financial services.

Get the event app