## SILEXICA

# How to Optimize an OpenCL Kernel Using Silexica's SLX FPGA

Xilinx's Vitis Software Platform

Jordon Inkeles inkeles@silexica.com
October 2020

#### How to Optimize a Kernel in Vitis with SLX FPGA

- Goal: Accelerate a financial algorithm onto an FPGA from C/C++ source
- Design: V-model (Vasicek)
  - Used in the valuation of interest rate derivatives
- Framework: OpenCL (XRT) framework running on a Xilinx Alveo U200 card
- Tools: Silexica's SLX FPGA Tool & Xilinx's Vivado/Vitis Tools
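For orientation, the textbook Vasicek short-rate model is dr = a(b - r)dt + sigma dW, and it admits a closed-form zero-coupon bond price. The sketch below computes that discount factor as a reference point only. It is not the tree-based kernel used in this design, the function name `vasicek_discount` is hypothetical, and the parameter names follow textbook notation, so they may not map one-to-one onto the kernel arguments.

```cpp
#include <cassert>
#include <cmath>

// Textbook Vasicek closed form (reference only, not the tree-based kernel):
//   P(tau) = A(tau) * exp(-B(tau) * r0)
//   B(tau) = (1 - exp(-a * tau)) / a
//   ln A   = (b - sigma^2 / (2 a^2)) * (B - tau) - sigma^2 * B^2 / (4 a)
// r0: current short rate, a: mean-reversion speed, b: long-run mean,
// sigma: volatility, tau: time to maturity in years.
double vasicek_discount(double r0, double a, double b, double sigma, double tau) {
    assert(a > 0.0 && tau >= 0.0);  // the closed form divides by a
    const double B = (1.0 - std::exp(-a * tau)) / a;
    const double lnA = (b - sigma * sigma / (2.0 * a * a)) * (B - tau)
                     - sigma * sigma * B * B / (4.0 * a);
    return std::exp(lnA - B * r0);
}
```

With illustrative inputs such as `vasicek_discount(0.05, 0.1, 0.05, 0.01, 1.0)`, the result is a one-year discount factor slightly above exp(-0.05), and a maturity of zero returns exactly 1.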





#### Silexica SLX FPGA

- SLX FPGA sits on top of the HLS compiler
  - Prepares the C/C++ code for optimum HLS results
  - Takes the guesswork out of using HLS
- Removes the roadblocks in HLS adoption
  - Non-synthesizable C/C++ code
  - Finding parallelism
  - Poor performance and bloated area
- HW engineers: Get the SW guidance they need
- SW engineers: Get parallelism/HW guidance
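To make the "non-synthesizable code" and "finding parallelism" roadblocks concrete, here is a minimal generic sketch. It is not SLX FPGA output. It contrasts a software-style function that uses a dynamically sized `std::vector` (dynamic memory allocation is not supported for synthesis in Vivado/Vitis HLS) with an HLS-friendly rewrite that uses fixed-size arrays, a compile-time loop bound, and a pipeline pragma. The names `dot_product_sw`, `dot_product_hw`, and the size `N` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t N = 64;  // hypothetical fixed problem size

// Software-style version: run-time sized containers backed by heap memory,
// which HLS cannot map to hardware.
int dot_product_sw(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    int acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// HLS-friendly version: fixed-size arrays and a compile-time loop bound.
// The pragma requests a pipelined loop; a regular C++ compiler ignores it.
int dot_product_hw(const int a[N], const int b[N]) {
    int acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        acc += a[i] * b[i];
    }
    return acc;
}
```

Both functions return the same result on a host compiler, which is a convenient way to check a refactoring before handing the second version to HLS.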



#### Import Vivado Project into SLX FPGA





**SLX FPGA Vivado Project Importer** 

**SLX FPGA Configuration Editor** 

### **Analyze the Design with SLX FPGA**






```
// Trailing percentages are the per-line figures from the SLX FPGA analysis view.
void dut(int endCnt, DT time[LEN], DT dtime[LEN], DT flatRate, DT spread, DT a, DT sigma, DT x0, DT b, DT* discount) {  // 0.04%
#pragma HLS INTERFACE s_axilite port=discount
#pragma HLS INTERFACE m_axi depth=16 port=dtime
#pragma HLS INTERFACE m_axi depth=16 port=time
#pragma HLS INTERFACE s_axilite port=return
#pragma HLS INTERFACE s_axilite port=b
#pragma HLS INTERFACE s_axilite port=x0
#pragma HLS INTERFACE s_axilite port=sigma
#pragma HLS INTERFACE s_axilite port=a
#pragma HLS INTERFACE s_axilite port=spread
#pragma HLS INTERFACE s_axilite port=flatRate
#pragma HLS INTERFACE s_axilite port=endCnt
    DT tmp_values1[4][LEN2];
    DT tmp_values2[4][LEN2];
    DT rates[LEN];
    Model model;                                          // 0.12%
    Tree tree;                                            // 0.27%
    DT process[4] = {a, sigma, 0.0, 0.0};                 // 0.08%
    tree.initialization(process, endCnt, x0);             // 5.57%
    model.initialization(flatRate, spread, a, sigma, b);  // 0.39%
    model.treeShortRate(tree, endCnt, time, dtime, tmp_values1, tmp_values2, tmp_values1[3], rates);  // 56.06%
    DT x = tree.underlying(0);                            // 0.59%
    *discount = model.discount(time[endCnt - 2], dtime[endCnt - 2], &x, rates[endCnt - 2]);  // 0.68%
#ifndef __SYNTHESIS__
    // Debug print, compiled out during synthesis.
    std::cout << "i=" << endCnt - 2 << ",x=" << x << ",rates[i]=" << rates[endCnt - 2] << ",disc=" << *discount  // 21.73%
              << std::endl;
#endif
}
```
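The interface pattern in the listing above (array arguments on `m_axi`, scalars and the return on `s_axilite`, debug output guarded by `__SYNTHESIS__`) can be tried on a host compiler with a toy kernel. This is a minimal sketch under stated assumptions: `toy_kernel`, its arguments, and `LEN = 16` are hypothetical and are not part of the V-model design. A regular C++ compiler ignores the HLS pragmas, and `__SYNTHESIS__` is the macro the HLS tool defines during synthesis.

```cpp
#include <cassert>
#include <iostream>

constexpr int LEN = 16;  // hypothetical array length for this toy example

// Toy kernel (hypothetical): sums the first endCnt entries of `data`
// and writes the total through `result`.
void toy_kernel(int endCnt, const float data[LEN], float* result) {
#pragma HLS INTERFACE m_axi depth=16 port=data
#pragma HLS INTERFACE s_axilite port=endCnt
#pragma HLS INTERFACE s_axilite port=result
#pragma HLS INTERFACE s_axilite port=return
#ifndef __SYNTHESIS__
    assert(endCnt >= 0 && endCnt <= LEN);  // host-side sanity check only
#endif
    float sum = 0.0f;
    for (int i = 0; i < endCnt; ++i) {
        sum += data[i];
    }
    *result = sum;
#ifndef __SYNTHESIS__
    // Host-only debug print, excluded from the synthesized hardware.
    std::cout << "endCnt=" << endCnt << ",sum=" << sum << std::endl;
#endif
}
```

Calling `toy_kernel(4, data, &r)` with `data` holding 1, 2, 3, ... leaves `r` equal to 10 and prints one debug line on the host. In a synthesis run the print and the assertion are compiled out.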



#### **Automated Flow & Optimization Results**






| Version                                  | Latency | LUT    | FF     | DSP | BRAM |
|------------------------------------------|---------|--------|--------|-----|------|
| Hand-Optimized (from the Vitis Library)  | 1823    | 21,939 | 19,891 | 114 | 12   |
| SLX FPGA Optimized                       | 1168    | 13,789 | 11,671 | 37  | 9    |
| SLX Improvement                          | 36%     | 38%    | 42%    | 68% | 25%  |
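The improvement row can be reproduced from the two rows above it as a relative reduction, (baseline - optimized) / baseline. A minimal sketch, assuming that definition (the slides do not state the formula); the helper name `reduction_percent` is hypothetical:

```cpp
#include <cassert>
#include <cmath>

// Percentage reduction of `optimized` relative to `baseline`.
// Assumes the "SLX Improvement" row is defined this way.
double reduction_percent(double baseline, double optimized) {
    assert(baseline > 0.0);
    return 100.0 * (baseline - optimized) / baseline;
}
```

Applied to the table values, this gives roughly 35.9% for latency, 37.1% for LUT, 41.3% for FF, 67.5% for DSP, and 25.0% for BRAM. The LUT and FF results land about one point below the 38% and 42% shown in the table, which may reflect rounding in the source figures.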



#### Finish with Vitis Alveo project





#### www.silexica.com/documentation

