

## **An FPGA Primer**

for Software Engineers

Dr Matthew Grosvenor, Principal Engineer, Cisco



- High Frequency Trading
- Network Offload / Acceleration
- Computational Storage
- High Performance Compute / AI / ML



`int a = 1; int b = 0; int out_1; int out_2; out_1 = ~(a & b); out_2 = out_1 & b;`

The compiler turns this C source into the following assembly:

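```
a:      .long 1                           # int a = 1
b:      .zero 4                           # int b = 0
out_1:  .zero 4
out_2:  .zero 4

        mov eax, DWORD PTR a[rip]         # eax = a
        mov edx, eax                      # edx = a
        and edx, DWORD PTR b[rip]         # edx = a & b
        not edx                           # edx = ~(a & b)
        mov DWORD PTR out_1[rip], edx     # out_1 = ~(a & b)
        not eax                           # eax = ~a
        and eax, DWORD PTR b[rip]         # eax = ~a & b, which equals out_1 & b
        mov DWORD PTR out_2[rip], eax     # out_2 = out_1 & b
```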




The assembled listing is packaged as an ELF file and loaded into RAM, which then holds the instructions (`mov eax, ..`, `mov edx`, ...) and the data words `0x000001`, `0x000000`, `0x000000`, `0x000000`.

**RAM**

| Section      | Contents                                                               |
|--------------|------------------------------------------------------------------------|
| Code section | `mov eax, ..`, `mov edx`, ...                                          |
| Data section | `a` = 0x000001, `b` = 0x000000, `out_1` = 0x000000, `out_2` = 0x000000 |

**Registers:** A, D, PC

| Frame                             | A        | D        | PC   |
|-----------------------------------|----------|----------|------|
| Before execution                  |          |          | 0x00 |
| After `mov eax, DWORD PTR a[rip]` | 0x000001 |          | 0x01 |
| After `mov edx, eax`              | 0x000001 | 0x000001 | 0x01 |
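The fetch-and-execute walk through RAM can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how an x86 CPU is built: the tuple encoding of instructions, the `run` function, and the 32-bit `MASK` are all invented here for illustration, and only the instruction sequence and variable names come from the listing.

```python
MASK = 0xFFFFFFFF  # keep every value within 32 bits, like a 32-bit register

# Data section: the four variables from the listing, held in "RAM".
ram = {"a": 1, "b": 0, "out_1": 0, "out_2": 0}

# Code section: (operation, destination, source) tuples mirroring the listing.
# This encoding is hypothetical; real machine code is a stream of bytes.
program = [
    ("load",  "eax",   "a"),    # mov eax, DWORD PTR a[rip]
    ("mov",   "edx",   "eax"),  # mov edx, eax
    ("and",   "edx",   "b"),    # and edx, DWORD PTR b[rip]
    ("not",   "edx",   None),   # not edx
    ("store", "out_1", "edx"),  # mov DWORD PTR out_1[rip], edx
    ("not",   "eax",   None),   # not eax
    ("and",   "eax",   "b"),    # and eax, DWORD PTR b[rip]
    ("store", "out_2", "eax"),  # mov DWORD PTR out_2[rip], eax
]


def run(program, ram):
    """Execute the program one instruction at a time, strictly in order."""
    regs = {"eax": 0, "edx": 0}
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, dst, src = program[pc]
        if op == "load":
            regs[dst] = ram[src]
        elif op == "mov":
            regs[dst] = regs[src]
        elif op == "and":
            regs[dst] &= ram[src]
        elif op == "not":
            regs[dst] = ~regs[dst] & MASK
        elif op == "store":
            ram[dst] = regs[src]
        pc += 1
    return regs, pc


regs, pc = run(program, ram)
print(f"out_1 = {ram['out_1']:#010x}, out_2 = {ram['out_2']:#010x}, pc = {pc}")
```

With `a = 1` and `b = 0`, the run leaves `out_1` as `0xFFFFFFFF` (every bit of `~(a & b)` set) and `out_2` as `0`. The point of the model is that only one instruction is active at any moment.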































A Program: "A sequence of instructions given to a CPU to get a result"

`int a = 1; int b = 0; int out_1; int out_2; out_1 = ~(a & b); out_2 = out_1 & b;`

The equivalent logic in Verilog:

```
// The module wrapper and its name are assumed; the slide shows only the body.
module primer_example(a, b, clk, r_out1, r_out2);
    input a, b, clk;
    output r_out1, r_out2;
    reg r_out1, r_out2;

    wire out1;
    wire out2;

    // Combinational logic: continuously follows the inputs.
    assign out1 = ~(a & b);
    assign out2 = out1 & b;

    // Registers: capture the wires on every rising clock edge.
    always @(posedge clk)
    begin
        r_out1 <= out1;
        r_out2 <= out2;
    end
endmodule
```






## Lifting the lid on FPGAs



## **Fundamentals of digital logic**



**Bitwise NOT (~)**

| A | Out |
|---|-----|
| 0 | 1   |
| 1 | 0   |

**Bitwise AND (&)**

| A | B | Out |
|---|---|-----|
| 0 | 0 | 0   |
| 1 | 0 | 0   |
| 0 | 1 | 0   |
| 1 | 1 | 1   |

**Bitwise OR (|)**

| A | B | Out |
|---|---|-----|
| 0 | 0 | 0   |
| 1 | 0 | 1   |
| 0 | 1 | 1   |
| 1 | 1 | 1   |
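The three truth tables can be reproduced by enumerating every input combination. A minimal sketch in Python: the function names `NOT`, `AND`, `OR`, and `truth_table` are chosen here for illustration, and rows are emitted with `A` toggling fastest to match the row order on the slides.

```python
from itertools import product


def NOT(a):
    return a ^ 1  # flip a single bit


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def truth_table(gate, n_inputs):
    """Return rows of (inputs..., output) for every input combination."""
    rows = []
    for bits in product((0, 1), repeat=n_inputs):
        inputs = bits[::-1]  # reverse so the first input toggles fastest
        rows.append((*inputs, gate(*inputs)))
    return rows


for name, gate, n in (("NOT", NOT, 1), ("AND", AND, 2), ("OR", OR, 2)):
    print(name)
    for row in truth_table(gate, n):
        print("  ", *row)
```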







| A | B | Out |
|---|---|-----|
| 0 | 0 | 0   |
| 1 | 0 | 0   |
| 0 | 1 | 0   |
| 1 | 1 | 1   |









**Truth Table**

"x" could be 0 or 1.



#### © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Confidential



| A | B | Out |
|---|---|-----|
| x | 0 | 0   |
| x | 1 | x   |

| A | S | Out |
|---|---|-----|
| x | 0 | 0   |
| x | 1 | x   |





| A | S | Out1 | B | S | Out2 |
|---|---|------|---|---|------|
| x | 0 | 0    | y | 0 | 0    |
| x | 1 | x    | y | 1 | y    |









| A | S | B | Out1 | Out2 |
|---|---|---|------|------|
| x | 0 | y | 0    | 0    |
| x | 1 | y | x    | y    |







































**Truth Table**

| S | A | B | Out |
|---|---|---|-----|
| 0 | x | y | x   |
| 1 | x | y | y   |

Multiplexer (MUX): "Different inputs are selected (or routed) to a common output based on a selector"
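The multiplexer truth table follows directly from two AND gates, one NOT gate, and one OR gate. A minimal sketch in Python, assuming single-bit values of 0 or 1; the function names are chosen here for illustration.

```python
def NOT(a):
    return a ^ 1


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def mux2(s, a, b):
    """2:1 multiplexer: returns a when s == 0 and b when s == 1."""
    out1 = AND(a, NOT(s))  # passes a only while the selector is 0
    out2 = AND(b, s)       # passes b only while the selector is 1
    return OR(out1, out2)  # at most one of the two paths is active


for s in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            print(f"S={s} A={a} B={b} -> Out={mux2(s, a, b)}")
```

Each AND gate acts as a switch: its output is forced to 0 when its control input is 0, so the OR gate only ever sees the selected input.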











































## **AND Gate**









**OR Gate** 









**Truth Table**

| $S_1$ | $S_2$ | Out |
|-------|-------|-----|
| 0     | 0     | 1   |
| 1     | 0     | 1   |
| 0     | 1     | 1   |
| 1     | 1     | 0   |

NOT AND (NAND)

## **Programmable Gate:** "different constants can implement any logical function with 2 inputs"
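One way to picture a programmable gate is as a multiplexer whose data inputs are four stored constants and whose selectors are the two gate inputs. The sketch below assumes that view; `programmable_gate` and the constant tuples are names invented here, and the constants are listed in the row order of the truth tables above.

```python
from itertools import product


def programmable_gate(constants, s1, s2):
    """Use the two inputs as an index into four stored constants."""
    return constants[s1 + 2 * s2]


# Constants in row order: (s1, s2) = (0, 0), (1, 0), (0, 1), (1, 1).
AND_CONSTANTS = (0, 0, 0, 1)
OR_CONSTANTS = (0, 1, 1, 1)
NAND_CONSTANTS = (1, 1, 1, 0)

for s2 in (0, 1):
    for s1 in (0, 1):
        print(s1, s2, programmable_gate(NAND_CONSTANTS, s1, s2))

# Four constant bits give 2**4 = 16 distinct two-input functions.
all_functions = list(product((0, 1), repeat=4))
print(len(all_functions))
```

Changing the gate's behaviour means changing four bits of stored data; the circuit itself stays the same.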
















## Field Programmable Gate Array: "An array of programmable gates that can be updated in the field"
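As a rough illustration of "an array of programmable gates", the sketch below treats a configuration as plain data: a list of gates, each with four constants and the names of the signals it reads and drives. Everything here (`evaluate`, the tuple layout, `config_v1` and `config_v2`) is hypothetical and far simpler than a real device, which also has programmable routing, registers, and larger lookup tables.

```python
NAND = (1, 1, 1, 0)
AND = (0, 0, 0, 1)
OR = (0, 1, 1, 1)


def evaluate(config, inputs):
    """Evaluate each configured gate in order and return all signal values."""
    signals = dict(inputs)
    for constants, in1, in2, out in config:
        # Same lookup as a single programmable gate: inputs index the constants.
        signals[out] = constants[signals[in1] + 2 * signals[in2]]
    return signals


# First configuration: out1 = ~(a & b), out2 = out1 & b (the running example).
config_v1 = [(NAND, "a", "b", "out1"), (AND, "out1", "b", "out2")]

# "Field update": the same two gates, reloaded with different constants.
config_v2 = [(OR, "a", "b", "out1"), (AND, "out1", "b", "out2")]

print(evaluate(config_v1, {"a": 1, "b": 1}))
print(evaluate(config_v2, {"a": 1, "b": 1}))
```

The second configuration reuses the same gates and wiring but computes a different function, which is the sense in which the array is updated rather than rebuilt.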









































## Combinational Circuit: "Any circuit composed of only logical gates."















































## Glitching: "Intermediate outputs while the circuit stabilizes to a final output over time"

| $T_5$ | $T_4$ | $T_3$ | $T_2$ | $T_1$ | $T_0$ |
|-------|-------|-------|-------|-------|-------|
| 0     | 0     | 1     | 1     | 1     | 1     |

**Intermediate outputs**

Sampled output 1: 1

Sampled output 2: 0
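A glitch can be reproduced in software by giving every gate one time step of delay. The circuit below is an assumption picked for illustration, not the one drawn on the slide: `out = a AND (NOT a)`, which is logically always 0. Because the inverter lags its input by one step, a rising `a` briefly produces a 1.

```python
def simulate(a_values):
    """Unit-delay simulation of out = a AND (NOT a).

    Each gate output at step t is computed from its inputs at step t - 1.
    """
    a_prev = a_values[0]
    n = a_prev ^ 1  # inverter output, starting from a settled state
    trace = []
    for a in a_values:
        # Both gates update together, using only the previous step's values.
        n, out = a_prev ^ 1, a_prev & n
        trace.append(out)
        a_prev = a
    return trace


# Input a rises from 0 to 1 at step 1 and then stays high.
print(simulate([0, 1, 1, 1, 1]))
```

Sampling the output during the transient gives 1, while sampling after the circuit has settled gives the correct 0, which is why the moment of sampling matters.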









## Clock: "A source of pulses at regular intervals. The clock speed is measured in Hz"





























## Synchronous Circuit: "Outputs are sampled or 'registered' at intervals determined by a clock signal"

Sampled output 1

Sampled output 2
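Registering can be modelled as copying the combinational outputs into stored values only when the clock goes from 0 to 1. The sketch below mirrors the `always @(posedge clk)` block shown earlier; the class name, the `tick` method, and the initial register value of 0 are assumptions made for the example (in real hardware the initial value depends on the device and reset logic).

```python
class RegisteredLogic:
    """Software model of two registers fed by out1 = ~(a & b), out2 = out1 & b."""

    def __init__(self):
        self.r_out1 = 0  # assumed initial register contents
        self.r_out2 = 0
        self._prev_clk = 0

    def tick(self, clk, a, b):
        # Combinational part: always follows the current inputs.
        out1 = (a & b) ^ 1  # single-bit ~(a & b)
        out2 = out1 & b
        # Sequential part: registers update only on a rising clock edge.
        if clk == 1 and self._prev_clk == 0:
            self.r_out1, self.r_out2 = out1, out2
        self._prev_clk = clk
        return self.r_out1, self.r_out2


circuit = RegisteredLogic()
for clk, a, b in [(0, 1, 0), (1, 1, 0), (1, 0, 1), (0, 0, 1), (1, 0, 1)]:
    print(clk, a, b, circuit.tick(clk, a, b))
```

Input changes between rising edges, including any glitches, never reach the registered outputs; only the values present at the edge are kept.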
## Putting all the pieces together













**Xilinx Virtex UltraScale+**

| Device Name                                 | VU3P         | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P |
|---------------------------------------------|--------------|--------------|--------------|--------------|--------------|--------------|-------|
| System Logic Cells (K)                      | 862          | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938 |
| CLB Flip-Flops (K)                          | 788          | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172 |
| CLB LUTs (K)                                | 394          | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0         | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4  |
| Total Block RAM (Mb)                        | 25.3         | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9  |
| UltraRAM (Mb)                               | 90.0         | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0  |
| DSP Slices                                  | 2,280        | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1          | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4  |
| PCIe<sup>®</sup> Gen3 x16                   | 2            | 4            | 4            | 6            | 3            | 4            | 0     |
| PCIe Gen3 x16 / Gen4 x8 / CCIX<sup>(1)</sup> | -            | -            | -            | -            | -            | -            | 8     |
| 150G Interlaken                             | 3            | 4            | 6            | 9            | 6            | 8            | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3            | 4            | 6            | 9            | 9            | 12           | 0     |
| Max. Single-Ended HP I/Os                   | 520          | 832          | 832          | 832          | 624          | 832          | 1,976 |
| Max. Single-Ended HD I/Os                   | 0            | 0            | 0            | 0            | 0            | 0            | 96    |
| GTY 32.75Gb/s Transceivers                  | 40           | 80           | 80           | 120          | 96           | 128          | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -            | -            | -            | -            | -            | -            | -     |
| 100G / 50G KP4 FEC                          | -            | -            | -            | -            | -            | -            | -     |
| Extended<sup>(2)</sup>                      | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 |
| Industrial                                  | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -     |

(K) denotes thousands: the 394 in the CLB LUTs row for the VU3P means 394,000 LUTs, and the 4,086 for the VU19P means 4,086,000.

|                                             | INX.<br>IRTEX.<br>UlitaSCALET |              |              |              |              |              |       |
|---------------------------------------------|-------------------------------|--------------|--------------|--------------|--------------|--------------|-------|
| Device Name                                 | VU3P                          | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P |
| System Logic Cells (K)                      | 862                           | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938 |
| CLB Flip-Flops (K)                          | 788                           | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172 |
| CLB LUTs (K)                                | 394                           | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0                          | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4  |
| Total Block RAM (Mb)                        | 25.3                          | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9  |
| UltraRAM (Mb)                               | 90.0                          | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0  |
| DSP Slices                                  | 2,280                         | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1                           | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4  |
| PCle <sup>®</sup> Gen3 x16                  | 2                             | 4            | 4            | 6            | 3            | 4            | 0     |
| PCle Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | _                             | -            | _            | -            | -            | -            | 8     |
| 150G Interlaken                             | 3                             | 4            | 6            | 9            | 6            | 8            | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                             | 4            | 6            | 9            | 9            | 12           | 0     |
| Max. Single-Ended HP I/Os                   | 520                           | 832          | 832          | 832          | 624          | 832          | 1,976 |
| Max. Single-Ended HD I/Os                   | 0                             | 0            | 0            | 0            | 0            | 0            | 96    |
| GTY 32.75Gb/s Transceivers                  | 40                            | 80           | 80           | 120          | 96           | 128          | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -                             | -            | -            | -            | -            | -            | -     |
| 100G / 50G KP4 FEC                          | _                             | -            | _            | -            | _            | -            | -     |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                  | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 |
| Industrial                                  | -1 -2                         | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -     |



|                                             | NX.<br>RTEX.<br>UltraSCALET |              |              |              |              |              |       |
|---------------------------------------------|-----------------------------|--------------|--------------|--------------|--------------|--------------|-------|
| Device Name                                 | VU3P                        | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P |
| System Logic Cells (K)                      | 862                         | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938 |
| CLB Flip-Flop <u>s (K)</u>                  | 788                         | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172 |
| CLB LUT <mark>s (K)</mark>                  | 394                         | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0                        | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4  |
| Total Block RAM (Mb)                        | 25.3                        | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9  |
| UltraRAM (Mb)                               | 90.0                        | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0  |
| DSP Slices                                  | 2,280                       | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1                         | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4  |
| PCle <sup>®</sup> Gen3 x16                  | 2                           | 4            | 4            | 6            | 3            | 4            | 0     |
| PCle Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                           | -            | -            | -            | -            | -            | 8     |
| 150G Interlaken                             | 3                           | 4            | 6            | 9            | 6            | 8            | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                           | 4            | 6            | 9            | 9            | 12           | 0     |
| Max. Single-Ended HP I/Os                   | 520                         | 832          | 832          | 832          | 624          | 832          | 1,976 |
| Max. Single-Ended HD I/Os                   | 0                           | 0            | 0            | 0            | 0            | 0            | 96    |
| GTY 32.75Gb/s Transceivers                  | 40                          | 80           | 80           | 120          | 96           | 128          | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -                           | -            | -            | -            | -            | -            | -     |
| 100G / 50G KP4 FEC                          | -                           | -            | -            | -            | -            | -            | -     |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 |
| Industrial                                  | -1 -2                       | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -     |

|                                             | NX.<br>RTEX.<br>UIItraSCALET |              |                |                           |                   |              |                   |
|---------------------------------------------|------------------------------|--------------|----------------|---------------------------|-------------------|--------------|-------------------|
| Device Name                                 | VU3P                         | VU5P         | VU7P           | VU9P                      | VU11P             | VU13P        | VU19P             |
| System Logic Cells (K)                      | 862                          | 1,314        | 1,724          | 2,586                     | 2,835             | 3,780        | 8,938             |
| CLB Flip <mark>-Flops (K)</mark>            | 788                          | 1.201        | 1.576          | 2.364                     | 2.592             | 3.456        | 8.172             |
| CLE LUTs (K)                                | 394,000                      | 601,000      | <b>788,000</b> | <b>1,182</b> , <b>000</b> | 1,296, <b>000</b> | 1,728,000    | 4,086, <b>000</b> |
| Max. Dist. RAM (Mb)                         | 12.0                         | 18.3         | 24.1           | 36.1                      | 36.2              | 48.3         | 58.4              |
| Total Block RAM (Mb)                        | 25.3                         | 36.0         | 50.6           | 75.9                      | 70.9              | 94.5         | 75.9              |
| UltraRAM (Mb)                               | 90.0                         | 132.2        | 180.0          | 270.0                     | 270.0             | 360.0        | 90.0              |
| DSP Slices                                  | 2,280                        | 3,474        | 4,560          | 6,840                     | 9,216             | 12,288       | 3,840             |
| Peak INT8 DSP (TOP/s)                       | 7.1                          | 10.8         | 14.2           | 21.3                      | 28.7              | 38.3         | 10.4              |
| PCle <sup>®</sup> Gen3 x16                  | 2                            | 4            | 4              | 6                         | 3                 | 4            | 0                 |
| PCIe Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                            | -            | _              | -                         | -                 | -            | 8                 |
| 150G Interlaken                             | 3                            | 4            | 6              | 9                         | 6                 | 8            | 0                 |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                            | 4            | 6              | 9                         | 9                 | 12           | 0                 |
| Max. Single-Ended HP I/Os                   | 520                          | 832          | 832            | 832                       | 624               | 832          | 1,976             |
| Max. Single-Ended HD I/Os                   | 0                            | 0            | 0              | 0                         | 0                 | 0            | 96                |
| GTY 32.75Gb/s Transceivers                  | 40                           | 80           | 80             | 120                       | 96                | 128          | 80                |
| GTM 58Gb/s PAM4 Transceivers                | -                            | -            | _              | -                         | -                 | -            | -                 |
| 100G / 50G KP4 FEC                          | _                            | _            | _              | -                         | -                 | -            | _                 |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                 | -1 -2 -2L -3 | -1 -2 -2L -3   | -1 -2 -2L -3              | -1 -2 -2L -3      | -1 -2 -2L -3 | -1 -2             |
| Industrial                                  | -1 -2                        | -1 -2        | -1 -2          | -1 -2                     | -1 -2             | -1 -2        | -                 |

|                                             | NX.<br>RTEX.<br>UltraSCALET |              |              |              |              |                   |       |
|---------------------------------------------|-----------------------------|--------------|--------------|--------------|--------------|-------------------|-------|
| Device Name                                 | VU3P                        | VU5P         | VU7P         | VU9P         | VU11P        | VU13P             | VU19P |
| System Logic Cells (K)                      | 862                         | 1,314        | 1,724        | 2,586        | 2,835        | 3,780             | 8,938 |
| CLB Flip-Flops (K)                          | 788                         | 1,201        | 1,576        | 2,304        | 2,592        | 3,45 <del>6</del> | 8,172 |
| CLB LUTs (K)                                | 394                         | 601          | 788          | 1,182        | 1,196        | 1, 28 🧲           | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0                        | 18.3         | 24.1         | 36.1         | 30.2         |                   | 58.4  |
| Total Block RAM (Mb)                        | 25.3                        | 36.0         | 50.6         | 75.9         | 70.9         | 94.5              | 75.9  |
| UltraRAM (Mb)                               | 90.0                        | 132.2        | 180.0        | 270.0        | 270.0        | 360.0             | 90.0  |
| DSP Slices                                  | 2,280                       | 3,474        | 4,560        | 6,840        | 9,216        | 12,288            | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1                         | 10.8         | 14.2         | 21.3         | 28.7         | 38.3              | 10.4  |
| PCle <sup>®</sup> Gen3 x16                  | 2                           | 4            | 4            | 6            | 3            | 4                 | 0     |
| PCle Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                           | -            | -            | -            | -            | -                 | 8     |
| 150G Interlaken                             | 3                           | 4            | 6            | 9            | 6            | 8                 | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                           | 4            | 6            | 9            | 9            | 12                | 0     |
| Max. Single-Ended HP I/Os                   | 520                         | 832          | 832          | 832          | 624          | 832               | 1,976 |
| Max. Single-Ended HD I/Os                   | 0                           | 0            | 0            | 0            | 0            | 0                 | 96    |
| GTY 32.75Gb/s Transceivers                  | 40                          | 80           | 80           | 120          | 96           | 128               | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -                           | -            | -            | -            | -            | -                 | -     |
| 100G / 50G KP4 FEC                          | -                           | -            | -            | -            | -            | -                 | -     |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3      | -1 -2 |
| Industrial                                  | -1 -2                       | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2             | _     |



| WAR AND NO                                  | NX.<br>RTEX.<br>UIItraSCALET |              |              |              |              |              |       |
|---------------------------------------------|------------------------------|--------------|--------------|--------------|--------------|--------------|-------|
| Device Name                                 | VU3P                         | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P |
| System Logic Cells (K)                      | 862                          | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938 |
| CLB Flip-Flops (K)                          | 788                          | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172 |
| CLB LUTs (K)                                | 394                          | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0                         | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4  |
| Total Block RAM (Mb)                        | 25.3                         | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9  |
| UltraRAM (Mb)                               | 90.0                         | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0  |
| DSP Slices                                  | 2,280                        | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1                          | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4  |
| PCIe <sup>®</sup> Gen3 x16                  | 2                            | 4            | 4            | 6            | 3            | 4            | 0     |
| PCIe Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                            | -            | -            | -            | -            | -            | 8     |
| 150G Interlaken                             | 3                            | 4            | 6            | 9            | 6            | 8            | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                            | 4            | 6            | 9            | 9            | 12           | 0     |
| Max. Single-Ended HP I/Os                   | 520                          | 832          | 832          | 832          | 624          | 832          | 1,976 |
| Max. Single-Ended HD I/Os                   | 0                            | 0            | 0            | 0            | 0            | 0            | 96    |
| GTY 32.75Gb/s Transceivers                  | 40                           | 80           | 80           | 120          | 96           | 128          | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -                            | -            | -            | -            | -            | -            | -     |
| 100G / 50G KP4 FEC                          | _                            | -            | -            | -            | -            | -            | _     |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 |
| Industrial                                  | -1 -2                        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -     |


# FPGA's naughty little secret







#### 2 input LUT:

- 1x 4-way mux + 4x 1-bit memory
- 3x 2-way mux + 4x 1-bit memory
- 3x 4 gates + 4x 1-bit memory
- 3x(7) + 4x(4) transistors
- 37 transistors!
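
The build-up above can be sketched in software. This is a minimal Python model, assuming a 2-input LUT is four 1-bit memory cells read out through a 4-way mux made of three 2-way muxes. The names `mux2`, `lut2` and `NAND_INIT` are illustrative only, and the per-component transistor figures are simply the slide's 3x(7) + 4x(4) breakdown, not vendor data.

```python
def mux2(sel, d0, d1):
    """2-way mux: pass d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def lut2(init, a, b):
    """2-input LUT: 4x 1-bit memory (init) read through 3x 2-way muxes.

    init[i] holds the output for the inputs where i = (b << 1) | a.
    """
    low = mux2(a, init[0], init[1])    # b == 0 half of the table
    high = mux2(a, init[2], init[3])   # b == 1 half of the table
    return mux2(b, low, high)


# Memory contents that make the LUT behave as NAND, i.e. ~(a & b) on single bits.
NAND_INIT = [1, 1, 1, 0]

# Transistor tally using the slide's figures (assumed: 7 per 2-way mux, 4 per memory bit).
MUX2_TRANSISTORS = 7
BIT_TRANSISTORS = 4
lut2_transistors = 3 * MUX2_TRANSISTORS + 4 * BIT_TRANSISTORS

for b in (0, 1):
    for a in (0, 1):
        print(a, b, lut2(NAND_INIT, a, b))
print("transistors per 2-input LUT:", lut2_transistors)
```

Changing only the four values in `NAND_INIT` turns the same structure into a different gate, which is the point of a LUT: the logic function lives in memory, not in wiring.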



#### 2 logic gate circuit

- 2x 2 input LUTs
- 11 programmable connections
- 2x 37 + 11x4 transistors!!!
- 118 transistors
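
As a rough illustration, the two-gate circuit can be modelled as two chained LUTs. The sketch below assumes the circuit is the talk's earlier example (`out_1 = ~(a & b)`, `out_2 = out_1 & b`) reduced to single bits, and it treats each LUT as a plain table lookup. The names and the constants (37 transistors per LUT, 11 connections, 4 transistors per connection) come from the slide's arithmetic and are not measured values.

```python
def lut2(init, a, b):
    """2-input LUT as a table lookup: init[(b << 1) | a]."""
    return init[(b << 1) | a]


NAND_INIT = [1, 1, 1, 0]  # out_1 = ~(a & b), on single bits
AND_INIT = [0, 0, 0, 1]   # out_2 = out_1 & b


def two_gate_circuit(a, b):
    """Chain two LUTs: the first feeds the second, as in the two-gate example."""
    out_1 = lut2(NAND_INIT, a, b)
    out_2 = lut2(AND_INIT, out_1, b)
    return out_1, out_2


# Transistor budget from the slide: 2 LUTs plus 11 programmable connections.
LUT_TRANSISTORS = 37
CONNECTIONS = 11
TRANSISTORS_PER_CONNECTION = 4
fpga_transistors = 2 * LUT_TRANSISTORS + CONNECTIONS * TRANSISTORS_PER_CONNECTION

print(two_gate_circuit(1, 0))
print("FPGA transistors:", fpga_transistors)
```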



#### FPGA

- ~118 transistors
- Fully flexible!

#### Direct "hard" (ASIC)

- ~4 transistors!!!
- fixed config

# **FPGAs:** "Offer high flexibility in exchange for low efficiency"
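
To make the trade-off concrete, here is a small sketch under the same assumptions as the slides. It enumerates every possible memory content of a 2-input LUT to show how many distinct functions one block of hardware can take on, then compares the slide's ~118 and ~4 transistor figures. All names are hypothetical and the numbers are the talk's estimates.

```python
from itertools import product


def lut2(init, a, b):
    """2-input LUT as a table lookup: init[(b << 1) | a]."""
    return init[(b << 1) | a]


# Each of the 2**4 possible memory contents configures a different function.
configs = list(product((0, 1), repeat=4))
truth_tables = {
    tuple(lut2(cfg, a, b) for b in (0, 1) for a in (0, 1)) for cfg in configs
}

# Flexibility versus efficiency, using the slide's estimates.
FPGA_TRANSISTORS = 118
ASIC_TRANSISTORS = 4
overhead = FPGA_TRANSISTORS / ASIC_TRANSISTORS

print("distinct functions from one LUT:", len(truth_tables))
print("transistor overhead factor:", overhead)
```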

|                                             | Xilinx Virtex UltraScale+     |              |              |              |              |              |       |
|---------------------------------------------|-------------------------------|--------------|--------------|--------------|--------------|--------------|-------|
| Device Name                                 | VU3P                          | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P |
| System Logic Cells (K)                      | 862                           | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938 |
| CLB Flip-Flops (K)                          | 788                           | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172 |
| CLB LUTs (K)                                | 394                           | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086 |
| Max. Dist. RAM (Mb)                         | 12.0                          | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4  |
| Total Block RAM (Mb)                        | 25.3                          | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9  |
| UltraRAM (Mb)                               | 90.0                          | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0  |
| DSP Slices                                  | 2,280                         | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840 |
| Peak INT8 DSP (TOP/s)                       | 7.1                           | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4  |
| PCIe <sup>®</sup> Gen3 x16                  | 2                             | 4            | 4            | 6            | 3            | 4            | 0     |
| PCIe Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                             | -            | -            | -            | -            | -            | 8     |
| 150G Interlaken                             | 3                             | 4            | 6            | 9            | 6            | 8            | 0     |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                             | 4            | 6            | 9            | 9            | 12           | 0     |
| Max. Single-Ended HP I/Os                   | 520                           | 832          | 832          | 832          | 624          | 832          | 1,976 |
| Max. Single-Ended HD I/Os                   | 0                             | 0            | 0            | 0            | 0            | 0            | 96    |
| GTY 32.75Gb/s Transceivers                  | 40                            | 80           | 80           | 120          | 96           | 128          | 80    |
| GTM 58Gb/s PAM4 Transceivers                | -                             | -            | -            | -            | -            | -            | -     |
| 100G / 50G KP4 FEC                          | -                             | -            | -            | -            | -            | -            | -     |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                  | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 |
| Industrial                                  | -1 -2                         | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -     |

Efficient "hard" function implementations

Max. Dist. RAM, Total Block RAM, UltraRAM: different types/density of memory (RAM)

DSP Slices, Peak INT8 DSP: dedicated math functions

|                                             | Xilinx Virtex UltraScale+     |              |              |              |              |              |            |
|---------------------------------------------|-------------------------------|--------------|--------------|--------------|--------------|--------------|------------|
| Device Name                                 | VU3P                          | VU5P         | VU7P         | VU9P         | VU11P        | VU13P        | VU19P      |
| System Logic Cells (K)                      | 862                           | 1,314        | 1,724        | 2,586        | 2,835        | 3,780        | 8,938      |
| CLB Flip-Flops (K)                          | 788                           | 1,201        | 1,576        | 2,364        | 2,592        | 3,456        | 8,172      |
| CLB LUTs (K)                                | 394                           | 601          | 788          | 1,182        | 1,296        | 1,728        | 4,086      |
| Max. Dist. RAM (Mb)                         | 12.0                          | 18.3         | 24.1         | 36.1         | 36.2         | 48.3         | 58.4       |
| Total Block RAM (Mb)                        | 25.3                          | 36.0         | 50.6         | 75.9         | 70.9         | 94.5         | 75.9       |
| UltraRAM (Mb)                               | 90.0                          | 132.2        | 180.0        | 270.0        | 270.0        | 360.0        | 90.0       |
| DSP Slices                                  | 2,280                         | 3,474        | 4,560        | 6,840        | 9,216        | 12,288       | 3,840      |
| Peak INT8 DSP (TOP/s)                       | 7.1                           | 10.8         | 14.2         | 21.3         | 28.7         | 38.3         | 10.4       |
| PCIe<sup>®</sup> Gen3 x16                   | 2                             | 4            | 4            | 6            | 3            | 4            | 0          |
| PCIe Gen3 x16/Gen4 x8 / CCIX <sup>(1)</sup> | -                             | -            | -            | -            | -            | -            | 8          |
| 150G Interlaken                             | 3                             | 4            | 6            | 9            | 6            | 8            | 0          |
| 100G Ethernet w/ KR4 RS-FEC                 | 3                             | 4            | 6            | 9            | 9            | 12           | 0          |
| Max. Single-Ended HP I/Os                   | 520                           | 832          | 832          | 832          | 624          | 832          | 1,976      |
| Max. Single-Ended HD I/Os                   | 0                             | 0            | 0            | 0            | 0            | 0            | 96         |
| GTY 32.75Gb/s Transceivers                  | 40                            | 80           | 80           | 120          | 96           | 128          | 80         |
| GTM 58Gb/s PAM4 Transceivers                | -                             | -            | -            | -            | -            | -            | -          |
| 100G / 50G KP4 FEC                          | -                             | -            | -            | -            | -            | -            | -          |
| Extended <sup>(2)</sup>                     | -1 -2 -2L -3                  | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2 -2L -3 | -1 -2      |
| Industrial                                  | -1 -2                         | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -1 -2        | -          |

Slide callout on the Ethernet, I/O and transceiver rows: External connectivity (Ethernet, GPIO)

## But how do you configure an FPGA?

```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;

wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;
always @(posedge clk)
begin
    r_out1 <= out1;
    r_out2 <= out2;
end
```

`input A, B, clk;` defines the circuit inputs



```
input A, B, clk;
output r_out1, r_out2;    // define the circuit outputs
```



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;       // define output registers (flip-flops)
```



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;
wire out1, out2;
```



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;
wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;
```



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;
```

```
wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;
```

`always @(clk)`



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;
```

```
wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;
```

`always @(posedge clk)` specifies the clock type (rising edge)

#### **Flip-Flop Operation**



```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;

wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;
always @(posedge clk)
begin
    // Connect combinational logic to sequential logic
    r_out1 <= out1;
    r_out2 <= out2;
end
```
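For readers more at home in software, the behaviour of the circuit built up above can be approximated with a small Python model. This is only a sketch: the `Registers` class, the `tick` method and the power-on value of 0 are assumptions made for illustration, and a real simulator models time and signal propagation in far more detail.

```python
def combinational(a, b):
    """Model the two `assign` statements on single-bit inputs."""
    out1 = 1 - (a & b)   # ~(A & B) on a single bit is a NAND
    out2 = out1 & b
    return out1, out2


class Registers:
    """Two flip-flops that capture their inputs only on a rising clock edge."""

    def __init__(self):
        self.r_out1 = 0      # assumed power-on value, for illustration only
        self.r_out2 = 0
        self._prev_clk = 0

    def tick(self, clk, a, b):
        out1, out2 = combinational(a, b)          # always "live", like wires
        if self._prev_clk == 0 and clk == 1:      # posedge clk
            self.r_out1, self.r_out2 = out1, out2
        self._prev_clk = clk


regs = Registers()
regs.tick(clk=1, a=1, b=0)                  # rising edge: capture (1, 0)
after_first_edge = (regs.r_out1, regs.r_out2)
regs.tick(clk=1, a=0, b=1)                  # clock still high: no capture
held = (regs.r_out1, regs.r_out2)
regs.tick(clk=0, a=0, b=1)                  # falling edge: no capture
regs.tick(clk=1, a=0, b=1)                  # rising edge: capture (1, 1)
after_second_edge = (regs.r_out1, regs.r_out2)
```

The point of the model is that the combinational values change whenever the inputs change, while the registered outputs only move on a rising edge of `clk`.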

© 2020 Cisco and/or its affiliates. All rights reserved. Cisco Confidential



A Program: "A sequence of instructions given to a CPU to get a result"

RTL (Register Transfer Level): "A description of the logical circuit that transfers data between registers"

## Synthesis: "Convert HDL code (Verilog) into circuit description (netlist)"
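To make the idea of a netlist a little more concrete, the sketch below writes down one possible netlist for the example circuit as plain Python data. Everything about the representation is an assumption for illustration: the cell names, the `"LUT2"` and `"FF"` type labels and the dictionary layout are hypothetical and do not correspond to any vendor's netlist format. It also assumes the synthesizer flattens `out2` into a single two-input lookup table, which a real tool may or may not do.

```python
from itertools import product


def lut_init(func, n_inputs):
    """Build a LUT truth table by enumerating every input combination."""
    return {bits: func(*bits) for bits in product((0, 1), repeat=n_inputs)}


# Hypothetical netlist: cell name -> (cell type, input nets, output net, config)
netlist = {
    "lut_out1": ("LUT2", ("A", "B"), "out1",
                 lut_init(lambda a, b: 1 - (a & b), 2)),
    "lut_out2": ("LUT2", ("A", "B"), "out2",
                 lut_init(lambda a, b: (1 - (a & b)) & b, 2)),
    "ff_out1": ("FF", ("out1", "clk"), "r_out1", None),
    "ff_out2": ("FF", ("out2", "clk"), "r_out2", None),
}

out1_table = netlist["lut_out1"][3]
out2_table = netlist["lut_out2"][3]
```

In this picture the Verilog expressions have disappeared: each LUT is just a four-entry table, and the netlist records which cells exist and which nets connect them.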

### Success!

## Not quite!





## **Placement and Routing:** "Place logic elements onto the array and route connections between them"










## Place and Route is NP-Hard

With just 9 LUTs, there are 36 different ways to lay out the same circuit. Real FPGAs can have 2M+ LUTs!



Placement uses random allocations and successive refinement to try to find good placement solutions
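A toy version of "random allocation plus successive refinement" might look like the following. It is a deliberately simplified greedy swap search, not the algorithm any particular vendor tool uses; the 3x3 grid, the chain of nets, the `wirelength` cost and all function names are assumptions chosen for illustration.

```python
import random


def wirelength(placement, nets):
    """Total Manhattan distance over all two-pin nets."""
    total = 0
    for src, dst in nets:
        (x1, y1), (x2, y2) = placement[src], placement[dst]
        total += abs(x1 - x2) + abs(y1 - y2)
    return total


def place(cells, nets, sites, iterations=2000, seed=0):
    rng = random.Random(seed)
    slots = list(sites)
    rng.shuffle(slots)                        # random initial allocation
    placement = dict(zip(cells, slots))
    cost = wirelength(placement, nets)
    initial_cost = cost
    for _ in range(iterations):               # successive refinement
        a, b = rng.sample(cells, 2)
        placement[a], placement[b] = placement[b], placement[a]
        new_cost = wirelength(placement, nets)
        if new_cost <= cost:
            cost = new_cost                   # keep the swap
        else:
            placement[a], placement[b] = placement[b], placement[a]  # undo
    return placement, initial_cost, cost


cells = [f"lut{i}" for i in range(9)]
nets = [(f"lut{i}", f"lut{i + 1}") for i in range(8)]   # a simple chain
sites = [(x, y) for x in range(3) for y in range(3)]     # 3x3 grid of sites
placement, initial_cost, final_cost = place(cells, nets, sites)
```

Changing `seed` changes the random starting point, so the search can settle on a different final layout for exactly the same circuit.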



The same HDL code can create totally different placements.

# Successful placement does not guarantee a successful <u>result</u>







# Timing Closure "meeting timing"

"The delay through logic and routing must not exceed the clock period. If it does, there is too much logic squeezed between flip-flops, or routing paths across the chip are too long (or both)."
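The timing rule can be written as a one-line check: slack is the clock period minus the total delay of a path between two flip-flops, and a negative slack means timing is not met. The sketch below is a simplification that ignores setup time, clock skew and other real-world terms; the function name and all delay values are illustrative assumptions, not figures from any datasheet.

```python
def slack_ns(clock_mhz, logic_delays_ns, routing_delays_ns):
    """Clock period minus total path delay; negative means timing fails."""
    period_ns = 1000.0 / clock_mhz
    path_delay_ns = sum(logic_delays_ns) + sum(routing_delays_ns)
    return period_ns - path_delay_ns


# Illustrative numbers only.
short_path = slack_ns(250, logic_delays_ns=[0.5, 0.5], routing_delays_ns=[1.0])
long_path = slack_ns(250, logic_delays_ns=[0.5] * 6, routing_delays_ns=[0.5] * 4)
meets_timing = short_path >= 0 and long_path >= 0
```

Here the second path has too many logic levels and routing hops for a 4 ns period, so the design as a whole does not close timing until that path is shortened or the clock is slowed down.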

### Success!

#### Program

int a = 1; int b = 0; int out\_1; int out\_2; out\_1 = ~(a & b); out\_2 = out\_1 & b;

#### Compiler

#### Machine Code

a: .long 1
b: .zero 4
out\_1: .zero 4
out\_2: .zero 4
mov eax, DWORD PTR a[rip]
mov edx, eax
and edx, DWORD PTR b[rip]
not edx
mov DWORD PTR out\_1[rip], edx
not eax
and eax, DWORD PTR b[rip]
mov DWORD PTR out\_2[rip], eax

#### CPU/GPU

#### **Software programming:** Executing a sequence of predefined instructions
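To illustrate what "executing a sequence of predefined instructions" means, here is a minimal sketch of an interpreter that steps through the instruction listing from the slides one instruction at a time. It is an assumption-laden toy: the tuple encoding of instructions, the `run` function and the dictionary standing in for memory are all invented for illustration, and only the three opcodes used in the listing are supported.

```python
MASK = 0xFFFFFFFF  # 32-bit registers, matching eax/edx


def run(program, memory):
    """Execute instructions one after another, as a CPU steps through code."""
    regs = {"eax": 0, "edx": 0}

    def read(operand):
        return regs[operand] if operand in regs else memory[operand]

    def write(operand, value):
        target = regs if operand in regs else memory
        target[operand] = value & MASK

    for op, dst, src in program:
        if op == "mov":
            write(dst, read(src))
        elif op == "and":
            write(dst, read(dst) & read(src))
        elif op == "not":
            write(dst, ~read(dst))
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


program = [
    ("mov", "eax", "a"),
    ("mov", "edx", "eax"),
    ("and", "edx", "b"),
    ("not", "edx", None),
    ("mov", "out_1", "edx"),
    ("not", "eax", None),
    ("and", "eax", "b"),
    ("mov", "out_2", "eax"),
]
memory = run(program, {"a": 1, "b": 0, "out_1": 0, "out_2": 0})
```

With `a = 1` and `b = 0`, `out_1` ends up as `0xFFFFFFFF` (all bits set, which is -1 as a signed 32-bit `int`) and `out_2` as 0. The fixed hardware runs whatever instructions it is handed, in order, which is the contrast with an FPGA where the circuit itself is what gets configured.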



#### RTL

```
input A, B, clk;
output r_out1, r_out2;
reg r_out1, r_out2;

wire out1, out2;
assign out1 = ~(A & B);
assign out2 = out1 & B;

always @(posedge clk)
begin
    r_out1 <= out1;
    r_out2 <= out2;
end
```

#### Synth.

#### Netlist

#### Place & Route

#### Timing Closure

#### Bitstream










Program

Machine Code

CPU/GPU

#### **Software programming:** Executing a sequence of predefined instructions



### One more thing...

Integrated FPGA development environment optimized for low latency applications on Nexus SmartNIC and Nexus 3550-F switches.

#### 1. FDK-PE Performance Edition

Optimized for Ultra Low latency applications (e.g. HFT)

#### 2. FDK-EV Evaluation Edition

Freely available ULL optimized (2hr operation limit)

#### 3. FDK-FE Free Edition

Optimized for network acceleration. Freely available.

# Tick the box or speak to our sales team to find out more!
