Speeding Up High-Frequency Trading

How to slash latency and improve performance to win over the competition.


The High-Frequency Trading (HFT) industry has received a lot of attention over the last few years. HFT is all about speed and minimizing latency: the faster you can run the trading strategies and algorithms that analyze minute price changes and execute trade orders, the higher the probability of beating the competition. Competition in this area is therefore fierce, with market players continuously investing in more powerful solutions that can trade in nanoseconds, hence the race to zero latency. The HFT firms that keep pace with technological innovation and back it with engineering resources will have a competitive edge.

To get network latency low enough to be profitable, HFT firms spend large amounts of money on faster hardware and on computing facilities located closer to stock exchanges. When it comes to faster hardware, the solution is often to offload compute-intensive portions of trading functions to GPUs, FPGAs, or custom processors. Because of their architecture, FPGAs can significantly speed up the processing of algorithmic models and the transmission of data to the stock exchanges. For general-purpose work, these chips are probably inferior to standard processors. But when it comes to the parallel implementation of simple, repetitive tasks, FPGAs perform better than CPUs. This is especially true when software tasks such as TCP/IP processing are offloaded to dedicated network cards, where an FPGA processes the data coming in from an Ethernet port.

I had known of this HFT world for a while, but got to see it up close in May 2017 when I went to The Trading Show in Chicago, where lots of vendors come to show their tools and services for improving latency. I always thought that FPGAs and high-performance network devices were all it took, but there is a wide array of services and systems out there that help with every single step of the process. For example, there are dedicated fiber optic networks that run close to the stock exchanges. There are also microwave transmitters and receivers which, as the person selling them claimed, are faster than fiber optic networks because the signal travels through air rather than glass, and because it does not have to zig-zag the way it would inside a fiber optic cable.
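To put the vendor's claim in rough numbers, here is a minimal back-of-the-envelope sketch in Python. It is only an illustration under stated assumptions: the 1,000 km distance, the refractive indices (about 1.47 for glass fiber, about 1.0003 for air), and the 20% extra path length for fiber are values I picked for the example, not figures from the show or from any vendor.

```python
# Back-of-the-envelope comparison of one-way propagation delay:
# light in glass fiber vs. a microwave signal in air.
# All link parameters below are illustrative assumptions.

C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (SI definition)


def one_way_delay_us(path_km: float, refractive_index: float) -> float:
    """Return the propagation delay in microseconds over path_km
    for a signal travelling through a medium with the given index."""
    speed_m_per_s = C_VACUUM_M_PER_S / refractive_index
    return path_km * 1_000.0 / speed_m_per_s * 1e6


STRAIGHT_LINE_KM = 1_000.0   # hypothetical distance between two sites
FIBER_INDEX = 1.47           # assumed typical value for glass fiber
AIR_INDEX = 1.0003           # assumed value for air
FIBER_PATH_FACTOR = 1.2      # hypothetical: fiber path 20% longer than line of sight

fiber_us = one_way_delay_us(STRAIGHT_LINE_KM * FIBER_PATH_FACTOR, FIBER_INDEX)
microwave_us = one_way_delay_us(STRAIGHT_LINE_KM, AIR_INDEX)

print(f"fiber:     {fiber_us:8.1f} us one way")
print(f"microwave: {microwave_us:8.1f} us one way")
print(f"advantage: {fiber_us - microwave_us:8.1f} us one way")
```

Under these assumptions the microwave link comes out ahead by a couple of milliseconds each way, which is enormous next to the nanosecond-level gains firms chase elsewhere in the stack. The sketch ignores equipment delay, repeaters, and weather effects, so treat it as an order-of-magnitude estimate only.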

Developing FPGA designs for HFT is not a trivial task. An advanced RTL simulator that offers faster compilation and simulation times, along with advanced debugging features, can be key to ensuring highly reliable FPGA designs for HFT.

At The Trading Show 2018, we showed off the new HES-XCVU9P-QDR UltraScale+ board running the Tamba Networks 10G Ethernet MAC IP Core. This board has been designed to deliver a proven Ultra-Low Latency (ULL) Ethernet solution that provides ~56 ns Ethernet latency using QSFP28 channels. This solution is an FPGA implementation of a highly optimized FIFO + MAC + PCS running on the latest Xilinx UltraScale+ architecture. The HES-XCVU9P-QDR board contains a Xilinx Virtex UltraScale+ XCVU9P FPGA with 2x 100 Gb/s QSFP28 cages. The reconfigurable FPGA, combined with QDR-II+ or DDR4 memory modules, provides high throughput for algorithm acceleration and data processing over the PCIe interface. The PCIe x16 half-length, low-profile board is 1U compatible and easily fits into enterprise rack systems for maximum performance density.
