PCIe in High-Performance FPGAs

The value of high performance computing applications depends upon how fast data can be transferred.


In today’s world, when the entire computing industry is talking about high-performance and high-speed applications using FPGAs, what are the factors that can assure such performance and speed? The value and success of today’s high performance computing applications in the areas of DNA Sequencing, High Frequency Trading (HFT) and Encryption/Decryption are predicated upon how fast data can be transferred from device to device.

In this blog, I am going to discuss the most popular data transfer protocol, PCIe. Why do most companies depend upon it to get the best data throughput? How does it differ from PCI? How is it used in an FPGA? What is the future of PCIe? Let’s find out!

The PCI bus was developed for parallel data transfer applications. Over time, however, the demand for more devices and I/Os grew, the shared PCI bus became a bottleneck, and something faster with a different bus topology was required.

Enter PCIe. It’s a complete overhaul of the previous PCI bus protocol, and it differs from PCI in several ways. First, it uses a switch-based, point-to-point architecture, while PCI uses a multi-drop bus structure. Another major difference is that PCIe is packet-based: one device sends data packets to another. Take a look at Figure 1 shown below – a packet comprises a Header and a Payload. The Header carries control information, while the Payload carries the data to be transferred from one device to another; this is similar to networking concepts. A major disadvantage of PCI was the high chance of data inaccuracies on the shared parallel bus, where effects such as bus capacitance and crosstalk between lines could leave stale data on the wires. PCIe gets rid of that issue with its switch architecture and dedicated point-to-point links. PCIe also includes a CRC error-detecting code in each packet: the receiving device checks that the code is correct in order to verify the integrity of the data received. PCIe is also hot-pluggable – meaning devices can be added to or removed from the system while it is running.


Figure 1: PCIe Top Level Structure
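As a rough illustration of the Header/Payload split described above, here is a minimal Python sketch. The field names and sizes are simplified for illustration only – they do not match the actual PCIe TLP header layout defined in the specification:

```python
from dataclasses import dataclass

# Illustrative only: real PCIe TLP headers are packed bit fields
# with many more attributes (format, type, traffic class, etc.).
@dataclass
class Header:
    packet_type: str   # e.g. "MemRd" or "MemWr" (simplified)
    address: int       # target address
    length: int        # payload length in bytes

@dataclass
class Packet:
    header: Header
    payload: bytes

# A hypothetical 4-byte memory write to address 0x1000
pkt = Packet(Header("MemWr", 0x1000, 4), b"\xde\xad\xbe\xef")
assert pkt.header.length == len(pkt.payload)
```

The point of the sketch is simply that control information travels alongside the data in every packet, so a switch can route it and the receiver can interpret it without any shared-bus signaling.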

Now, let’s understand the multiple layers of PCIe. PCIe is known as a protocol because it defines a set of rules to be followed by the sending and receiving parties. PCIe is made up of four layers: the Physical Layer, the Link Layer, the Transaction Layer, and the Software Layer.

The Physical Layer is architecturally the lowest layer. It moves data bits from sender to receiver over one or more lanes, each a dual-simplex (full-duplex) pair, and the number of lanes is the same in each direction. Basically, the Physical Layer deals with bit-to-bit transmission.
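A quick back-of-envelope calculation shows what the lane structure buys you. The sketch below accounts only for line-code overhead (8b/10b for Gen 1/2, 128b/130b for Gen 3) and ignores packet and protocol overhead, so real throughput is somewhat lower:

```python
def lane_bandwidth_mbps(gt_per_s, encode_num, encode_den):
    """Effective per-lane bandwidth in MB/s, counting only
    the line-code overhead of the Physical Layer."""
    return gt_per_s * 1e9 * encode_num / encode_den / 8 / 1e6

# PCIe Gen 3: 8 GT/s per lane with 128b/130b encoding
gen3_lane = lane_bandwidth_mbps(8, 128, 130)
print(round(gen3_lane, 1))              # ~984.6 MB/s per lane
print(round(gen3_lane * 8 / 1000, 2))   # x8 link: ~7.88 GB/s
```

Scaling bandwidth by adding lanes (x1, x4, x8, x16) rather than widening a shared bus is exactly what lets PCIe avoid the signal-integrity problems of PCI.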

The Link Layer deals with packet transmissions. The Header and Payload handed down by the Transaction Layer are framed with a sequence number and an error-detecting code known as a CRC, which is computed from the packet contents using a fixed polynomial. When the packet is received, the receiver performs the same computation on the Header and the Payload and compares the result with the CRC it received: the data received must agree with the data sent, otherwise a retransmission is requested. The Transaction Layer, one level up, handles bus transactions such as memory reads and writes.
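The frame-and-check flow can be sketched in a few lines of Python. Here `zlib.crc32` stands in for the real PCIe LCRC-32 (the polynomial is the same as Ethernet’s CRC-32, but the exact bit ordering and initial value in the spec differ), and the framing layout is simplified:

```python
import zlib

def attach_crc(seq, header, payload):
    """Sender side: frame the packet with a sequence number and a CRC.
    zlib.crc32 is a stand-in for the actual PCIe LCRC-32."""
    body = seq.to_bytes(2, "big") + header + payload
    return body + zlib.crc32(body).to_bytes(4, "big")

def check_crc(framed):
    """Receiver side: recompute the CRC over everything but the
    trailing 4 bytes and compare with the CRC that was received."""
    body, crc = framed[:-4], framed[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == crc

frame = attach_crc(1, b"\x40\x00\x00\x01", b"\xca\xfe")
assert check_crc(frame)                  # intact frame passes
# Flip one payload byte: the mismatch is detected on receipt
bad = frame[:6] + bytes([frame[6] ^ 0xFF]) + frame[7:]
assert not check_crc(bad)
```

On a real link, a failed check causes the receiver to request a replay using the sequence number, which is why the Link Layer keeps unacknowledged packets in a retry buffer.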

The Software Layer interfaces the PCIe system architecture to the host operating system. It also provides the backward compatibility that lets existing PCI software run unchanged on different systems.

PCIe is now quite common in FPGA boards for various high-performance computing applications. In data centers, for example, it is important to maximize performance while minimizing power consumption. Today, FPGA-based acceleration platforms include PCIe-based programmable acceleration cards, such as our HES-XCVU9P-QDR for HFT applications.


Figure 2: HES-XCVU9P-QDR HFT Board

In order to achieve the highest performance and throughput, PCIe IP blocks are generally developed in RTL. Since PCIe is extensible, IP vendors can adapt the PCIe system architecture and optimize it for an application’s specific needs. Several IP vendors develop PCIe IP blocks targeting maximum throughput, including our partner Northwest Logic – together we delivered an FPGA board based on UltraScale+ with a PCIe Gen 3 solution and high-performance scatter-gather DMA support.

Now, debugging the hardware on an FPGA with an oscilloscope takes a lot of time, so the best way to debug your design is to use a high-performance simulator with advanced debugging tools such as Advanced Dataflow, Xtrace (to track down unknown values), and a Waveform Viewer. Check out Aldec’s high-performance advanced verification simulator, Riviera-PRO.

While PCIe applications are quite common thanks to its scalability and flexibility, is PCIe replaceable? Yes, it is! Game developers, for example, are always pushing to make games and applications look more and more realistic. That can be achieved only by passing more data from their applications to the VR headset – which requires faster data interfaces.

Although PCIe offers great speed, several other data transfer protocols and standards are being considered as well. Interface standards such as RapidIO, HyperTransport, and the Mobile Industry Processor Interface (MIPI) are already on the market, and wide adoption of these standards would speed up innovation in the computing industry. These new standards will, of course, require new hardware or even ASICs that are yet to be developed. Hence, PCIe will lead for some time to come.


