The Case For Combining CPUs With FPGA Fabrics

How to continue driving performance gains as the benefits of scaling fall off.


The industry is beginning to reach the limits of what can physically and economically be achieved by further shrinking process geometries, so reducing feature sizes and increasing transistor counts no longer deliver the gains they once did. Instead, the industry is, quite rightly, focusing on fundamentally new system architectures and on making better use of available silicon by radically rethinking how tasks are carried out within each device. As we enter this new era, integrating FPGA fabric with CPUs in the form of embedded FPGAs (eFPGAs) becomes an attractive solution.

Although both FPGAs and CPUs use a mix of memory and logic to hold and process data and instructions, there is an important fundamental difference between the two. A CPU is optimized for rapid context switching, whereas an FPGA is slower to configure but can emulate digital logic at speeds approaching those of hard-wired circuits. Therefore, just as the CPU excels at performing varied tasks, the FPGA excels at performing repetitive (and particularly highly parallelized) tasks that run for thousands of cycles and are only occasionally redefined.
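
The trade-off above can be made concrete with a rough break-even calculation. The sketch below is only an illustration: the function name and all timing figures are hypothetical, and it assumes a fixed one-time configuration cost and constant per-iteration times, which real systems will not match exactly.

```python
def break_even_iterations(config_ns, cpu_ns_per_iter, fpga_ns_per_iter):
    """Smallest iteration count at which offloading to the FPGA is faster.

    All arguments are integer nanoseconds (hypothetical figures).
    The FPGA wins when: config_ns + n * fpga_ns_per_iter < n * cpu_ns_per_iter.
    Returns None if the FPGA is not faster per iteration.
    """
    saving = cpu_ns_per_iter - fpga_ns_per_iter
    if saving <= 0:
        return None  # no per-iteration gain, so configuration never pays off
    # Smallest integer n strictly greater than config_ns / saving.
    return config_ns // saving + 1


# Assumed numbers: 10 ms to configure, 1000 ns per iteration on the CPU,
# 100 ns per iteration in FPGA logic.
print(break_even_iterations(10_000_000, 1000, 100))
```

Under these assumed numbers the configuration cost is recovered only after roughly eleven thousand iterations, which is why the FPGA suits work that repeats many times between reconfigurations.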

Market Indicators
There is clear evidence that the combination of CPUs and FPGA technology can drive real value through closer integration. The first example is Intel’s $16.7Bn acquisition of Altera, which is being used to accelerate data center functionality. The second is Microsoft’s Catapult program, which suggests that data center server compute capabilities can be doubled by integrating FPGAs into each server to accelerate Bing Search, Azure and Microsoft 365. These examples show that the industry has begun to recognize the advantages of heterogeneous architectures that combine CPUs and FPGAs. Inevitably, these heterogeneous architectures will move onto the same device, with FPGA fabrics integrated into the ASIC as IP blocks.

Moving into the eFPGA Era
Achronix is now a major catalyst in this area, having already introduced embedded FPGA IP derived from its earlier family of standalone FPGAs, which are known for their high performance and sophisticated routing architecture. The Speedcore™ eFPGA IP demonstrates many of the possible advantages of integrating FPGA fabric into CPUs/SoCs. The figure below shows how Speedcore IP can be integrated into an SoC subsystem.

Fig. 1: Speedcore eFPGA in an SoC subsystem

Because the eFPGA is on the same device, there’s no need for signals to go through SerDes and protocol encoding for PCIe, for example. As a result, latencies are an order of magnitude lower. Furthermore, SoCs with an eFPGA fabric are capable of higher performance than those with a standalone FPGA because of the higher bandwidth of on-chip interconnects.
Integrating FPGA fabric has also been shown to reduce power considerably. These power savings extend to the system level as well, because integrating the FPGA fabric on-chip removes some of the supporting components, such as clock generators and passives.

Unlike discrete FPGAs, which come in fixed ranges of size and performance, the mix of logic gates and memory in an eFPGA IP block can be defined by the customer, resulting in just the right amount of FPGA fabric to accelerate the appropriate functions. This capability ensures that the optimum CPU-size-to-FPGA-resource ratio is achieved within the SoC, optimizing silicon area, power consumption and cost.

Overall, there is little doubt that eFPGAs are going to represent a major architectural trend over the next couple of years. The advantages are simply too compelling.
