Fully Reconfigurable DSP: As Fast As Hardwired At ~2x Area/Power

Adapt DSP to changing models at a lower cost than using FPGA.


Today, if you want high-performance DSP, you have three choices:

  1. Hardwire your function – zero flexibility
  2. Use DSP IP based on VLIW
  3. Use FPGAs with DSP MACs or math engines

What we hear from customers is that there is a growing need for very fast and very flexible DSP, which hardwired solutions can’t address.

Customers also tell us that the fastest solutions are FPGAs, but these are big, power-hungry, and costly: customers use only the MAC/math portion of the FPGA yet pay for the power and cost of the unused logic.

VLIW-based DSP IP is not as fast as FPGA. It costs less and uses less power, but it still carries a large area/power premium compared to hardwired.

What customers want is fully reconfigurable DSP IP that is as close as possible to the area and power of a hardwired design.

InferX DSP: 100% programmable, 80% hardwired

Fig. 1: InferX compute tile.

InferX is 80% hardwired: almost all of the datapath is hardwired. But it is 100% reconfigurable because the 16 tensor processors are in a reconfigurable interconnect, allowing the arrangement of tensor processors to be optimized for each function, and eFPGA is used as the control plane to manage operation at high speed with fine granularity. eFPGA can also be used to implement new operators, which pop up all the time as models continue to evolve. Unlike fully hardwired solutions, having eFPGA means you can always adapt to changing models.

Customers tell us that InferX is not much bigger than a fully hardwired DSP circuit that only does one task, but InferX can be reconfigured on the fly to do any DSP function. Reconfiguration happens in microseconds.

InferX also executes DSP operations at very high throughput, as shown in the following table, which uses 5nm as an example and lists throughput for 1 tile and 8 tiles. (We can deliver arrays, with an AXI bus interface, for any rectangular arrangement of tiles.)

Table 1: InferX DSP throughput, TSMC N5 – note performance increases linearly with more tiles.

InferX DSP software

We have a team that has been writing high-speed “Softlogic” (Verilog/RTL code) for years. This team writes the Softlogic that executes the DSP operations on InferX.

You program InferX at a higher level, using Simulink. More than one DSP operation can run at once, and operations can feed data from one to the next in streaming mode with no buffering.
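As a rough software analogy for operations feeding one another in streaming mode, the sketch below chains two stages with Python generators so that each sample flows through the whole chain without an intermediate array between stages. It is not the InferX or Simulink flow: the stage names (`source`, `fir`, `magnitude_squared`) and the tap values are assumptions made up for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def source(n: int) -> Iterator[float]:
    """Hypothetical sample source: a repeating 0, 1, 2, 3 ramp."""
    for i in range(n):
        yield float(i % 4)


def fir(samples: Iterable[float], taps: Sequence[float]) -> Iterator[float]:
    """Streaming FIR filter: emits one output per input sample."""
    # The delay line is the filter's own state, not a buffer between stages.
    history = deque([0.0] * len(taps), maxlen=len(taps))
    for x in samples:
        history.appendleft(x)  # newest sample first, oldest one drops off
        yield sum(t * h for t, h in zip(taps, history))


def magnitude_squared(samples: Iterable[float]) -> Iterator[float]:
    """Second stage: consumes each filtered sample as soon as it exists."""
    for x in samples:
        yield x * x


# Chain the stages; nothing is computed until the consumer pulls samples.
pipeline = magnitude_squared(fir(source(8), taps=[0.5, 0.5]))
result = list(pipeline)
```

Each stage pulls one sample at a time from the stage before it, which is the property this analogy is meant to show. Hardware streaming runs the stages concurrently rather than on demand.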

Here is a high-level overview of coding a Complex 4K FFT:

Fig. 2: Coding a Complex 4K FFT.

For a dynamic system that changes its operational mode over time, you can provide multiple Simulink models and switch between them on the fly. This would enable you to run 1K FFTs over a few blocks, and then switch to 4K FFTs for a few blocks, while only being offline for a few microseconds during the transition. This capability provides significantly more options for meeting the needs of a given application.
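The mode switch described above can be pictured with a small software model: a schedule of (FFT size, number of blocks) pairs applied to one sample stream. This is only a sketch under stated assumptions, not how InferX or the Simulink flow is implemented. The function names `fft` and `run_schedule` are hypothetical, and the FFT is a plain recursive radix-2 routine chosen for brevity.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def run_schedule(samples, schedule):
    """Apply each (fft_size, num_blocks) entry to consecutive blocks."""
    pos = 0
    results = []
    for size, count in schedule:
        for _ in range(count):
            block = samples[pos:pos + size]
            if len(block) < size:
                raise ValueError("not enough samples for schedule")
            results.append((size, fft(block)))
            pos += size
    return results


# Constant input, so every block's energy lands in bin 0.
samples = [1 + 0j] * (2 * 1024 + 2 * 4096)
# Two blocks of 1K FFT, then switch to two blocks of 4K FFT.
results = run_schedule(samples, [(1024, 2), (4096, 2)])
sizes = [size for size, _ in results]
```

In this model the switch is just the next schedule entry. On the hardware, the switch corresponds to the reconfiguration step that takes the system offline for a few microseconds.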

In short, InferX DSP gives you very high DSP performance that you can reconfigure at any time as needs, standards, and algorithms evolve, at much lower power and cost than FPGA or existing DSP IP. More information is available under NDA. Contact us at [email protected].
