Getting better density and performance for complex, frequently used blocks.
We work with a lot of customers designing eFPGA into their SoCs. Most of them have “random logic” RTL, but some customers have large numbers of complex, frequently used blocks.
We have found in many cases that we can help the customer achieve higher throughput AND use less silicon area with Soft Macros.
Let’s look at an example: a 64×64 Multiply-Accumulate (MAC).
If you write this in Verilog and run it through the synthesis tool, the tool will do a reasonable job: for 16nm EFLX eFPGA, the 64-bit MAC is generated using 20 DSP blocks (22×22 MACs) and 110 RBBs (blocks of 4 LUT4s each), with a worst-case clock period of 18ns.
Instead, we can work with a customer to define a Soft Macro for their particular frequently used complex logic block, one that makes better use of the available resources.
See below for an implementation of a 64×64 MAC optimized by our Solutions Architects, based on our understanding of our eFPGA architecture and how best to map the 64-bit MAC onto it:
This Soft Macro 64×64 MAC achieves a 16ns period using 9 DSPs and 32 RBBs or, optionally, a 15ns period using 12 DSPs and no RBBs.
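The DSP count of nine has a natural arithmetic reading: a 64-bit operand splits into three limbs of at most 22 bits, and three limbs times three limbs gives nine 22×22 partial products. The Python sketch below models that decomposition in software. It is an illustration only, not the actual Soft Macro mapping: it assumes unsigned operands, and the names `split_limbs` and `mac_64x64` are hypothetical.

```python
LIMB_BITS = 22      # DSP multiplier width quoted in the post (22x22 MACs)
OPERAND_BITS = 64   # width of each multiplier input


def split_limbs(x, limb_bits=LIMB_BITS, operand_bits=OPERAND_BITS):
    """Split an unsigned operand into little-endian limbs of limb_bits each."""
    n_limbs = -(-operand_bits // limb_bits)  # ceiling division: 64 / 22 -> 3
    mask = (1 << limb_bits) - 1
    return [(x >> (i * limb_bits)) & mask for i in range(n_limbs)]


def mac_64x64(acc, a, b):
    """Return (acc + a*b, number of 22x22 partial products used).

    Assumes a and b are unsigned and fit in OPERAND_BITS bits.
    """
    a_limbs = split_limbs(a)
    b_limbs = split_limbs(b)
    total = acc
    products = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            # Each limb product is weighted by its combined limb position.
            total += (ai * bj) << ((i + j) * LIMB_BITS)
            products += 1
    return total, products
```

For any unsigned 64-bit `a` and `b`, `mac_64x64(acc, a, b)` returns the same value as `acc + a * b`, along with a partial-product count of 9.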
The Soft Macro delivers a speedup of 10-20% with about half the resources!
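To make the comparison concrete, the figures quoted above can be plugged into a short calculation. This is a minimal sketch using only the numbers in this post; the dictionary layout and the `compare` helper are illustrative.

```python
# Figures quoted in the post for a 64x64 MAC on 16nm EFLX eFPGA.
BASELINE = {"period_ns": 18, "dsp": 20, "rbb": 110}  # synthesized from Verilog
SOFT_MACROS = {
    "9 DSPs + 32 RBBs": {"period_ns": 16, "dsp": 9, "rbb": 32},
    "12 DSPs, no RBBs": {"period_ns": 15, "dsp": 12, "rbb": 0},
}


def compare(baseline, variant):
    """Return timing gains and resource ratios of variant relative to baseline."""
    return {
        # Clock-rate gain: how much faster the design can be clocked.
        "freq_gain_pct": (baseline["period_ns"] / variant["period_ns"] - 1) * 100,
        # Period reduction: how much shorter the worst-case period is.
        "period_cut_pct": (1 - variant["period_ns"] / baseline["period_ns"]) * 100,
        "dsp_ratio": variant["dsp"] / baseline["dsp"],
        "rbb_ratio": variant["rbb"] / baseline["rbb"],
    }


RESULTS = {name: compare(BASELINE, v) for name, v in SOFT_MACROS.items()}

if __name__ == "__main__":
    for name, r in RESULTS.items():
        print(
            f"{name}: +{r['freq_gain_pct']:.1f}% clock rate, "
            f"-{r['period_cut_pct']:.1f}% period, "
            f"{r['dsp_ratio']:.0%} of DSPs, {r['rbb_ratio']:.0%} of RBBs"
        )
```

With these inputs, the clock rate improves by 12.5% and 20%, or equivalently the period shrinks by about 11% and 17%. The two variants use 45% and 60% of the baseline DSPs, and about 29% and 0% of the baseline RBBs.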
This Soft Macro is easily instantiated in your Verilog code. The EFLX Compiler recognizes it and maps it properly, instead of leaving it to be synthesized by the synthesis tool.
Other algorithms have frequently used complex blocks: encryption/decryption, communications algorithms, blockchain and more. Soft Macros can improve density and performance for all of them.