New Interconnect Makes eFPGA Dense And Portable

Innovation in reducing the number of metal layers required for eFPGAs.


FPGAs were invented over 30 years ago. Today they are much bigger and faster, but their basic architecture remains unchanged: logic blocks formed around LUTs (look-up tables) in a sea of mesh (x/y grid) interconnect, with a matrix of switches at every “intersection.”

One FPGA company executive once said they don’t really sell programmable logic, they sell programmable interconnect: 70-80% of an FPGA’s fabric is the traditional mesh that programmably routes signals between all of the logic blocks. And as the FPGA gets bigger, the amount of interconnect typically must grow to avoid routing congestion (complexity scales roughly with N²).

My Flex Logix co-founder, Cheng Wang, told me when we first met, “I know a way to make a better FPGA”: the interconnect he and his colleagues had devised could cut the FPGA fabric size almost in half, all other things (process choice, full-custom vs. standard-cell design) being equal. (We decided very early on that taking on giants was a bad idea, and instead focused on using the invention to develop a market for embedded FPGA integrated into SoCs.)

While at UCLA, he and others built five FPGA chips of increasing complexity, in the course of which he invented a better FPGA interconnect. He filed a patent based on his work at UCLA, of which Flex Logix is the exclusive licensee. Since joining Flex Logix he has further improved the interconnect, work that has been the subject of two patents issued to Flex Logix.

Cheng and others wrote a paper on their last UCLA FPGA chip, which they presented at ISSCC 2014. It went on to win the Outstanding Paper award at ISSCC 2015, shortly after we had started Flex Logix. This award is typically won by giant companies like Intel and Bosch, not university PhD students.

Not only does the patented interconnect cut area, it reduces the number of metal layers needed. This turns out to be very valuable because a given process node/variation often offers dozens of metal stacks: by using only 5, 6 or 7 metal layers, Flex Logix can design eFPGA IP that is compatible with almost all of them. If we used a large number of metal layers, as FPGA chip companies do, customers would have to adopt our metal stack, or we would have to re-route designs, which takes time and means doing surgery on the GDS.
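The compatibility argument above can be sketched in a few lines of Python. The stack names and layer counts below are purely illustrative (not real foundry offerings): an IP routed in k metal layers drops into any stack with at least k layers, so a low k keeps almost every stack on the table.

```python
# Illustrative only: hypothetical metal-stack options within one process node,
# mapped to their total metal-layer counts. These are made-up names/numbers.
metal_stacks = {"stack_A": 8, "stack_B": 10, "stack_C": 13, "stack_D": 5}

def compatible_stacks(ip_layers: int) -> list[str]:
    """Stacks that can host an eFPGA IP routed in `ip_layers` metal layers."""
    return [name for name, layers in metal_stacks.items() if layers >= ip_layers]

# An IP routed in 6 layers fits every stack except the 5-layer one;
# an IP needing 11 layers would rule out almost all of them.
print(compatible_stacks(6))
print(compatible_stacks(11))
```

This is a simplification (real compatibility also depends on layer pitches and via rules), but it captures why "fewer layers used" translates directly into "more stacks supported."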

What is the new interconnect? It is a Boundary-Less Radix Interconnect Network. At first glance it appears to be a hierarchical network, which has been tried before, but it incorporates numerous improvements that increase spatial locality, cutting area while maintaining performance.

And as it scales, complexity still grows, but roughly as N·log N, which is much slower than the traditional mesh’s N².
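To make the scaling difference concrete, here is a small sketch (our own illustration, not a figure from the paper) comparing the rough N² growth of a mesh against the rough N·log N growth of a hierarchical radix-style network as the number of logic blocks N increases:

```python
import math

def mesh_cost(n: int) -> float:
    """Rough interconnect complexity of a traditional mesh: ~N^2."""
    return n ** 2

def radix_cost(n: int) -> float:
    """Rough complexity of a hierarchical radix-style network: ~N*log2(N)."""
    return n * math.log2(n)

# The ratio between the two keeps widening as the fabric grows, which is
# why the mesh's interconnect share of the fabric balloons on big FPGAs.
for n in (1_000, 10_000, 100_000):
    ratio = mesh_cost(n) / radix_cost(n)
    print(f"N={n:>7,}: mesh/radix complexity ratio ~ {ratio:,.0f}x")
```

These are asymptotic trends, not area numbers; constants matter in real silicon, which is why the ISSCC paper and measured densities (below) are the real evidence.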

More details are available in the paper by Cheng, et al. in the proceedings of ISSCC 2014 (it is copyrighted, so we can’t reproduce it here) and in the various patents filed by UCLA and Flex Logix.

When Cheng and others won the ISSCC award, we were contacted by intrigued technical executives at several FPGA companies. When we met with them, their conclusion was that the interconnect was so different from what they used that they would have to re-do all of their place-and-route software, a huge investment built up over more than a decade. There was also skepticism about whether it would really perform as advertised in production-quality silicon.

Since then we have implemented this new interconnect on TSMC 40nm, 28nm and 16nm, and improved it with our “Gen 2” architecture on 16nm and, soon, 28nm.

Data from one of our competitors, who licenses eFPGA IP based on their FPGA chips, enables a density comparison: we are within 5-10% of their density in LUT4-equivalents/mm², even though they use full-custom design whereas we use standard cells. The superior density of our interconnect lets us match their density AND gives us superior portability. Because we use many fewer layers of metal, we are compatible with most metal stacks; and because we use standard cells, we can support multiple process variations within a node with one GDS (for example, 16FF+/16FFC/12FFC or 28HPC/HPC+). By contrast, eFPGA IP based on an FPGA chip has to be re-routed (if possible) for each metal stack, and has to change its GDS to redesign full-custom circuitry when moving between process variations where we can use the same GDS.