Lines blur as processors are added into traditional FPGAs, and programmability is added into ASICs.
FPGAs are blinged-out rockstars compared to their former selves. No longer just a collection of look-up tables (LUTs) and registers, FPGAs have evolved into architectures for system exploration and vehicles for proving out a design architecture for future ASICs.
This family of devices now includes everything from basic programmable logic all the way up to complex SoC devices. And in a variety of application areas—including AI for automotive and other applications, enterprise networking, aerospace, defense, and industrial automation—FPGAs enable chipmakers to implement systems in a way that can be updated when necessary. That flexibility is critical in new markets where protocols, standards and best practices are still evolving, and where engineering change orders (ECOs) are required to remain competitive.
This is the rationale behind Xilinx’s decision to add an Arm core to its Zynq FPGA to create an FPGA SoC, said Louie de Luna, director of marketing at Aldec. “On top of that, vendors have improved the tool flow. That has created a lot of interest in Zynq. Their SDSoC development environment looks like C, which is good for developers because applications typically are written in C. So they put in software functions and allow the user to allocate those functions to hardware.”
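To make that concrete, here is a minimal sketch of the kind of C/C++ function a developer might hand to an HLS-based flow such as SDSoC for hardware offload. The FIR filter and every name in it are invented for illustration, and the pragma is an HLS-style hint, not a verified piece of any shipping design:

```cpp
#include <cstdint>

// Candidate for hardware offload: an 8-tap FIR filter written in plain C/C++.
// In an SDSoC-style flow the developer marks this function for the fabric,
// and the tool generates the accelerator plus the Arm-to-fabric data movers.
constexpr int TAPS = 8;

void fir8(const int16_t *in, int16_t *out, int len, const int16_t coeff[TAPS]) {
    for (int i = 0; i < len; i++) {
#pragma HLS PIPELINE II=1  // HLS-style hint: aim for one sample per clock
        int32_t acc = 0;
        for (int t = 0; t < TAPS; t++)
            acc += (i - t >= 0) ? in[i - t] * coeff[t] : 0;
        out[i] = static_cast<int16_t>(acc >> 15);
    }
}
```

The same source runs on the Arm cores when compiled normally, which is the point: allocating the function to hardware or software is a tool decision, not a rewrite.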
Xilinx’s Zynq-7000 SoC. Source: Xilinx
Some of these FPGAs are not just SoC-like. They are SoCs in their own right.
“They may contain multiple embedded processors, specialized compute engines, complex interfaces, large memories, and more,” said Muhammad Khan, product specialist for synthesis verification at OneSpin Solutions. “System architects plan for and use the available resources of an FPGA just as they do for an ASIC. Design teams use synthesis tools to map their SystemVerilog, VHDL, or SystemC RTL code into the base logic elements. For much of the design process, the differences are diminishing between efficiently targeting an FPGA and targeting an ASIC or full-custom chip.”
Ty Garibay, CTO of ArterisIP, is well acquainted with this evolution. “Historically, Xilinx started down what became the Zynq path in 2010, and they defined a product that was going to essentially incorporate the hard macro of an Arm SoC into a corner of an existing FPGA,” he said. “Altera then hired me to do essentially the same thing. The value proposition was that an SoC subsystem was something many customers would want, but because of the nature of SoCs, and especially processors, they don’t synthesize well onto an FPGA. Embedding that level of functionality into the actual programmable logic was prohibitive, as it used almost the whole FPGA just for that function. But it could be put in as a small or trivial part of the overall FPGA chip as a hard function. You gave up the ability to have truly reconfigurable logic for that SoC, but it was programmable as software, so it could change function in that way.”
That meant it was possible to have a software programmable function, a hard macro and then a hardware-programmable function in the fabric and they could work together, he said. “There were some pretty good markets for that, especially in low-cost automotive control—places where there was traditionally a medium-performance microcontroller-type device next to the FPGA, anyway. The customer would just say, ‘I’m just gonna roll that whole function into the hard macro on the FPGA die to reduce board space, reduce BOM, lower the power.’”
This fit in with the evolution of FPGAs over the past 30 years. The original FPGAs were just programmable fabric with a bunch of I/Os. Over time, memory controllers were hardened in, along with SerDes, RAM, DSPs and HBM controllers.
“FPGA vendors have been continuing to grow the die, but also continuing to add more and more hard logic that is deemed to be generally usable by a significant percentage of the customer base,” Garibay said. “What’s happening today is an extension of that into the software programmable side of it. Most of the things that were added before this Arm SoC were different forms of hardware, mostly to do with I/O, but also DSPs that it made sense to try to just save programmable logic gates by hardening them because there is enough planned utility.”
A matter of perspective
This essentially has turned the FPGA into a Swiss army knife of design possibilities.
“If you roll back time, it was just a bunch of LUTs and registers, instead of gates,” said Anush Mohandass, vice president of marketing and business development at NetSpeed Systems. “They had a classical problem. If you take a general-purpose anything and compare it to an application-specific version of it, the general-purpose compute will give a lot more flexibility, while the application-specific one will give some performance or efficiency benefits. Xilinx and Altera tried to marry this more and more, where they took note that practically every FPGA customer had a DSP and some form of compute. So they put in Arm cores, they put in DSP cores, they put in all of the different PHYs and commonly used stuff. And they hardened it, which makes it a lot more efficient, and the performance profile becomes better.”
These new capabilities have opened the door for FPGAs to play a big role in a variety of new and existing markets.
“From a market standpoint, you can see that FPGAs are definitely headed into the SoC market,” said Piyush Sancheti, senior director of marketing at Synopsys. “There are the economics of whether you do an FPGA or a full-blown ASIC. Those lines are beginning to blur, and we certainly see that more and more companies—especially in certain markets—are headed where the economics of doing an FPGA are better for production.”
Historically, FPGAs had been used for prototyping, but for production usage it was limited to markets like aerospace, defense and communications infrastructure, Sancheti said. “Now the market is expanding into automotive, industrial automation, and medical devices.”
AI, a booming market for FPGAs
Some of the companies embracing FPGAs are systems vendors/OEMs looking to optimize performance for their own IP or AI/ML algorithms.
“They wanted to build their own chips, and for many of them starting out, doing an ASIC can be a bit intimidating,” said NetSpeed’s Mohandass. “They also just might not want to spend the $30 million in wafer costs to get a chip out. For them, an FPGA is a valid entry point where they have unique algorithms, their own neural net that they want to prototype and see if it gives them the performance they are looking for.”
The challenge in AI applications currently is quantization, said Stuart Clubb, senior product marketing manager for Catapult HLS synthesis and verification at Mentor, a Siemens Business. “What kind of network is needed? How do I build that network? What’s the memory architecture? You start off with a network where you’ve just got a few layers and a lot of data going in with a few coefficients, but it very quickly spins up to millions of coefficients, and the memory bandwidth there is becoming quite frightening. Nobody really knows what the right architecture is. And if the answer isn’t known, you’re not going to jump in and build an ASIC.”
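Clubb’s point about coefficient counts is easy to check with back-of-the-envelope arithmetic. The sketch below simply multiplies out the weight tensors for a small, made-up stack of convolution layers; the layer shapes are assumptions chosen for illustration, not any particular network:

```cpp
#include <cstdio>

// Rough weight count and per-frame weight traffic for a small, invented
// convolutional network. Layer shapes are illustrative only.
int main() {
    // {kernel_h, kernel_w, in_channels, out_channels}
    int layers[][4] = {
        {3, 3,   3,  64},
        {3, 3,  64, 128},
        {3, 3, 128, 256},
        {3, 3, 256, 512},
    };
    long total = 0;
    for (auto &l : layers)
        total += (long)l[0] * l[1] * l[2] * l[3];  // weights per layer
    printf("weights: %ld (~%.1f MB at 16 bits each)\n",
           total, total * 2.0 / (1 << 20));
    // If the weights overflow on-chip RAM and must be streamed every frame:
    printf("at 30 fps: ~%.2f GB/s of weight traffic alone\n",
           total * 2.0 * 30 / 1e9);
    return 0;
}
```

Four modest layers already land around 1.5 million coefficients, and deeper networks multiply that by orders of magnitude, which is where the bandwidth numbers start to get frightening.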
In the enterprise networking space, the most common issue is that the crypto standards seem to be changing all the time. “Instead of trying to build an ASIC, you put it in an FPGA, and make the crypto engine better,” said Mohandass. “Or, if you’re doing any kind of packet processing in the networking side of the world, FPGAs still give you a lot more flexibility and a lot more programmability. That’s where the flexibility comes into play, and they’ve used that. You can still call that heterogeneous computing and it still looks like an SoC.”
New rules
With the new breed of FPGA SoCs, the old rules don’t apply anymore. “Specifically, if you’re debugging on the board, you’re doing it wrong,” Clubb pointed out. “While debugging on the board is viewed as a lower-cost solution, this goes back to the early days of being able to say, ‘It’s programmable, you can stick a scope across it, you can go look and see what’s happening.’ But now saying, ‘If I find a bug I can go fix it, program a new bit stream within a day and get it back on the board and then find the next bug,’ that’s just insane. That is a mentality you see a lot where the employee’s time is not seen as a cost. Management won’t buy the simulator or the system-level tools or the debugger because, ‘I’m just paying the guy to get the job done and I’ll scream at him until he works harder.’”
This behavior is still common, given that there are more than enough companies with the attitude of axing the bottom 10% every year to keep everybody on their toes, he said.
However, FPGA SoCs are truly SoCs, requiring the same rigorous design and verification methodologies. “The fact that the fabric is programmable doesn’t really impact the design and verification,” Clubb said. “If you make an SoC, yes, you can do what I’ve heard some of my customers call ‘LEGO’ engineering. It’s the block diagram approach. I need a processor, a memory, a GPU, some other bits and pieces, a DMA memory controller, WiFi, USB and PCI. These are all just ‘LEGO’ blocks that you assemble. The trouble is that you must verify that they work, and that they work together.”
Still, FPGA SoC system developers are quickly catching up with their SoC brethren where verification methodologies are concerned.
“They’re not as advanced as the traditional silicon SoC developers that deal with the mindset of, ‘This is going to cost me $2 million, so I’d better get it right,’ because the cost of failure [with FPGAs] is lower,” said Clubb. “But if you spend $2 million developing the FPGA and you’ve got it wrong, and now you’re going to spend three months fixing the bugs, there are still issues to contend with. How big is the team? How much is it going to cost? What’s the penalty at time to market? These are all very much harder costs to clearly quantify. If you’re in a consumer space, it’s almost somewhat unlikely with an FPGA that you’re really concerned about making it in time for Christmas, so there’s a bit of a different priority. The total cost and risk of doing an SoC in custom silicon and pulling the trigger and saying, ‘This is my system, I’m done,’ you’re not seeing so much of that. As we all know, the industry is consolidating and there are fewer big players doing big chips. Everybody else has to figure out a way to do it, and these FPGAs are delivering that.”
A new tradeoff option
It is not uncommon for engineering teams to design with the intent to leave their options open for a target device, Sancheti said. “We see a lot of companies create RTL and verify it, almost being agnostic to whether they’re going to go FPGA or ASIC because a lot of times that decision can change. You may start with an FPGA, and if you hit a certain volume, the economics may be in favor of commissioning an ASIC.”
This is particularly true for the AI application space today.
“There has been a progression of technologies being employed for accelerating AI algorithms,” said Mike Gianfagna, vice president, marketing at eSilicon. “Obviously, AI algorithms have been around a long time, but now all of a sudden we’re getting more sophisticated in how to use them, and the ability to run them at near real-time speeds is the magic here. It started with the CPUs and then it moved to the GPU. But even a GPU is a programmable device, so there’s some generality that one size fits all. While the architecture is good at parallel processing, because that’s what graphics acceleration was all about, that’s convenient because that’s what AI is all about. To a large extent it’s good, but it’s still kind of a general-purpose approach. So you can get a certain level of performance and power footprint. Some people are moving to FPGAs next because you can target the circuitry better than you can with a GPU, as well as get a step up in performance and power efficiency. ASIC is the ultimate in power and performance because there you have a fully custom architecture that does exactly what you need, no more, no less. That’s clearly the best.”
AI algorithms are tricky to map into silicon because they’re in a state of almost constant change, so doing a full-custom ASIC at this point is not an option; it is out of date by the time the silicon comes back. “FPGAs are really nice for that because you can reprogram them, so the investment in that really expensive chip is not lost,” Gianfagna said.
Here, there are certain custom memory configurations, and certain subsystem functions like convolution and transpose memory, that are used over and over again. So while the algorithms may change, certain blocks do not, he added. With this in mind, eSilicon is developing a chassis with some software analysis to look at AI algorithms. The goal is to be able to choose the best architecture for a particular application more quickly.
“FPGAs give you that flexibility of changing the machine or the engine because you may run into a new kind of network, and committing to an ASIC is risky in the sense that you may not have the best support for it, so you want that flexibility,” said Deepak Sabharwal, vice president of IP engineering at eSilicon. “However, FPGAs are always going to be constrained in both capacity and performance, so you will not be able to really get to production-level specs with an FPGA. You can play with it and group things, but ultimately you’re going to have to get to an ASIC.”
Embedded LUTs
Another option that has gained footing in the past couple of years is the embedded FPGA, which builds programmability into an ASIC, rather than adding the performance and power benefits of an ASIC into an FPGA.
“An FPGA SoC is still predominantly an FPGA, with a relatively small amount of the die area devoted to the processing,” said Geoff Tate, CEO of Flex Logix. “In the block diagram, the scale looks different, but in the actual die photos, it’s mostly FPGA. But there’s a class of applications and customers where the right ratio between the FPGA logic and the rest of the SoC is to have a smaller FPGA, giving them RTL programmability in a much more cost-effective die size.”
This approach is finding traction in such areas as aerospace, wireless basestations, telecommunications, networking, automotive, and vision processing, and particularly AI. “The algorithms change so fast that chips are almost obsolete by the time they come back,” said Tate. “With some embedded FPGA, they have the ability to iterate their algorithms more quickly.”
This is especially evident in the transition from cars that are driven to cars that are increasingly autonomous. While the issues of failure and aging have received a lot of attention, the challenge is to maintain “graceful degradation,” according to Raymond Nijssen, vice president of systems engineering at Achronix. “Performance and quality change over time, and so do the standards. Requirements that cars need to recognize a child crossing the street are relatively recent. No one knows how the regulations will change, or how you can test for that. How do you test for standards that aren’t known yet?”
In this case, programmability becomes essential to avoid re-doing entire chips or modules, said Nijssen.
Debug designed in
That’s only part of the picture, though. As with all SoCs, understanding how to debug these systems, and building in instrumentation, can help identify issues before they turn into major problems.
“As system FPGAs become more SoC-like, they need the development and debug methodologies you expect in an SoC,” said Rupert Baines, CEO of UltraSoC. “There is a (perhaps naïve) belief that because you can see anything in an FPGA, it is easy to debug. That is sort of true at the bit level, with a waveform viewer, but it doesn’t hold true when you get to the system level. The latest big FPGAs clearly are system-level. At that point, the waveform-level view that you get from a bit-probe-type arrangement is not terribly useful. You need a logic analyzer, a protocol analyzer, as well as good debug and trace capabilities on the processor cores themselves.”
The size and complexity of an FPGA mandate a verification process that is similar to that for ASICs. Sophisticated UVM-based testbenches support simulation, often backed by emulation, as well. Formal tools are playing a key role here, ranging from automatic design checks through assertion-based verification with a range of powerful solvers. While it is true that FPGAs can be changed much more quickly and inexpensively than ASICs, the difficulty of detecting and diagnosing bugs in a large SoC means that thorough verification must be performed before entering the bring-up lab, OneSpin’s Khan said.
In fact, in one area the verification demands for an FPGA SoC are arguably greater than for an ASIC—equivalence checking between the RTL input and the post-synthesis netlist. The elaboration, synthesis, and optimization phases for an FPGA often make far more modifications to the design than a traditional ASIC logic synthesis flow. These changes can include moving logic across cycle boundaries, and implementing registers in memory structures. Thorough sequential equivalence checking is essential to ensure that the final FPGA design still matches the original designer intent in the RTL, Khan added.
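As a toy illustration of why a purely combinational check is not enough, consider the kind of retiming move Khan describes. In the invented example below, the post-synthesis version computes the same function but with the multiply moved across a register, so its outputs only match the RTL intent under a one-cycle latency offset; establishing exactly that relationship is the job of a sequential equivalence checker:

```cpp
#include <cassert>
#include <cstdint>

// RTL intent: y = (a + b) * 3 behind one register (1 cycle of latency).
struct Ref {
    uint32_t r = 0;
    uint32_t step(uint32_t a, uint32_t b) {
        uint32_t y = r;
        r = (a + b) * 3;   // all logic before a single register
        return y;
    }
};

// After retiming: the multiply moved across the register boundary,
// adding a pipeline stage (2 cycles of latency).
struct Impl {
    uint32_t r1 = 0, r2 = 0;
    uint32_t step(uint32_t a, uint32_t b) {
        uint32_t y = r2;
        r2 = r1 * 3;       // same logic, split across two registers
        r1 = a + b;
        return y;
    }
};

int main() {
    Ref ref;
    Impl impl;
    uint32_t prev_ref = 0;
    for (uint32_t t = 0; t < 100; t++) {
        uint32_t a = t * 7 + 1, b = t ^ 0x55;
        uint32_t yr = ref.step(a, b);
        uint32_t yi = impl.step(a, b);
        // After pipeline warm-up, Impl matches Ref delayed by one cycle.
        if (t >= 2) assert(yi == prev_ref);
        prev_ref = yr;
    }
    return 0;
}
```

A naive same-cycle comparison of the two models would report mismatches even though the implementation is correct; the equivalence here is sequential, not combinational.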
On the tool side, there’s also room to optimize performance. “With embedded vision applications, a lot of which are written for Zynq, you may get 5 frames per second. But if you accelerate that in hardware, you may get 25 to 30 frames per second. That paves the way for new kinds of devices. The problem is that simulation and verification of these devices isn’t simple. You need integration between the software and the hardware, and that’s difficult. If you run everything in the SoC, that’s too slow. It can take five to seven hours per simulation. You can save that time if you co-simulate,” Aldec’s de Luna said.
Put simply, the same kinds of approaches that are used in complex ASICs are now being used in complex FPGAs. That is becoming all the more critical as these devices are used in functional safety types of applications.
“That’s where formal analysis comes in, to make sure there are fault propagation paths and then to verify those paths,” said Adam Sherer, group director of marketing at Cadence. “Those things are very well suited for formal analysis. The traditional approaches in FPGA verification make those types of verification tasks almost impossible. It’s still quite prevalent in FPGA design to assume that it’s very fast and easy to get to a hardware test, which runs at system speed, and to do a very simple level of simulation just for a sanity check. Then you program the device, go into the lab and start running. That is a relatively quick path, except that observability and controllability in the lab are extremely limited, because the only way to probe is to pull data from deep inside the FPGA out to the pins so it can be seen on a tester.”
Dave Kelf, chief marketing officer at Breker Verification Systems, agrees. “This creates an interesting shift in the manner in which these devices are verified. In the past, smaller devices were verified as much as possible by loading the design onto the FPGA itself and running it in real time on a test card. With the advent of SoCs and software-driven design, it might be expected that this ‘self-prototyping’ style of verification would work well with software-driven techniques, and for some phases of the process it does. However, identifying issues and debugging them during prototyping is complex. Simulation is required for this earlier verification phase, and therefore SoC-style FPGAs look more and more like ASICs. Given this two-phase process, commonality between the phases makes the process more efficient and includes common debug and testbenches. New advances such as Portable Stimulus will provide this commonality and, in fact, make SoC FPGAs far more manageable.”
Conclusion
Looking ahead, Sherer said that users are looking to apply the more rigorous processes now used in the ASIC world into the FPGA flow.
“There’s a lot of training and analysis, and they expect there to be more technology in the FPGAs for debug, for that level of support,” he said. “The FPGA community tends to be behind that state of the art and tends to use very traditional methods, so they need training and awareness in the space, planning and management, requirements traceability. Those elements that are coming from the SoC flow are absolutely necessary in FPGA, and it’s not as much that the FPGA itself is driving it, but those industrial standards in the end applications are driving it. It’s a retooling and re-education for engineers that have been working in the FPGA context.”
The lines are blurring between ASICs and FPGAs, driven by applications that demand flexibility, system architectures that increasingly combine programmability with hard-wired logic, and tooling that is now being applied to both. And that trend is unlikely to change anytime soon because many of the new application areas that demand these kinds of combinations are still in their infancy.