Without productivity gains, design size and complexity would face huge headwinds. Those gains come from a diverse set of improvements.
Designs have become larger and more complex, design time has shortened, and yet team sizes remain essentially flat. Does this show that productivity is keeping pace with complexity for everyone?
The answer appears to be yes, at least for now, and for a multitude of reasons. Design teams are reusing more, and larger, IP blocks and subsystems. In addition, the tools are improving, and more feedback loops are driving continuous improvement.
Wilson Research and Mentor, a Siemens Business, just published the results of a verification survey1 for 2020. It tracks team sizes (figure 1), schedules, and the ability of teams to meet those schedules (figure 2).
Fig. 1: Mean peak number of engineers on an ASIC/IC project. Source: Mentor, a Siemens Business
Fig. 2: ASIC completion to project’s original schedule. Source: Mentor, a Siemens Business
The data indicates that productivity has approximately matched the increase in design complexity since 2016. There appears to be a slight drop in the ability to meet schedules, but nothing statistically outside the margin of error. How are teams managing to do this, and where are the productivity gains coming from?
A lot can get hidden in average numbers. “If you are doing block-level design, I would agree that head counts are probably consistent from node to node,” says Jeremiah Cessna, product management director in the Custom IC & PCB Group of Cadence. “But when you look at leading-edge SoCs and systems, we see head count going up because there are more things to do.”
“If you think of a modern SoC design, and how much of that content is coming from reuse, it doesn’t necessarily mean that unique content is growing,” says Neil Hand, director of marketing for design verification technology at Mentor, a Siemens Business.
Total functionality continues to grow. “We definitely see sizes of designs growing,” says Joe Mallett, senior marketing manager for Synopsys. “They are increasing the number of gates because as you go down in silicon size, you can put more into the same cost, more functionality into the next generation for the same effective die size.”
The back end is impacted. “We can be more efficient if 80% is reused,” says Stephane Leclerc, design engineering director in the IP Group for Cadence. “Then we can focus on the new content, and that has been a recipe for a number of years. Increasing amounts are coming from reuse. On the back end, we are trying to build a methodology where we hide the growing complexity that goes from one node to another.”
Increased reuse
Nobody doubts the power of reuse. “Intellectual property blocks are either built into a library for those inside of large companies, or if you’re a small company, you go outside and you buy them,” says Benoit de Lescure, CTO for Arteris IP. “Complexity is managed through a divide-and-conquer strategy. Companies are also using larger macro functions that you stitch together with the same number of people. Today, you can buy a multiple-CPU block, with Level 3 cache and a complex cache-coherent interconnect. These have been designed to be easy to configure, so you can create a very large CPU complex with 8 or 16 CPUs, and those become the macro functions you’re integrating.”
But there is less growth outside of the CPU cluster. “Each block doesn’t grow much,” says Cadence’s Leclerc. “The growth is often related to a standard. So you would have PCIe, DDR, HBM, where the growth is not significantly different from one node to the next.”
Over time, the standards become more complex. “Compliance with PCI Express Gen5 or DDR5 is way more complex than with Gen1 or DDR1,” says Mentor’s Hand. “These standards continue to get more complex. That pushes more people to an IP model, because it’s not practical for them to develop it themselves.”
This indicates that IP count is increasing. “Ten years ago, you might only have had 20 IP blocks,” says Synopsys’ Mallett. “Now you have 60 IP blocks. At the start you may have been re-using half of those, and now you’re re-using 70%. So you’re gaining a little bit of traction in terms of IP reuse, but that’s offset to some extent by how many IPs you’re integrating.”
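A rough back-of-the-envelope check, using only the illustrative figures from that example, shows why the two effects nearly cancel: unique content is still growing, just far more slowly than total content.

```python
# Rough arithmetic based only on the figures in Mallett's example (illustrative).
blocks_then, reuse_then = 20, 0.50   # ~10 years ago: 20 IP blocks, half reused
blocks_now,  reuse_now  = 60, 0.70   # today: 60 IP blocks, 70% reused

unique_then = blocks_then * (1 - reuse_then)   # blocks still designed in-house then
unique_now  = blocks_now  * (1 - reuse_now)    # blocks still designed in-house now

print(f"Unique blocks then: {unique_then:.0f}, now: {unique_now:.0f}")
print(f"Total blocks grew {blocks_now / blocks_then:.0f}x, "
      f"unique content grew {unique_now / unique_then:.1f}x")
```

On those assumed numbers, total block count triples while newly designed content grows less than 2x, which is the gap that reuse is covering.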
Design methodology
As well as more functionality coming from reuse, the design methodology has been changing. “If you were to try to do a modern SoC as a completely flat integration, it wouldn’t be practical,” says Hand. “But doing a system of subsystems is practical. The increase in the number of blocks, the way you manage it, is hierarchical, just as we’ve always done. A block today was a chip several generations ago. A subsystem today was a chip maybe a generation ago. The tools to do software/hardware validation and verification have grown with that complexity, and fortunately the industry has been able to address that — which is why we have been able to keep up with the growing complexity.”
Knowing the tools also helps. “You close timing at the macro function level and then you have specific approaches to assemble those together on the die,” says Arteris’ de Lescure. “The tools for synthesis have evolved and they are more powerful. Plus, some vendors support cloud-based computing farms, so you have access to way more computing resources. If your tool can handle 4 million instances, the architect will subdivide the chip to ensure none of them goes beyond that limit. Then you give that to your compute farm, and they can close timing on all those macro functions in parallel. This was impossible 10 or 20 years ago, just because the computing power wasn’t available and people had to work in a more serialized fashion, or synthesize everything flat, to get the performance or the PPA they were looking for.”
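A minimal sketch of the divide-and-conquer flow de Lescure describes might look like the following: keep every partition under the tool's instance limit, then close timing on the partitions in parallel on a compute farm. The capacity figure, block names, and the placeholder job are illustrative assumptions, not any vendor's actual flow.

```python
# Minimal sketch (not a real flow) of dividing a chip into macro functions that
# each fit under a tool's instance capacity, then closing timing in parallel.
from concurrent.futures import ProcessPoolExecutor

TOOL_CAPACITY = 4_000_000  # hypothetical per-run instance limit

def partition(blocks, capacity=TOOL_CAPACITY):
    """Greedily group blocks so no partition exceeds the tool's capacity."""
    groups, current, size = [], [], 0
    for name, instances in sorted(blocks.items(), key=lambda kv: -kv[1]):
        if instances > capacity:
            raise ValueError(f"{name} alone exceeds tool capacity")
        if size + instances > capacity:
            groups.append(current)
            current, size = [], 0
        current.append(name)
        size += instances
    if current:
        groups.append(current)
    return groups

def close_timing(group):
    # Placeholder for a synthesis/timing-closure job submitted to the farm.
    return f"timing closed on {'+'.join(group)}"

if __name__ == "__main__":
    blocks = {"cpu_complex": 3_500_000, "gpu": 2_800_000,
              "modem": 1_900_000, "isp": 900_000, "noc": 400_000}
    groups = partition(blocks)
    with ProcessPoolExecutor() as farm:      # stand-in for the compute farm
        for result in farm.map(close_timing, groups):
            print(result)
```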
Some problems rise to become system issues. “The complexity of use modes is increasing,” says Mallett. “You have switching of the modes where you have to deal with power domains. You’re getting reset problems or clock problems as you start powering up and down these different domains. That’s getting complex and continuing to rise in its complexity, just because more functional blocks are involved. And you’re having to worry more about the software use models that are going to interact with the hardware.”
Some of these issues can be dealt with using encapsulation. “Fifteen years ago, if you said NoC to a customer, they were suspicious,” says de Lescure. “They thought it would cost too much. But then they hit a wall, where they realized that doing everything flat would no longer work. They had to think more modular. They needed to break the problem into smaller pieces that could be managed one at a time. The NoC is a wonderful tool to stitch everything together. We are an integration productivity aid.”
New problems at every node
Every node sees new issues, and those result in an increase in the size and complexity of the design rules. “It wasn’t just widths and spacings anymore,” says Cessna. “There were new end-of-line and span rules, and really esoteric rules that were causing problems. This was right at the inflection point when the industry switched from CMOS to finFET. Back in 2012, people were looking for every layout contractor they could find. There was a lot of custom automation around placers and routers, and modular generators and complex pcells that were helping to make people more productive, but it was not enough.”
That caused people to rethink the methodology. “Instead of pushing polygons around, was there a more intelligent way to do this? If we limit device sizes and set up rows, like they’ve done in digital, and if we embed that into the PDK enabling the setup, we can build that into the tools and the infrastructure,” Cessna says. “So that design scales with each node, and they don’t have to worry about all the complexity. Today, you do not see a huge hit when someone goes from 16nm down to 7nm, or from 7nm to 3nm. If they had done it the way they have always done it, pushing polygons, the number of contractors would have grown exponentially.”
Verification
Because verification consumes as much time as design, similar gains have to be made in verification productivity. Verification reuse has been slower in coming than design reuse. “Today, users trust the provider for verifying the function,” says de Lescure. “By trusting, you do verification once and you reuse it many times. That’s the key to increasing productivity.”
An improvement in verification IP has also helped. “The time spent on verification is not expanding noticeably,” says Bipul Talukdar, director of applications engineering for SmartDV in North America. “Any number of factors could contribute to that stabilization, including availability of new tools and methodologies, such as synthesizable transactors that reduce the lag time when moving between simulation and emulation. Commercial viability of FPGA prototyping enables more verification since it is faster than simulation. Perhaps the biggest reason is the more widely available verification IP from third-party vendors, so verification engineers don’t need to develop it themselves.”
Standards help with that. “As the industry moves toward standards, it means, from a verification perspective, they can leverage standard VIPs,” says Hand. “Big chunks of the design can be treated as good until they’re shown to be bad. You can assume they are correct. Verification IP has gotten significantly better, not only checking the protocol, but it also includes the notion of transactions. It has a test plan, it has coverage models, it has protocol debugging built in. The level of complexity of the VIP has gone up, which means they can put it in their system, do their system or subsystem-level test, and they will get a notification if something unexpected happens. The modern methodologies for verification allow you to have, maybe not a greater level of trust, but a greater level of visibility into what is happening during system integration and validation.”
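The flavor of what such verification IP provides can be sketched, very loosely, in a few lines of Python: a monitor that checks transactions against a protocol's rules, records coverage bins, and only raises its hand when something unexpected happens. The toy protocol below is invented for illustration and bears no relation to any commercial VIP.

```python
# Conceptual sketch only: a toy "verification IP"-style monitor that flags
# protocol violations and tracks simple coverage, illustrating the idea of
# trusting a block until the monitor reports something unexpected.
from collections import Counter

LEGAL_OPCODES = {"READ", "WRITE"}   # assumed toy protocol
MAX_BURST = 8

class ProtocolMonitor:
    def __init__(self):
        self.coverage = Counter()   # which (opcode, burst) bins were hit
        self.violations = []

    def observe(self, opcode, burst_len):
        if opcode not in LEGAL_OPCODES or not (1 <= burst_len <= MAX_BURST):
            self.violations.append((opcode, burst_len))
        else:
            self.coverage[(opcode, burst_len)] += 1

    def report(self):
        hit = len(self.coverage)
        total = len(LEGAL_OPCODES) * MAX_BURST
        print(f"coverage: {hit}/{total} bins, violations: {self.violations}")

mon = ProtocolMonitor()
for txn in [("READ", 4), ("WRITE", 8), ("WRITE", 9)]:   # last one is illegal
    mon.observe(*txn)
mon.report()
```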
Sometimes doing things smarter is better than doing them faster. “The continuous adoption and advancements of formal verification techniques and solutions are delivering significant productivity gains,” says Sergio Marchese, technical marketing manager for OneSpin Solutions. “It is not only about achieving sign-off on schedule, but also about reducing the extra effort to deal with post-RTL-freeze or post-silicon bug analysis and fixing. These advancements have enabled constant-size teams to maintain the level of verification quality despite an increase in the complexity of the IP and SoCs they need to verify.”
Verification tools often allow problems to be seen much faster. “We invest a lot to lower the noise that comes from tools,” says Mallett. “In a large SoC, there are more domain crossings and reset crossings, and it’s more of a challenge for people. So you look at how to prequalify some of those and be able to do the verification at the block level, so that when you put everything together at the SoC level, you’ll have the issues that you need to address at the SoC level and not those from every single IP block.”
Tool improvements
Additional productivity gains come from the EDA tools. “The NoC spans long distances on the chip,” says de Lescure. “We are working on topology synthesis, which will allow customers to simply give the tool a floor plan and a set of performance requirements, and the tool will create a network on chip, including the pipelining required to close timing for that particular technology at that particular frequency. We are working to be physically aware, even at NoC design time. And this is a kind of shift left, ensuring you take physical implementation problems into account as early as possible, before you need to debug them in the back end, where it is tremendously difficult.”
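The pipelining piece of that is easy to illustrate. Roughly speaking, if a NoC link spans more distance than a signal can traverse in one clock cycle at the target technology and frequency, register stages must be inserted. The sketch below uses a made-up figure for the reachable distance per cycle; it illustrates the reasoning, not any tool's actual algorithm.

```python
# Back-of-the-envelope sketch (illustrative numbers only) of why a physically
# aware NoC tool inserts pipeline stages: if a link spans more distance than a
# signal can cover in one clock cycle, extra register stages are needed.
import math

def pipeline_stages(route_mm, clock_ghz, reachable_mm_per_ghz_cycle=1.5):
    """Estimate register stages needed to close timing on a NoC link.

    reachable_mm_per_ghz_cycle is a hypothetical technology constant: how far
    a signal travels in one cycle at 1 GHz for this process and metal stack.
    """
    reach_per_cycle = reachable_mm_per_ghz_cycle / clock_ghz
    return max(0, math.ceil(route_mm / reach_per_cycle) - 1)

# Example: a 6 mm link at 2 GHz on this assumed technology
print(pipeline_stages(route_mm=6.0, clock_ghz=2.0))   # -> 7 extra stages
```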
Debug is often a congestion point in the process. “Debug platforms are becoming context-aware,” says Hand. “If you’re doing emulation, it treats them differently than if you’re doing simulation. If you’re looking at a protocol, it will give you a different representation than if you’re looking at RTL. You can increase efficiency for the users and productivity by being context aware. We’re giving them guidance on where they need to look. So it goes from being a dumb interface to being a guided or intelligent interface.”
And additional intelligence is being added everywhere. “There are innovations going into the tools that try to address productivity issues that end designers or verification engineers might be running into,” says Mallett. “We are now starting to see machine learning technologies coming into tools. One example is a static platform for CDC and RDC, integrated with noise reduction techniques through machine learning. If you address the causes very quickly, your affected errors come down very quickly. So the faster you can identify the actual cause of the problem, the more efficient your verification people can be with that technology. We’ve seen a fairly significant increase in productivity — just because it’s reducing the noise, before you ever get to plugging in the debug tool.”
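The grouping idea behind that noise reduction can be shown with a toy example. Real tools apply machine learning to cluster violations; the sketch below simply keys raw CDC results on a crude root-cause signature so that one fix accounts for many reported errors. The signal and clock names are invented.

```python
# Simplified sketch of noise reduction for static CDC/RDC results: instead of
# presenting thousands of raw violations, group them by a likely common root
# cause so fixing one cause clears many reported errors at once.
from collections import defaultdict

violations = [  # hypothetical raw static-analysis results
    {"signal": "fifo_wr_ptr[0]", "from": "clk_a", "to": "clk_b", "sync": None},
    {"signal": "fifo_wr_ptr[1]", "from": "clk_a", "to": "clk_b", "sync": None},
    {"signal": "fifo_wr_ptr[2]", "from": "clk_a", "to": "clk_b", "sync": None},
    {"signal": "irq_req",        "from": "clk_c", "to": "clk_b", "sync": "2ff"},
]

by_cause = defaultdict(list)
for v in violations:
    cause = (v["from"], v["to"], v["sync"])   # crude root-cause signature
    by_cause[cause].append(v["signal"])

for cause, signals in by_cause.items():
    print(f"root cause {cause}: {len(signals)} violations -> {signals}")
```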
Feedback loops
But perhaps the biggest productivity gains are coming from improved or new feedback loops being introduced into design and verification methodologies. “We want to pull stuff as early in the process as we can,” says Hand. “If we can do performance analysis on an early virtual prototype or a virtual hybrid prototype with hardware and software running, we can fix performance issues before it’s been committed to a design.”
Methodologies create the structure necessary to implement feedback. “By putting something in the methodology, in the setup, then when something goes wrong, you can change that in the methodology,” says Cessna. “Now, the people downstream don’t have to deal with it. And you didn’t have to train everyone on how to do it because it was in the infrastructure. You did it right because it was correct by construction, which helps with productivity, with yield, and in other ways.”
That’s a significant change. “We are creating feedback loops that weren’t there,” says Hand. “In the past, the system architect would do some analysis and say, ‘Here’s what we’re going to do.’ That was a feed-forward process. There was no feedback loop in that process. Most of those models died on the vine after they had been used. Consider requirements. Customers that are doing top-down systems design can now trace those requirements all the way through the process. If a simulation fails, they can see the implication on their product-level requirements. That’s a feedback loop that never existed in the past. That’s one of the biggest things that we’ve been able to do. The industry is closing these loops on the hardware/software side, on the requirements side, and from verification into implementation.”
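Conceptually, that traceability loop is just a mapping from product-level requirements to the tests that exercise them, consulted whenever a regression fails. The sketch below is a deliberately simplified illustration with invented requirement and test names, not a real requirements-management flow.

```python
# Minimal sketch of a requirements-traceability loop: when a regression test
# fails, report which product-level requirements it touches.
requirement_to_tests = {
    "REQ-001: latency < 10 ms": ["test_noc_latency", "test_ddr_qos"],
    "REQ-002: secure boot":     ["test_boot_rom", "test_key_fuse"],
}

failing_tests = {"test_ddr_qos"}   # e.g. pulled from last night's regression

impacted = [req for req, tests in requirement_to_tests.items()
            if any(t in failing_tests for t in tests)]

print("Requirements impacted by failing simulations:", impacted)
```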
Feedback loops often improve design. “A stricter design methodology resulted in a 30% to 40% increase in productivity while giving up less than a percent of area,” says Leclerc. “The productivity gains actually led to performance improvements because the layout is done sooner. You get the extraction results sooner, and designers have time to tweak things. If the loops are fast, then in the same amount of time you can get to a better design.”
Some of the feedback loops are even larger, but no less important. “As we go down to newer nodes, we learn how to do things better or add a new feature,” says Cessna. “Then we can put it back in our older nodes. It may not be necessary, and we survived without it, but why would we not take advantage of it? The rules aren’t that strict at the higher nodes and we don’t need it, but we get better performance, better speed, better efficiency.”
Conclusion
Through a combination of design practices, verification methodologies, improved tools, and tightening or creating feedback loops within the flows, productivity is keeping up with design size and complexity – even for many of the largest chips. Those increases in productivity enable the next increase in complexity. They have to go hand in hand. Otherwise, the process would stall. Designs on older technologies often reap the rewards.
1The study included responses from 1,492 people, about half of whom were involved in ASIC and IC design. The margin of error for the survey is +/- 3%.