Checking the electrical characteristics of circuits is becoming much more challenging.
Having enough confidence in designs to sign off prior to manufacturing is becoming far more difficult at 7/5nm. It is taking longer due to increasing transistor density, thinner gate oxides, and many more power-related operations that can disrupt signal integrity and impact reliability.
For many years, designers have performed design rule checks as part of physical verification, the final gate before a design tapes out. That is no longer enough.
“It turns out on the electrical side, there’s always something similar, but not many people talk about it because designs were not as complex, design teams usually had a very well-defined foundry partner, and they’d probably design everything on their own,” said Geoffrey Ying, director of marketing for AMS products at Synopsys.
Fig. 1: Various parts of electrical rule checking at advanced nodes. Source: Synopsys
Electrical rule checking was something that could be done, but it wasn’t necessarily something that had to be done. This changed with the rise of the semiconductor foundry manufacturing model, which made it possible for more companies to move to advanced nodes and paved the way for an explosion of third-party IP. But externally developed IP also meant that design teams sacrificed some control over how circuits are put together, which made electrical checks in an SoC critical.
Electrical rule checks have been around for 30+ years, but they are becoming much more stringent at 7/5nm due to thinner oxides, increased dynamic power and transistor density, and increasing leakage current.
“Some devices are in place just to control leakage,” said Carey Robertson, product marketing director at Mentor, a Siemens Business. “Previously, there were transistors driving a specific area of functionality. Now there can be thousands, if not millions, of devices created automatically by the process just to control the leakage.”
The problem is that many of these IP blocks are essentially black boxes. As a result, designers need automatic techniques to make sure they’re implemented appropriately and not causing problems.
But that’s only part of the problem. While foundries have begun qualifying IP at advanced nodes, there are a host of issues on the manufacturing side such as process variation, unexpected defects, and manufacturing drift that require restrictive design rules to achieve sufficient yield. So rather than just designing chips and handing them off to manufacturing, design teams now need to work much more closely with foundries, EDA and IP companies to develop something that will yield well enough to warrant the massive cost of developing a chip at these advanced nodes. That includes everything from the initial architecture stage all the way through to testing during manufacturing, all of which needs to be worked out very early in the flow.
“If you don’t have really good coverage, leakages may be at the noise level and you might not find them,” said Tomasz Brozek, a fellow at PDF Solutions. “With finFETs, you need a really big SRAM to find leakages, shorts and opens. And with gate-all-around, it’s a lot worse. There is a lot of self-alignment. Materials are removed and replaced. With e-beam inspection and SEM, you can see what’s happening from the top, but when you remove material from under the nanowire you need to make sure it’s not leaky.”
Most design teams rely on existing tooling and equipment to solve these problems, but that only works if there aren’t tweaks to the IP or other parts of the designs. Many companies working at advanced nodes see those tweaks as a competitive advantage, but they also make it more difficult to solve some of these issues.
Consider power domains, for example. There are tens to hundreds of different power islands on a complex chip, which provide lots of opportunities for latch-ups and electrical overstress conditions. There can be a high-voltage region driving current into a low-voltage region, which can cause all sorts of complex power-related issues.
“There are huge numbers of power islands that need to be checked against, whereas previously maybe there would be two to five of those,” Robertson said. “And because these designs are so big, they’re sourcing lots of IP and they’re not necessarily intimately familiar with every piece of circuitry that they’re integrating against. Previously, to analyze reliability, design reviews were done whereby many companies would have ESD (electrostatic discharge) experts that would review the design, along with latch-up experts, and look for best practices. Number one, that’s a very manual process. Number two, there may not be one person or a group of people that are familiar with all circuit implementations of that design because you bought some IP or it came from India or some other region. This also drives the need for more automatic techniques.”
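To make the idea of an automatic check concrete, here is a minimal, purely illustrative sketch of a static power-domain crossing check written in Python. The netlist representation, the domain voltages, and the level-shifter flag are all assumptions invented for this example; production ERC tools derive this information from the power-intent description and the extracted design.

```python
# Toy static check for power-domain crossings (illustrative only).
from dataclasses import dataclass

@dataclass
class Crossing:
    net: str
    driver_domain: str
    receiver_domain: str
    has_level_shifter: bool

# Hypothetical domain supply voltages, in volts.
DOMAIN_VOLTAGE = {"VDD_IO": 1.8, "VDD_CORE": 0.75, "VDD_MEM": 0.9}

def check_crossings(crossings):
    """Flag nets that cross between domains of different voltage without a
    level shifter -- a classic source of overstress and latch-up risk."""
    violations = []
    for c in crossings:
        v_drv = DOMAIN_VOLTAGE[c.driver_domain]
        v_rcv = DOMAIN_VOLTAGE[c.receiver_domain]
        if v_drv != v_rcv and not c.has_level_shifter:
            violations.append(
                f"{c.net}: {c.driver_domain} ({v_drv} V) drives "
                f"{c.receiver_domain} ({v_rcv} V) with no level shifter")
    return violations

if __name__ == "__main__":
    netlist = [
        Crossing("pad_data_in", "VDD_IO", "VDD_CORE", has_level_shifter=False),
        Crossing("mem_enable", "VDD_CORE", "VDD_MEM", has_level_shifter=True),
    ]
    for v in check_crossings(netlist):
        print("VIOLATION:", v)
```

The point of the sketch is that the check is exhaustive over the connectivity data rather than dependent on any one expert having manually reviewed every crossing.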
With the introduction of finFET nodes at 16/14nm, and spanning all the way down to 7nm/5nm and below, designers and tool vendors need to think about physical and electrical design constraints and process requirements. “Dealing with local physical effects, new EUV layers, and increasing EMIR has required the flows to be significantly more integrated with a heavier reliance on shared engines and algorithms so implementation tools can ‘see’ the same transistors and wires as the signoff tools,” noted David Stratman, senior principal product manager at Cadence.
Stratman said that design tools have added features like multi-patterning, trim/cut metal, dummy fill, automated via pillars, and self-heating, while also blurring the lines between flow stages (making STA IR-aware, synthesis fully physically aware, and so on). These new software architectures allow implementation tools to account for advanced design rules during PPA optimization in order to continue to achieve predictable correlation. “The key is pushing these electrical and physical checks earlier in the flow to maintain designer productivity through the full-flow pass while still achieving performance goals after signoff,” he said.
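As a rough illustration of what “IR-aware” timing means, the sketch below derates a cell delay for the supply its instance actually sees after IR drop. The alpha-power-law delay dependence and the nominal supply, threshold voltage, and drop values are assumptions chosen for this example, not any vendor’s actual derating model.

```python
# Illustrative IR-drop-aware delay derating (not a real signoff model).
VDD_NOM = 0.75   # assumed nominal supply voltage (V)
VTH = 0.30       # assumed device threshold voltage (V)
ALPHA = 1.3      # assumed velocity-saturation exponent

def derated_delay(nominal_delay_ps, ir_drop_mv):
    """Scale a library delay for the local supply after IR drop, using a
    simple alpha-power-law dependence: delay ~ V / (V - Vth)**ALPHA."""
    v_local = VDD_NOM - ir_drop_mv / 1000.0
    scale = (v_local / VDD_NOM) * ((VDD_NOM - VTH) / (v_local - VTH)) ** ALPHA
    return nominal_delay_ps * scale

if __name__ == "__main__":
    # A 50 mV local drop stretches a 20 ps cell delay to roughly 21.8 ps here.
    print(f"{derated_delay(20.0, 50.0):.2f} ps")
```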
There is general consensus that electrical rule checking is now a critical part of the physical design process at advanced nodes.
“The traditional spacing rules for interconnect are now made much more complex and take into account electrical rules,” said Oliver King, CTO of Moortec. “In general, these things are handled by the EDA companies providing flows for ERC and foundries providing rule decks. Increasingly, design teams are being more aggressive with voltage levels on chip to squeeze power and optimize performance. This also needs to be taken into account as part of a voltage-aware sign-off flow.”
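A voltage-aware spacing rule can be pictured with a toy check like the one below, which demands more spacing between two nets as the potential difference between them grows. The spacing formula and the net voltages are invented for illustration; real voltage-dependent rules come from the foundry rule deck and are evaluated on extracted layout geometry.

```python
# Toy voltage-dependent spacing check (values are illustrative, not a rule deck).

def required_spacing_nm(delta_v):
    """Hypothetical rule: a base spacing plus an increment per volt of
    potential difference between the two nets."""
    BASE_NM = 24
    NM_PER_VOLT = 10
    return BASE_NM + NM_PER_VOLT * delta_v

def check_pair(net_a, net_b, actual_spacing_nm, net_voltage):
    """Return a violation message if two nets sit closer than the
    voltage-dependent rule allows, else None."""
    delta_v = abs(net_voltage[net_a] - net_voltage[net_b])
    required = required_spacing_nm(delta_v)
    if actual_spacing_nm < required:
        return (f"{net_a} <-> {net_b}: spacing {actual_spacing_nm} nm "
                f"< required {required:.1f} nm at dV = {delta_v:.2f} V")
    return None

if __name__ == "__main__":
    voltages = {"VDD_IO_net": 1.8, "core_sig": 0.75}
    message = check_pair("VDD_IO_net", "core_sig", 28, voltages)
    if message:
        print("VIOLATION:", message)
```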
Much of this also needs to be monitored on an ongoing basis, watching how a chip performs from inside and looking for potential problems that may crop up. Moortec has been monitoring thermal fluctuations, for example, which are a sign of aberrant behavior. That kind of data can be collected and analyzed using AI to identify patterns and outlier data points. Others, such as PDF, have been pushing for more structures on chip to send out data analyzing a variety of factors that occur on a die while it’s being used under real workloads.
“You can put structures onto a chip that monitor everything from contact/gate weakness to breakdowns in leakage,” said PDF’s Brozek. “You can do this in the fab from the scribe lines during manufacturing, but you also can instrument the surface of the chip where you can identify weak structures and latent defects that you would not otherwise see.”
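The analysis side of such monitoring can be as simple as screening sensor readings for outliers. The sketch below uses a median-absolute-deviation score, which tolerates a few hot readings better than a mean-based z-score; the sensor names, temperatures, and threshold are assumptions for this example, and production analytics on monitor data are considerably richer.

```python
# Illustrative outlier screen for on-chip thermal sensor readings.
import statistics

def flag_outlier_sensors(readings_c, threshold=5.0):
    """readings_c maps sensor name -> temperature in Celsius. Returns the
    sensors whose median-absolute-deviation score exceeds the threshold."""
    values = list(readings_c.values())
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [name for name, temp in readings_c.items()
            if abs(temp - med) / mad > threshold]

if __name__ == "__main__":
    sample = {"sensor_ne": 71.2, "sensor_nw": 70.8, "sensor_se": 72.0,
              "sensor_sw": 71.5, "sensor_core3": 93.4}
    print("Suspect sensors:", flag_outlier_sensors(sample))
```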
Multiple power domains
Over the last several years, the impact of low-power verification and multiple power domains on power signoff has become painfully evident, said Synopsys’ Ying. “What that really means to circuit simulation is a design will have multiple power domains, such that a transistor in one power domain needs to be very different than the transistor in another power domain. Further, if there’s a violation, it’s impossible to catch with a normal circuit simulation because the dynamic circuit simulator would be treating this as part of the design and then just run through the simulation.”
This is why designers at advanced nodes have been looking for a way to augment the dynamic simulation. “Dynamic simulation with SPICE is still the most common way to find electrical rule violations, but there are classes of problems that cannot be done easily with dynamic simulation alone as the coverage is just not enough. You need to exhaustively go through the design and look for these things. Another consideration is that dynamic simulation is driven by vectors, and you may not have the right vector to find the particular error,” he explained.
For these reasons, static approaches have grown in use, especially since they can be run at the SoC level.
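A simplified picture of such a static, vectorless check is sketched below: every device in the design is visited and its voltage rating is compared against the worst-case supply of the domain driving its gate, with no stimulus vectors involved. The device ratings, domain voltages, and data structures are assumptions made for this illustration, not how any particular static verification tool is implemented.

```python
# Illustrative static (vectorless) electrical-overstress check.
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    rated_vmax: float    # assumed maximum allowed gate voltage (V)
    gate_domain: str     # power domain driving the gate

# Hypothetical worst-case domain voltages (nominal plus 10% tolerance).
DOMAIN_VMAX = {"VDD_IO": 1.98, "VDD_CORE": 0.825}

def static_eos_check(devices):
    """Walk every device exhaustively -- no vectors needed -- and flag any
    whose gate can see more voltage than the device is rated for."""
    violations = []
    for d in devices:
        worst_v = DOMAIN_VMAX[d.gate_domain]
        if worst_v > d.rated_vmax:
            violations.append(
                f"{d.name}: gate driven from {d.gate_domain} "
                f"(up to {worst_v} V) exceeds {d.rated_vmax} V rating")
    return violations

if __name__ == "__main__":
    devices = [
        Device("M12_core", rated_vmax=0.9, gate_domain="VDD_IO"),
        Device("M7_io", rated_vmax=2.0, gate_domain="VDD_IO"),
    ]
    for v in static_eos_check(devices):
        print("VIOLATION:", v)
```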
But it’s not the only path to power signoff, and no single approach solves every issue. Mentor’s Robertson stressed that this must be addressed from an industry perspective because not every company has tools for each technology issue that needs to be simulated.
“This has to do with the time to develop new techniques,” he said. “This problem has always been there and [engineering teams] have always wanted automation and sophisticated techniques, but we had to do other things like make sure the DRC was fine and timing was fine, etc. Now, I believe we’ve reached critical mass within the industry to develop new circuit techniques, primarily out of need but also based on bandwidth, where there is now automatic checking for ESD, latch-up, and electrical overstress.”
Other players in this ecosystem include ANSYS, Cadence, Synopsys, and Silicon Frontline.
Standards
A standards organization, the ESDA (Electrostatic Discharge Association), has devised recommendations for ESD and latch-up, which is a start. Participating companies, including TSMC on the manufacturing side, are working toward adherence to these recommendations. GlobalFoundries and Samsung also are developing rules to comply with them.
“It’s a start,” Robertson said. “That’s a slice of probably the biggest concern when it comes to electrical rule checking, but it’s not everything. There’s still a lot of work to be done for other things like electrical overstress and temperature specs, but we’ve started as an industry to develop common best practices. It’s not just a single vendor solution. There are some good techniques from a variety of companies.”
Conclusion
So what should designers keep in mind when beginning a design to make sure that the ERC and other power issues will be covered? The best place to start, Robertson suggested, is to see what the foundry provides. “If the engineering team is designing at a leading node, then they probably have a good starting point from the leading foundry providers because there are checks and verification that are in DRC, LVS, and other physical verification and power verification tools. Before running into a verification error, the design team should make sure they understand those methodologies so that they’re designing with those requirements in mind.”
Then, if the foundry offering falls short of the design goals or some internal best practices that are not covered, such as electrical overstress or negative-bias temperature instability, design teams need to reach out to their suppliers and tools vendors. At advanced nodes in particular, one size does not fit all. “If you’re in the mobile space, you may have different concerns about electrical overstress than if you’re in high performance servers,” he said.
You can’t work at that level without functional mixed-signal tools that understand power and thermal effects. Cadence broke Verilog-AMS out of the gate, Synopsys and Mentor never implemented it properly, and they now refuse to support even discrete modeling of analog effects in SystemVerilog.
A personal pet peeve I’m going to rant about here. Having been involved in the manual design review side, the rule writing, the DRC/LVS coding, and the process library development of ESD, latch-up, and EOS rules for almost 25 years now, I have this to say: too many button pushers. Everyone expects the tools to do it all. The problem I routinely face is customers arriving at signoff with 100,000+ ESD/LUP/EOS violations to review, because the people writing the rules don’t think like programmers, the programmers coding the rules don’t understand what they are coding, and the burden of dealing with false flags gets pushed onto customers, who are then forced to sign waivers for rules they don’t understand and can’t really fix without a completely inefficient use of silicon. There are lots of good tools out there, but it’s hard to get them all integrated or process-calibrated correctly. Consequently, these designs routinely require more and more manual review manpower just to deal with false flags and tool issues.
Agreed!
I can understand your pain, as I’ve witnessed millions of false messages reported when running Conformal Low Power, which is just another power signoff tool.