Modeling On-Chip Variation At 10/7nm

Timing and variability have long been missing from automated transistor-level simulation tools. At advanced nodes, an update will be required.


Simulation, a workhorse tool for semiconductor design, is running out of steam at 10/7nm. It is falling behind on chips with huge gate counts and an enormous number of possible interactions between all the different functions that are being crammed onto a die.

At simulation’s root is some form of SPICE, which has served as its underpinnings ever since SPICE was first published 44 years ago. But simulation now is being stretched in many ways that were never considered when it was first introduced.

Consider, for example, modeling of on-chip variation. Even though statistical timing analysis tools might indicate that the paths are okay, for example, flop hold constraints may not model enough variation. Failure to accurately account for that variation can delay or even prevent timing closure.

“Especially for 10nm and 7nm, supply voltages are coming down dramatically,” said Vic Kulkarni, vice president and chief strategist in the Office of the CTO for ANSYS’ semiconductor business unit. “We are looking at 0.5 to 0.45 volts, but the threshold has not come down that much, and what happens is that the delta between VDD and Vt (the power supply versus the threshold) creates a lot of variability issues. High Vt/low VDD is the key culprit of what causes variability issues.”
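As a back-of-the-envelope illustration of that point (the voltage and sigma values below are assumptions for illustration, not foundry numbers), the same Vt spread consumes a far larger share of the gate overdrive, VDD minus Vt, at a 0.45V supply than at the supplies used at older nodes:

```python
# Back-of-the-envelope illustration (values are assumptions, not foundry data):
# the same Vt spread eats a far larger share of the gate overdrive (VDD - Vt)
# at near-threshold supplies than at older-node supplies.
def overdrive_sensitivity(vdd, vt=0.35, sigma_vt=0.03):
    """Return nominal overdrive and the fraction of it consumed by a 3-sigma Vt shift."""
    overdrive = vdd - vt
    return overdrive, 3 * sigma_vt / overdrive

for vdd in (0.90, 0.65, 0.45):
    ov, frac = overdrive_sensitivity(vdd)
    print(f"VDD={vdd:.2f} V  overdrive={ov:.2f} V  3-sigma Vt shift = {frac:.0%} of overdrive")
```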

The variability issues are many and complicated. Among them: non-Gaussian delay distributions across stages and non-Gaussian constraint variation; a large mean-to-nominal shift at the end of the path; slew-dependent constraints and correlated delay effects; and waveform-shape propagation issues.

Kulkarni noted that it is the non-Gaussian distribution across multiple stages that makes timing much more difficult to close. “For example, let’s say there are 10,000-plus paths in today’s SoC at 10nm and sub-10nm. There are so many timing violations because of variability, and the distribution is non-Gaussian in terms of what happens because of the variability. So it cannot be predicted properly, because it is not well-defined, well-behaved transistor behavior. This non-Gaussian constraint variation causes timing closure issues and slew rate changes, and these correlation effects add up. They are additive in terms of how long the path is. That’s why so many paths can’t meet timing at tapeout.”
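A small Monte Carlo sketch can make the path-level effect concrete. The numbers and the lognormal stage model below are illustrative assumptions, not any signoff tool's engine: each stage delay gets a skewed variation term, and the resulting path distribution shows the growing mean-to-nominal shift Kulkarni describes.

```python
import random
import statistics

# Hypothetical illustration: each stage delay = nominal + a skewed (lognormal)
# variation term, so the path-level distribution is non-Gaussian and its mean
# drifts away from the simple sum of nominal stage delays as depth grows.
random.seed(0)

NOMINAL_STAGE_DELAY = 10.0   # ps, assumed
N_SAMPLES = 20000

def path_delay_samples(n_stages):
    samples = []
    for _ in range(N_SAMPLES):
        total = 0.0
        for _ in range(n_stages):
            variation = random.lognormvariate(0.0, 0.5) - 1.0  # median at zero, positive skew
            total += NOMINAL_STAGE_DELAY + variation
        samples.append(total)
    return samples

for depth in (5, 20, 50):
    s = path_delay_samples(depth)
    nominal = depth * NOMINAL_STAGE_DELAY
    print(f"{depth:2d} stages: nominal={nominal:6.1f} ps  "
          f"mean={statistics.mean(s):6.1f} ps  "
          f"mean-to-nominal shift={statistics.mean(s) - nominal:5.1f} ps  "
          f"sigma={statistics.stdev(s):4.1f} ps")
```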

And it’s getting harder at each new node. Variability is, in effect, another dimension of timing, so the two need to be analyzed together rather than separately.

“With advanced nodes like 10nm and below, the variability is putting a lot more pressure on the timing margin in the signoff tools and simulation engines,” said Sathishkumar Balasubramanian, product management and marketing lead for the AMS verification products at Mentor, a Siemens Business.

At older nodes, designers typically pad their timing margins so they can ignore variability. That’s not possible at 10/7nm.

“With 7nm, if you look at the supply voltages, they are very, very close to the threshold, or Vt, of the transistors,” said Balasubramanian. “That doesn’t leave you a lot of room to work with, and if you start putting in margins, that affects your performance and your power footprint. So you have to take into account two things. One, your simulator has to be much more accurate to address these power needs. We used to talk about nanoamps. Now we are talking about picoamps. And two, simulating just one transistor is easy. You can take any industry-standard SPICE simulator and get the accuracy.”

The reason for this is that some types of circuits, embedded SRAM for example, are built from regular finFET transistors; that’s not true for flash or DRAM. The challenge is one of scale. Each SRAM bit cell is only six transistors, but a 512Kb array multiplies that into millions of transistors that all need to be verified together. A highly accurate simulator alone isn’t going to help, because the simulation can’t be run fast enough, or within the required memory footprint, for a circuit as big as an SRAM, he explained.

Accuracy matters
The golden reference for simulation accuracy has long been traditional SPICE, and engineering teams typically looked for simulation tools to come within 10% of that reference. The bar has now tightened to within 2% in some cases.
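In practice that requirement boils down to a relative-error comparison against golden SPICE measurements. A minimal sketch, with made-up delay numbers standing in for extracted results:

```python
# Generic accuracy check against a golden SPICE reference (illustrative only).
# Measurements might be delays, currents, or voltage levels extracted from both runs.
def within_tolerance(golden, fast, tol=0.02):
    """Return (worst relative error, pass/fail) for paired measurements."""
    worst = max(abs(f - g) / abs(g) for g, f in zip(golden, fast))
    return worst, worst <= tol

golden_delays = [12.4, 31.7, 8.9]    # ps, assumed golden-SPICE results
fast_delays   = [12.6, 31.2, 9.05]   # ps, assumed accelerated-simulator results
worst, ok = within_tolerance(golden_delays, fast_delays)
print(f"worst relative error = {worst:.1%}, {'PASS' if ok else 'FAIL'} at 2%")
```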

“That’s very tough, because if you’re designing an SRAM, for example, it’s a big circuit,” said Balasubramanian. “The optimization options in any simulator, like RC reduction, are shortcuts in terms of how the equations get solved. Keep all those tricks that the simulators employ and you lose accuracy; take them away and the simulation is too slow for a circuit that size. It’s a Catch-22 situation. That’s where the challenge is. The challenge has always been between what the designer wants and what current tools can help with.”

So do commercial tools today deliver everything the design and verification team needs?

Many say they do. “Generically speaking, there are two different kinds of simulators. One is a Fast-SPICE, the other is a traditional SPICE-level simulator,” he said.

Fast-SPICE used to dominate for huge circuits. Designers traded accuracy for speed and could get away with it. Now, however, these simulators are running out of steam. Design teams need the accuracy, but they can no longer afford the tricks. As a result, many companies are going back to basics.

“From Fast-SPICE, people are coming back and saying they are now okay with using multiple cores for the simulators,” he noted. “They will give more memory as long as they get the accuracy. So there is a movement underway of users moving from Fast-SPICE to traditional accelerated SPICE.”

There is a large effort underway in the industry to deal with these challenges. One piece of it is the Liberty Variation Format (LVF). In March, new statistical moment-based extensions to LVF were ratified by the Liberty Technical Advisory Board. The extensions provide a more precise static timing model of the non-Gaussian variation observed in designs operating at near- and sub-threshold voltages. Applications include mobile and IoT IC designs.
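Conceptually, the moment-based extensions describe a timing arc's variation with more than a single sigma. The sketch below (plain Python with made-up Monte Carlo samples, not Liberty syntax) computes the three quantities such a model captures: the mean shift from nominal, the standard deviation, and the skewness.

```python
import random
import statistics

# Sketch of the three quantities a moment-based variation model captures for a
# timing arc: mean shift from nominal, standard deviation, and skewness.
# (Illustrative Python only -- this is not Liberty/LVF syntax.)
random.seed(2)

def arc_moments(samples, nominal):
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    skew = sum(((x - mean) / sigma) ** 3 for x in samples) / len(samples)
    return mean - nominal, sigma, skew

nominal_delay = 25.0                                               # ps, assumed
mc_delays = [25.0 + random.lognormvariate(0.0, 0.4) for _ in range(5000)]
shift, sigma, skew = arc_moments(mc_delays, nominal_delay)
print(f"mean shift={shift:.2f} ps  sigma={sigma:.2f} ps  skewness={skew:.2f}")
```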

Current is still king
There is an analog side to all of this, as well, where current is, and always will be, king.

This is also where things can get extremely tricky, because a design that contains both analog and digital content has to be navigated as a whole. Many of the issues noted above show up on the digital side, but they involve ‘analog-like’ concerns such as threshold voltage, pointed out Steven Lewis, product marketing director for mixed-signal solutions at Cadence. In other words, the ‘digital stuff’ is acting like analog.
 
At the same time, there is still analog stuff to contend with on the chip, and there are three choices, he said:
 
1. Model it in digital (wreal modeling, for instance). The advantage is that everything goes to a digital simulator. The disadvantage is that there may be accuracy issues, as described above. (A rough sketch of this approach follows the list.)

2. Go with mixed-signal simulation. Model the analog and digital separately, using either transistors or AMS modeling.

3. Go wild and keep the analog analog. Use available techniques to discover the variation issues occurring in the analog blocks, and see whether any of them are significant enough to impact the timing of the design, something that might be pinned down more accurately with mixed-signal simulation.
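As a rough flavor of option 1, here is a real-valued behavioral stand-in for an analog block, sketched in Python rather than Verilog-AMS wreal for brevity. The single-pole RC response is a hypothetical example; the point is that the analog block collapses to a cheap discrete-time update a digital-centric flow can evaluate, at the cost of ignoring transistor-level effects.

```python
import math

# Behavioral stand-in for an analog block (hypothetical single-pole RC filter),
# expressed as a real-valued discrete-time update -- the same idea as wreal-style
# modeling, sketched in Python rather than Verilog-AMS for brevity.
def rc_step_response(r_ohms, c_farads, vin, t_step, n_steps):
    """Return sampled output of a first-order RC driven by a constant input."""
    alpha = 1.0 - math.exp(-t_step / (r_ohms * c_farads))
    vout, samples = 0.0, []
    for _ in range(n_steps):
        vout += alpha * (vin - vout)   # one discrete-time update per digital timestep
        samples.append(vout)
    return samples

out = rc_step_response(r_ohms=10e3, c_farads=1e-12, vin=0.45, t_step=1e-9, n_steps=100)
print(f"settled to {out[-1]:.3f} V of a 0.45 V input after 100 ns")
```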
 
Lewis stressed, “It is the hand-over between the domains that is tricky. We know we have variation and timing issues at nano-nodes. Discovering the culprit and fixing it is the work we do in EDA.”

Further, for analog designers, what’s interesting about transistor technology at the most advanced nodes is that it’s very much “back to the future.”

“FinFETs are very specific—they have very specific shapes, and there are very specific ways to build them,” he continued. “You are limited in terms of the number of fins you can use. A lot of the degrees of freedom that analog folks had with planar transistors are gone—20nm was sort of the last hurrah for planar transistors. Once we went to 16, it pretty much became a finFET ball game. It’s almost taking them back to their old days, where there were discrete components that they just had to make work. Also, finFET technology allows us to be very specific with them. We know how they are going to act, and we know what the physics of that finFET transistor is going to be. That gives us more degrees of freedom in terms of how to analyze them, and in terms of the assumptions we make in the simulation when we are looking at them. It also allows us to think about them when it comes to variability. It’s this discrete nature that helps us make some new assumptions that we couldn’t make in the planar world, because there was simply too much variability and too much freedom for the engineer.”

While this adds benefits, it also constrains how a designer can build things. That discrete nature of the transistors has encouraged the development of new approaches to variation analysis. One of these is referred to as sample re-ordering.

“In the analog world, we typically talk about Monte Carlo statistics,” said Lewis. “It’s what analog designers do to look at the statistical variability of their design. Over the years, the industry has developed different ways of approaching the problem, ways of speeding it up, because it’s always a long haul depending on how many Monte Carlo runs you’re going to do. You take the Monte Carlo and make corners out of it, or you take your corners and make a Monte Carlo out of them; there are all kinds of games we have played in the past. What was introduced when we went to 16nm is a technology developed with TSMC — the ability to do sample re-ordering. Because we know how the transistors are going to work, and we know what their shapes are going to be, they are limited in their scope. From that, we can make assumptions and use sensitivity analysis to determine what the tails of those finFETs are going to look like.”

Some of the most interesting characteristics show up in the finFET tails.

“The tails are not quite the smooth rolloff they once were with the planar transistors, so analog designers need to figure out what they are doing out in the four-, five-, six-sigma realm,” Lewis said. “Obviously they cannot run that many Monte Carlos, because statistically speaking that’s millions of Monte Carlo runs if we were looking at pure mathematics.”
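The general idea behind such re-ordering can be sketched simply. The sensitivities, sample counts, and the stand-in "SPICE" evaluation below are invented for illustration; this is not TSMC's or any vendor's actual algorithm. The point is that a cheap sensitivity-based ranking decides which samples deserve the expensive transistor-level runs, so the tail can be probed without millions of full simulations.

```python
import random

# Simplified sketch of sample re-ordering: rank Monte Carlo samples using a cheap
# linear sensitivity estimate, then spend expensive SPICE runs only on the samples
# predicted to sit farthest out in the tail. Sensitivities and the "expensive"
# evaluation below are made up for illustration.
random.seed(1)

SENSITIVITY = {"vt": 4.0, "fin_height": 1.5, "gate_length": 2.5}  # assumed, per unit variation

def draw_sample():
    return {p: random.gauss(0.0, 1.0) for p in SENSITIVITY}

def cheap_estimate(sample):
    # Linearized delay push-out estimate, used only for ranking.
    return sum(SENSITIVITY[p] * sample[p] for p in SENSITIVITY)

def expensive_spice_run(sample):
    # Stand-in for a full SPICE simulation of the cell with these parameters.
    return 100.0 + cheap_estimate(sample) + random.gauss(0.0, 0.5)

samples = [draw_sample() for _ in range(100_000)]               # cheap to generate
ranked = sorted(samples, key=cheap_estimate, reverse=True)      # worst-first ordering
tail_results = [expensive_spice_run(s) for s in ranked[:200]]   # only 200 real runs
print(f"estimated tail delay (worst of 100k samples): {max(tail_results):.1f} ps")
```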

While this is complex, it’s better than the alternative, which is overbuilding. And this, in turn, is driving a closer coupling between front-end analog design and the layout.

Lewis said that what happens on the analog side is the design team tries to find the best fitting finFET, then they look at the layout and see what can be done to move things a little bit. “Maybe it’s making it larger, maybe it’s just moving things away from a power supply, maybe it’s putting a guard ring around a supersensitive pair of transistors. We’ll look at the layout and see how we can solve it as well.”

It’s not uncommon for engineering teams to find, when they go through the physical implementation of the design, that they have a lot of new things to learn. It’s not the same as the old planar days.

“If we know from the front end how much current a route is supposed to carry, it’s one thing to say the route is DRC-correct,” Lewis explained. “It has a minimum width, it has the right spacing. You can say you have the design rulebook from the foundry and it says a route needs to look like that. But in an engineer’s mind what it comes down to is, ‘Yes, we are all trying to go for minimums, and minimum spacing between the routes. Yes, you can take it to a point where you can still build it, but is that route now enough to handle the current that’s going down it?’ This is where the analog hat comes on. From the front-end simulation we know that this is going to be a picoamp, or a milliamp, or a microamp worth of current going down this line, and we need it to be as strong when it’s leaving this pin as it is when it approaches the pin at the connector.”

Electrically-aware design (EAD) technology allows the layout engineers to look at both of those at the same time.
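A back-of-the-envelope version of the kind of check Lewis describes might look like the following. The current-density limit and route currents are assumptions for illustration, not foundry rules: given the current the front-end simulation predicts on a route, is the drawn width enough?

```python
# Back-of-the-envelope route-current check. The current-density limit and the
# route numbers are illustrative assumptions, not foundry design rules.
MAX_CURRENT_DENSITY_A_PER_UM = 1.0e-3   # assumed limit per micron of route width

def min_route_width_um(current_a):
    """Minimum route width needed to carry the given current under the assumed limit."""
    return current_a / MAX_CURRENT_DENSITY_A_PER_UM

routes = {
    "bias_line":   5e-6,    # 5 uA from front-end simulation (assumed)
    "output_rail": 2e-3,    # 2 mA (assumed)
}
drawn_width_um = 0.05       # DRC-minimum width for this layer (assumed)

for name, current in routes.items():
    need = min_route_width_um(current)
    verdict = "OK at minimum width" if need <= drawn_width_um else f"needs >= {need:.2f} um"
    print(f"{name}: {current*1e3:.3f} mA -> {verdict}")
```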

All the while, 5nm is approaching fast, and the question remains whether transistor-level simulators will hold up.

“At 5nm and below there are still finFET-based architectures, and there will be other novel approaches such as the one from IBM and Samsung and GlobalFoundries,” noted Mentor’s Balasubramanian. “The basic architecture we have seen so far at 5nm will definitely hold right now with the simulators, at least from the perspective of the structures, such as the fundamental models, but we don’t know how much complexity there will be in terms of the models. It’s still too early. The simulators should be able to handle it, but it all depends on what the fundamental circuit is. For 5nm, finFET should be okay, but the moment they change into a totally different architecture, we don’t know yet. It’s going to be a challenge.”



1 comment

Kev says:

SPICE is only the root of classical analog simulation. There are many levels of modeling between that and the 1s & 0s of Verilog. It’s possible to use SPICE and MC at the cell level and create discrete (fast) models that embody variation of FinFET level design and can be used to give you the equivalent of MCing an entire chip.
