Random variability, far more corner cases and the limits of CMOS create ugly problems for chip developers at advanced nodes.
By Ed Sperling
Timing closure, a basic operation in chip design and development, is becoming anything but basic at advanced process nodes.
Systematic variability that was at least predictable at 90nm has become random at 45nm. Tools that worked fine with two corner cases now have to deal with hundreds. And as more functions make their way onto a single die, often with multiple modes of operation and multiple voltages, getting closure for the entire SoC is becoming increasingly difficult because what is done in one area often has unexpected ripple effects in another area.
Consider a smart phone, for example. Even at 90nm, there were often separate chips for multimedia functions such as cameras or MP3 players. At 45nm, these are all on a single chip, a mix of analog and digital functionality, often with different voltages and different modes of operation for power savings.
“In the past, we all knew that a process ran faster at cold temperatures,” said Marco Casale-Rossi, product marketing manager for Synopsys’ Galaxy Implementation Platform. “That’s no longer the case. At 90nm there was a temperature inversion. Now it’s different depending on the cell. In some cells it runs faster as temperatures increase.”
The effect on timing closure is significant. There are now multiple voltages to contend with, several temperature corners and multiple process corners.
Fig. 1: The new physics. (Source: Synopsys)
“If you multiply all of these together you might have 60 to 100 different scenarios to validate,” Casale-Rossi said. “But if you find a violation it might not be easy to fix. The problem is that if you fix one, you may destroy other scenarios. The ability to keep everything under control is very important. That started showing up at 65nm. It gets worse as you go down to the next process nodes.”
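The multiplication Casale-Rossi describes can be sketched in a few lines of Python. The modes, voltages, temperatures and process corners below are hypothetical values chosen only to show how quickly the scenario count grows. They are not taken from any foundry signoff list.

```python
from itertools import product

# Hypothetical values, chosen only for illustration.
modes = ["functional", "standby", "test"]
voltages = [0.9, 1.0, 1.1]            # supply voltage in volts
temperatures = [-40, 25, 125]         # degrees Celsius
process_corners = ["slow", "typical", "fast"]


def enumerate_scenarios(modes, voltages, temperatures, process_corners):
    """Return every combination of mode, voltage, temperature and process corner."""
    return list(product(modes, voltages, temperatures, process_corners))


scenarios = enumerate_scenarios(modes, voltages, temperatures, process_corners)
print(len(scenarios))  # 3 * 3 * 3 * 3 = 81
```

With only three values per axis the count is already 81, and adding one more voltage or operating mode pushes it past 100.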
Perhaps even worse, variability is becoming increasingly hard to model. At advanced nodes it has changed from systematic to random.
“Performance is impacted by thickness, and transistors are only two to four atoms thick,” he said. “If you miss one layer, your performance will change 25%. If you add a layer, it will change 25%. Because of atomic-level construction, timing at 32/28nm may oscillate 50%. You can design for 1GHz and it may run at 500MHz or 1.5GHz. You cannot be sure. And that’s just transistors. An interconnect at 32nm is 20 times bigger and at 20nm it’s 50 times bigger than at 45nm. A small change in the way a signal is routed has a huge impact on power and timing closure.”
What’s the time?
The trouble with timing is that it now has to be addressed all across the design flow, and many of the tools that were created don’t effectively deal with more than a couple of corner cases.
Sudhakar Jilla, director of marketing for place and route at Mentor Graphics, said that until 90nm designers only had to account for two corners—fast and slow. At 40nm, TSMC is requiring signoff on eight corners.
“When you look at place and route tools, most of them are at least 10 years old,” Jilla said. “At 0.25 or 0.18 (microns) you had to make sure two corners were not violated. At 40nm, you have different process variations and different voltages. There are inter-die variations and intra-die variations and there are manufacturing variations. There are also aging effects. What happens to a transistor after two or three years and what happens after 10 years? In the past, you had one corner for timing. Now you have a corner sensitive to timings, another for silicon and another for power.”
So far, no tools effectively work across the entire space, although there is a big push across the board to take advantage of what is effectively a breaking point for the old tools. Mike Gianfagna, vice president of marketing at Atrenta, said there is a major disconnect between the back-end of the flow in dealing with this problem and the front-end design and architecture.
“The problem is that this traditionally has been left to the back-end team and it’s been a missionary sell to the front-end,” said Gianfagna. “The front-end guy puts together the timing and throws it over the wall to the back end, but the back end doesn’t understand the design intent. To leave it to the back end folks is asking for trouble.”
Perhaps even worse—particularly in a time-sensitive chip, which these days includes most of them—is that the existing tools are very slow. “PrimeTime is too slow to be useful,” said Gary Smith of Gary Smith EDA. “There’s a new release of the product, but because there are more corners there are really long run times.”
He noted that some companies actually aren’t achieving timing closure across the entire SoC because of the time it takes to complete that timing and the complexity of the task. That’s a recipe for problems.
Getting in sync
There are several options available. One is a brute-force approach using multithreading, multicore and distributed versions of all tools, including Synopsys’ PrimeTime static timing analysis tool.
A second approach is being more accurate in what to fix, particularly at the verification level. “If there are 100 different possible scenarios, you need to understand which ones are dominant,” said Casale-Rossi. “You verify only when you are reasonably confident of the results.”
A third option is vertical stacking of die, which is still in R&D. The advantage there is using the right process technology for each function. A 180nm analog process, for example, can be combined with a 28nm memory with a through-silicon via and still achieve the same or better performance because the distance a signal needs to travel is actually shorter vertically than horizontally.
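One way to read the second option, finding the dominant scenarios, is as a pruning step: a scenario can be skipped if some other scenario has equal or worse slack on every timing path. The sketch below assumes per-path slack numbers for a single kind of check (setup, for instance) are already available from an earlier analysis. The scenario names, path names and slack values are hypothetical, and this is not a description of how any commercial tool selects scenarios.

```python
def dominates(a, b):
    """True if scenario `a` is at least as bad as `b` on every path.

    `a` and `b` map path name -> slack in ns; lower slack is worse.
    Both are assumed to cover the same set of paths.
    """
    return all(a[path] <= b[path] for path in b)


def dominant_scenarios(slacks):
    """Return the names of scenarios that no other scenario dominates."""
    names = list(slacks)
    keep = []
    for i, name in enumerate(names):
        dominated = False
        for j, other in enumerate(names):
            if i == j:
                continue
            if dominates(slacks[other], slacks[name]):
                # Identical scenarios dominate each other; keep only the first.
                if slacks[other] != slacks[name] or j < i:
                    dominated = True
                    break
        if not dominated:
            keep.append(name)
    return keep


# Hypothetical slack values in nanoseconds.
slacks = {
    "slow_0.9V_125C": {"p1": -0.10, "p2": 0.05, "p3": 0.20},
    "slow_0.9V_-40C": {"p1": -0.05, "p2": -0.02, "p3": 0.25},
    "typ_1.0V_25C": {"p1": 0.10, "p2": 0.08, "p3": 0.30},
    "fast_1.1V_-40C": {"p1": 0.30, "p2": 0.20, "p3": 0.40},
}
print(dominant_scenarios(slacks))
```

In this made-up data the two slow scenarios survive because each is worst on a different path, which echoes the temperature-inversion point made earlier: neither the hot nor the cold corner can be dropped. A real flow would repeat the comparison for other checks, since the worst scenario for one check is not necessarily the worst for another.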
Conclusion
There has been much talk about holistic design, but at most companies architecture and design are not part of a holistic approach. Neither is software development, which remains a separate world even though the majority of engineers being hired at chip companies are software engineers. But engineers in many companies continue to function in silos rather than as part of end-to-end teams.
“The work gets done, although it’s painful and highly iterative,” said Gianfagna. “But it’s because the work gets done that there hasn’t been a big change in thinking.”
Two things may have a significant effect on that way of thinking. One is that CMOS is running out of steam. It’s getting harder to get the job done using standard processes and manufacturing, which will force companies to re-think exactly what they create and how to go to market. While companies like Intel and IBM have road maps stretching several nodes more, it’s not certain that the bulk of chipmakers will follow the same path.
Second, the problems encountered at 28nm will get much, much worse at 22nm. Each problem encountered at previous nodes will get compounded at 22nm. There will be more corners to deal with, more timing closure issues and more problems with leakage and process variability. And at some point, companies will have to decide if they can hit a market window or be more competitive with proven older technology nodes using new tricks.