Lowering the voltage is still seen as the best way to save energy, but getting there is becoming far more complicated.
By Ed Sperling
Scaling down the voltage to boost battery life and cut energy costs has always been considered the best option, but it’s getting more difficult at advanced nodes and in stacked die packages.
The key problems are noise and leakage. Lowering the voltage exacerbates both of them, forcing a rethinking of the whole design process starting at the architectural level and continuing to new tools that will be required for existing flows, new packaging approaches, new gate structures, potentially new materials, and some really thorny business tradeoffs. If this were a game of chess there would be multiple opponents, the board would be multidimensional, the chess pieces would conspire among themselves, the rules would change constantly, and those with the most powerful pieces at the end wouldn’t necessarily win.
From a process standpoint alone, these problems have been creeping in since about the 90nm node. Add to that more functionality on a single chip, with more power supplies, more voltage islands, thinner gate oxides, lower signal-to-noise ratios and greater leakage, and the game becomes more challenging still. In short, this is a tough problem to solve, and it's getting harder.
“You always want to have voltage as low as possible,” said Philippe Magarshack, group vice president at STMicroelectronics. “And we are getting better at understanding variability in the process. But there will be more than one voltage. It will be a different voltage for memory than for the rest of the transistors. What’s changing is that there is more playing around on the edge of the design to see what is the lowest voltage that can be used.”
He noted that by using fully depleted silicon on insulator at 20nm, and FinFETs at 14nm, there will be some gains as well as the need for fewer dopants in the channel. Stacking die also will add some benefits in reducing power drop, and there is more of an understanding of how to use power islands and dynamic voltage scaling for even mainstream designs. But knocking the voltage down is still challenging, and it will become even more challenging in the future.
Noise, IR drop and stacked die
Noise is one of the biggest problems that engineers encounter when lowering the voltage. There is simply less margin for managing signal-to-noise ratio.
“As you scale down, noise increases,” said Barry Pangrle, solutions architect for low power design and verification at Mentor Graphics. “But so does the tolerance for additional noise. One of the commonly used techniques has been clock gating, but that can change the characteristics of current on a grid and cause IR drop. What we’ve seen is that large variations in current are a source of noise.”
IR drop, in particular, is the bogeyman of low-power engineering. It is the voltage drop across the resistance of the power delivery network, and it both wastes power and eats into the supply that the transistors actually see. With the supply voltage at 14nm projected to be somewhere around 0.4 volts, IR drop will push more designs closer to their noise margins.
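To see why, consider a back-of-the-envelope model: the droop across the power grid is set by Ohm's law, so a roughly fixed number of millivolts consumes a growing share of the noise margin as the supply shrinks. The sketch below uses purely illustrative current, resistance and margin numbers, not figures from any specific process.

```python
# Back-of-the-envelope IR-drop model (illustrative numbers, not from any
# specific process): the droop across the grid is fixed by I and R, so
# its share of the noise margin grows as the supply voltage shrinks.

def ir_drop_budget(vdd, i_switch, r_grid, margin_fraction=0.1):
    """Return the IR drop (volts) and the fraction of the noise margin
    it consumes, assuming the margin is a fixed fraction of vdd."""
    v_drop = i_switch * r_grid          # Ohm's law: V = I * R
    noise_margin = margin_fraction * vdd
    return v_drop, v_drop / noise_margin

for vdd in (1.2, 0.9, 0.4):             # older nodes down to the ~14nm target
    drop, used = ir_drop_budget(vdd, i_switch=1.0, r_grid=0.02)
    print(f"Vdd={vdd:.1f}V: IR drop={drop * 1000:.0f}mV, "
          f"{used:.0%} of noise margin consumed")
```

The same 20mV of droop that consumed about a sixth of the margin at 1.2 volts consumes half of it at 0.4 volts.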
Some of these dissipation effects will be minimized by the advent of FinFETs, which provide better control of gate leakage, and by new substrate materials such as silicon on insulator. But another trend underway is to stack die rather than just shrink all the features, particularly for analog portions of the chip. Both noise and IR drop can affect adjacent die, in part because the die are thinner, and in part because they're connected by through-silicon vias.
“The noise on a signal can be propagated through different die,” said Qi Wang, technical marketing group director of solutions marketing at Cadence. “We’re already starting to see a wide spectrum of voltages on chips. They don’t have to be on a single die to cause problems.”
Making tradeoffs
So what’s the solution? The answer has always included a combination of new materials that can be implemented for a reasonable cost (copper, low-k dielectrics and SOI, for example) and design advances such as FinFETs and, ultimately, tunnel FETs. It also has included a conscious tradeoff among area, power and performance. In the past several years, however, with the emphasis on mobility and on better efficiency in data centers, energy efficiency has emerged as the No. 1 goal. Lowering the voltage has always been considered the fastest way to achieve it, despite the problems associated with doing so.
“At 700mV, power consumption can be reduced by half,” said Aveek Sarkar, vice president of support and product engineering at Apache Design. “But as we move from 28 to 22nm and beyond, the big worry is gate oxide thickness or the maximum supply voltage before the dielectric degrades. The other issue is that if you continue to drop the voltage, performance decreases. So you’ve got a tradeoff between performance, power and reliability.”
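Sarkar's "half the power" figure follows from the first-order dynamic power relation P = CV²f: dropping a nominal 1.0-volt supply to 0.7 volts scales dynamic power by 0.7², or roughly 49%. A minimal sketch, assuming a 1.0-volt baseline and constant frequency:

```python
# First-order dynamic power: P = C * V^2 * f. Dropping a nominal 1.0V
# supply to 0.7V cuts dynamic power to 0.7^2 = 49% of baseline, roughly
# the "half at 700mV" figure quoted above. Frequency is held constant
# here; in practice a lower supply also limits achievable frequency.

def dynamic_power(c_eff, vdd, freq):
    """Switched-capacitance power model: c_eff in farads, vdd in volts,
    freq in hertz. Returns watts."""
    return c_eff * vdd ** 2 * freq

baseline = dynamic_power(c_eff=1e-9, vdd=1.0, freq=1e9)   # 1.0 W
scaled = dynamic_power(c_eff=1e-9, vdd=0.7, freq=1e9)     # 0.49 W
print(f"power at 0.7V relative to 1.0V: {scaled / baseline:.0%}")
```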
He noted that at 14nm, the technology itself will require a lower supply voltage. “The devices and the interconnects need lower voltage. You can’t have a 1.2 volt power supply with those dielectrics.”
Insulator thickness at advanced nodes is a huge consideration. With gate oxides as thin as 5 to 7 atomic layers, process variation that’s off by just one atomic layer can have a huge effect on transistor performance.
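The arithmetic is unforgiving. Gate capacitance scales inversely with oxide thickness to first order, so a one-layer deviation on a nominally six-layer oxide moves the capacitance, and with it drive current and leakage, by double-digit percentages. A rough sketch using the 5-to-7-layer range cited above:

```python
# Sensitivity of gate capacitance to oxide thickness. To first order
# C_ox = eps_ox / t_ox, so one atomic layer out of a nominal six shifts
# C_ox, and with it drive current and leakage, by double-digit amounts.
# The 5-to-7-layer range comes from the text; the model is first-order.

LAYERS_NOMINAL = 6
for layers in (LAYERS_NOMINAL - 1, LAYERS_NOMINAL, LAYERS_NOMINAL + 1):
    c_ratio = LAYERS_NOMINAL / layers    # C_ox scales as 1 / thickness
    print(f"{layers} layers: C_ox shifts by {c_ratio - 1:+.0%}")
```

One layer too few swings the capacitance by +20%; one too many swings it by roughly -14%.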
Several other approaches are in use or being considered to deal with these problems. One approach is to dynamically scale voltage, rather than keep it constant. This technique has been widely used already. A second, particularly in stacked die, is to use different power sources for different die, which can help alleviate some of the IR problem. Still another is to re-architect the entire power delivery network, which creates its own set of problems.
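For the first of those approaches, a dynamic voltage scaling policy at its simplest picks the lowest voltage/frequency operating point that still meets the workload's deadline. The sketch below is a hypothetical illustration; the operating-point table and workload numbers are invented for the example.

```python
# Minimal dynamic voltage scaling policy: choose the lowest
# voltage/frequency operating point that can retire the pending work
# before its deadline. The operating-point table is hypothetical.

OPERATING_POINTS = [   # (vdd in volts, frequency in GHz), low to high
    (0.7, 0.8),
    (0.9, 1.4),
    (1.1, 2.0),
]

def pick_operating_point(cycles_pending, deadline_s):
    """Return the lowest-voltage point that meets the deadline,
    falling back to the fastest point if none does."""
    for vdd, freq_ghz in OPERATING_POINTS:
        if cycles_pending / (freq_ghz * 1e9) <= deadline_s:
            return vdd, freq_ghz
    return OPERATING_POINTS[-1]

print(pick_operating_point(1.5e6, 1e-3))   # busy: picks (1.1, 2.0)
print(pick_operating_point(0.5e6, 1e-3))   # light load: picks (0.7, 0.8)
```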
“In CPF 2.0 there is a special section for modeling of LDOs (low dropouts),” said Cadence’s Wang. “Initially we had one or two LDOs and you could design them by hand. Now there are 20 or 30 and they depend on each other. The output of one is the input for another. That’s a better way to control the power delivery.”
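The dependency Wang describes, where one regulator's output is another's input, is what makes hand analysis impractical at 20 or 30 LDOs. A toy model (not CPF 2.0 syntax, and with hypothetical dropout values) of checking a cascaded chain end to end:

```python
# Toy model of a cascaded LDO chain, where the output of one regulator
# feeds the next. Each stage needs v_in >= v_out + dropout to regulate,
# so the chain has to be checked end to end. Illustrative only; this is
# not CPF 2.0 syntax, and the dropout values are hypothetical.

def check_ldo_chain(v_supply, stages):
    """stages: list of (v_out, dropout) tuples in volts, in supply
    order. Returns (v_out, ok) for each stage."""
    v_in, report = v_supply, []
    for v_out, dropout in stages:
        ok = v_in >= v_out + dropout   # enough headroom to regulate?
        report.append((v_out, ok))
        v_in = v_out                   # this output feeds the next stage
    return report

chain = [(1.0, 0.15), (0.85, 0.10), (0.6, 0.10)]
print(check_ldo_chain(1.2, chain))  # all stages have headroom
print(check_ldo_chain(1.1, chain))  # first stage is starved of headroom
```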
The result is that power can be cut off using multiple levels of control rather than just one. That, in turn, can be balanced with voltage reduction where it makes sense. But making these tradeoffs isn’t so easy.
What’s needed
At least part of the problem lies in the tools needed to do this kind of work. There need to be significant advancements in the analysis capabilities at the very front end of the design process.
“To the extent that we can push voltage lower we will,” said Cary Chin, director of technical marketing for low-power solutions at Synopsys. “But the best way to deal with all of this is more statistical timing analysis versus our worst-case scenario planning. For voltage and margin you really need to look at statistical methods rather than the worst case. That can have a significant impact on efficiency, and a lot more has to be done there.”
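The intuition behind Chin's point is that worst-case planning assumes every stage on a path sits at its 3-sigma corner simultaneously, which becomes vanishingly unlikely as paths get longer; statistically, independent variations add in quadrature. A simplified Monte Carlo sketch with illustrative numbers (real SSTA tools are far more sophisticated):

```python
# Why statistical analysis recovers margin: the worst-case corner sums
# per-stage 3-sigma delays as if every stage missed simultaneously,
# while independent variations actually add in quadrature. The stage
# count and delay distribution below are illustrative.

import random
import statistics

STAGES, MEAN_PS, SIGMA_PS = 20, 100.0, 10.0

worst_case = STAGES * (MEAN_PS + 3 * SIGMA_PS)   # every stage at +3 sigma

random.seed(1)
samples = [sum(random.gauss(MEAN_PS, SIGMA_PS) for _ in range(STAGES))
           for _ in range(100_000)]
stat_3sigma = statistics.mean(samples) + 3 * statistics.stdev(samples)

print(f"worst-case corner:   {worst_case:.0f} ps")   # 2600 ps
print(f"statistical 3-sigma: {stat_3sigma:.0f} ps")  # ~2130 ps
```

On this 20-stage path, the statistical 3-sigma delay comes in nearly 20% below the worst-case corner, margin that would otherwise be thrown away.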
Another part of the problem is understanding use cases. Efficiency varies greatly depending on the application, the temperature and the length of the duty cycle.
“There are a large number of variables that can change the most efficient operating voltage,” said Mentor’s Pangrle. “The server folks are looking at cloud implementations, but even there the workload can have a big impact. And in 3D, with a memory stack, the bandwidth between the processor and the memory can chew up a lot of power.”
There is certainly a power benefit in stacked die because of reduced parasitics, particularly in the interconnect. But Pangrle noted that in some cases, running at a higher voltage may yield the same power savings as dropping the voltage and dealing with the resulting noise and IR drop. There's also a hidden cost in dropping the voltage too far: manufacturing yield.
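A toy energy-per-task model shows why lower is not always better. Dynamic energy falls with the square of the voltage, but a lower supply also lowers frequency, stretching runtime so that leakage integrates for longer; total energy per task ends up U-shaped in Vdd. Every constant below, including the crude linear frequency model, is an illustrative assumption:

```python
# Toy energy-per-task model behind "lower is not always better":
# dynamic energy falls as V^2, but a lower supply also lowers frequency,
# stretching runtime so leakage integrates for longer. Every constant
# here, including the crude linear frequency model, is illustrative.

def energy_per_task(vdd, cycles=1e9, c_eff=1e-9, i_leak=0.05,
                    k_freq=2e9, v_th=0.3):
    """Dynamic plus leakage energy (joules) for one fixed-size task."""
    freq = k_freq * (vdd - v_th)       # crude frequency-vs-voltage model
    runtime = cycles / freq
    e_dyn = c_eff * vdd ** 2 * cycles  # C * V^2 switched per cycle
    e_leak = i_leak * vdd * runtime    # leakage current * V * time
    return e_dyn + e_leak

for vdd in (0.35, 0.4, 0.5, 0.7, 1.0):
    print(f"Vdd={vdd:.2f}V: {energy_per_task(vdd) * 1000:.0f} mJ")
# Energy per task is U-shaped: in this model it rises again below ~0.4V.
```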
“Noise and the reliability of a computation factor into the yield of chips,” Pangrle noted. “If you decrease Vdd, there may be fewer chips that can run at that voltage level. If you have set 250mV as your threshold voltage, it will be leakier, so you will have to pay for additional performance.”
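The yield argument can be sketched the same way: if each die's minimum functional voltage (Vmin) varies with process, then setting a lower supply simply shrinks the fraction of dies that can run at it. The distribution parameters below are hypothetical:

```python
# Monte Carlo sketch of the yield argument: each die has a minimum
# functional voltage (Vmin) that varies with process, so a lower supply
# shrinks the fraction of dies that can run at it. The Vmin distribution
# below is hypothetical.

import random

random.seed(1)
vmins = [random.gauss(0.45, 0.03) for _ in range(100_000)]  # per-die Vmin

for vdd in (0.60, 0.50, 0.45, 0.40):
    yield_frac = sum(v <= vdd for v in vmins) / len(vmins)
    print(f"Vdd={vdd:.2f}V: yield {yield_frac:.1%}")
```

In this model, nearly every die works at 0.6 volts, about half work at 0.45 volts, and only a few percent work at 0.4 volts.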
Conclusions
Dropping the voltage will always be seen as the most direct cause-and-effect way to improve efficiency in devices, and there have been big gains already in dropping the supply voltages for many components. I/O and memory, which used to run in the 3 to 5 volt range, are now in the 1-plus volt range, and there is talk they soon will be in the sub-volt range. But the various components on an SoC or in a 3D stack will have their own limits, and they will require different voltages depending on such factors as data retention, performance and proximity effects.
EDA companies are well aware of these issues. They're also aware that tools will need to become much more sophisticated at analyzing the effects of dropping the voltage, determining what impact it will have on performance at a system level, and running rapid what-if scenarios. And they recognize that this will have to be done at a higher level of abstraction, or sequentially in discrete steps that are reflected in other steps, because the amount of data that must be analyzed is growing out of control.
There is no consensus, however, on how best to deal with this. Cadence, for example, is breaking the problem down into smaller chunks. “The problem size is double or triple,” said Cadence’s Wang. “You have to deal with it locally. It’s a divide and conquer solution.”
Apache, meanwhile, is looking at analyzing everything from chip to package to system. "You really need to do dynamic voltage analysis to get the actual picture," Sarkar said, noting that this may include more decoupling capacitance (decaps), more efficient decaps or the effects of the interposer.
So far there is no consensus other than the fact that all tools are welcome and the problem is monumental. But at least there is a recognition that something has to be done.