How efficient is the power delivery network of an SoC, and how much are designers overdesigning it to avoid a multitude of problems?
The consumption of power and dissipation of heat within large SoCs has received a lot of attention recently, but that is only part of the issue. Power also has to be reliably delivered onto and around the system. This is becoming increasingly difficult, and new nodes are adding to the list of challenges.
“If we were building chips where there was only a single Vdd and Vss then it is not that difficult to estimate the current you need to handle switching,” says Drew Wingard, chief technology officer at Sonics. “It is not that difficult to estimate, based on the characteristics of the package, the inductance characteristics of the power delivery network (PDN) through the package board hierarchy. So you can relatively easily calculate how much decoupling capacitance you need to cover that. That is all well understood.”
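The single-supply estimate Wingard describes can be sketched with two textbook first-order relations: the charge a decoupling capacitor must supply during a current step (C = dI * dt / dV) and the target impedance of the network (Z = Vdd * ripple / dI). The function names and all numbers below are illustrative assumptions, not values from the article or from any particular design flow.

```python
def required_decap(delta_i_amps, response_time_s, allowed_droop_v):
    """Capacitance needed to supply a current step within a droop budget.

    First-order estimate C = dI * dt / dV: the charge drawn before the
    package inductance lets the external supply respond, divided by the
    voltage droop the design can tolerate.
    """
    return delta_i_amps * response_time_s / allowed_droop_v


def target_impedance(vdd_v, ripple_fraction, max_transient_amps):
    """Impedance the PDN must stay below so a current step stays within ripple."""
    return vdd_v * ripple_fraction / max_transient_amps


# Hypothetical numbers, chosen only for illustration.
decap_f = required_decap(delta_i_amps=2.0, response_time_s=1e-9, allowed_droop_v=0.05)
z_target_ohm = target_impedance(vdd_v=1.0, ripple_fraction=0.05, max_transient_amps=2.0)
print(f"decap ~ {decap_f * 1e9:.1f} nF, target impedance ~ {z_target_ohm * 1e3:.1f} mOhm")
```

A real PDN has frequency-dependent impedance through the die, package and board, so this kind of calculation is only a starting point for sizing.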
But that is no longer possible. Leakage currents have been increasing with each new node, and that has to be controlled by powering down parts of the design that are not being used for any useful purpose. The addition of power switching makes life a lot more complex. “Now we have a PDN that varies over time,” adds Wingard. “When I power gate something, I take away the decoupling capacitance as well as the load. This adds an extra level of complexity and can take people by surprise. It needs a fair amount of extra analysis.”
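One way to see why a time-varying PDN matters is to treat the package inductance and the on-die capacitance as a simple LC tank. This is a minimal sketch under that assumption, with hypothetical component values: gating a domain removes its capacitance, which raises both the resonant frequency and the characteristic impedance seen by the domains that remain powered.

```python
import math


def resonance_hz(l_henry, c_farad):
    """Resonant frequency of a lumped LC tank: f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def characteristic_impedance_ohm(l_henry, c_farad):
    """Characteristic impedance of the same tank: Z0 = sqrt(L / C)."""
    return math.sqrt(l_henry / c_farad)


# Hypothetical values: 100 pH of package inductance, 100 nF of total
# on-die capacitance, of which 40 nF sits in a domain that can be gated.
L_PKG = 100e-12
C_ALL_ON = 100e-9
C_DOMAIN_GATED = 40e-9

c_remaining = C_ALL_ON - C_DOMAIN_GATED
f_all_on = resonance_hz(L_PKG, C_ALL_ON)
f_gated = resonance_hz(L_PKG, c_remaining)
z_all_on = characteristic_impedance_ohm(L_PKG, C_ALL_ON)
z_gated = characteristic_impedance_ohm(L_PKG, c_remaining)

print(f"resonance: {f_all_on / 1e6:.1f} MHz -> {f_gated / 1e6:.1f} MHz")
print(f"Z0: {z_all_on * 1e3:.1f} mOhm -> {z_gated * 1e3:.1f} mOhm")
```

The lumped model ignores damping resistance and board-level capacitance, so it only illustrates the direction of the shift, not its magnitude in a real design.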
This basically means that the PDN now has to be designed. “For older technology nodes, PDN design was more systematic using a margin based approach for power noise,” explains Arvind Shanmugvel, director of applications engineering for . “Designers would typically have a certain margin for on-chip noise, package noise and board noise. This margin based approach is quickly running out of steam for more advanced technology nodes, especially due to the reduced operating voltages and complex power-thermal interactions.”
So where does one start? “For the power grid there are two fundamental problems,” says Jerry Zhao, director of product marketing for power signoff at Cadence. “The first is the voltage drop, which can cause the chip to logically fail. The second problem, which is becoming more challenging, is the electromigration (EM) rule set. EM rules, when coupled with finFETs, are becoming more complex. And there are more rules that have to be qualified with the foundry.”
While it still may sound quite simple, Sudhakar Jilla, group director of marketing in the IC Implementation Division of Mentor Graphics, lists some of the complexities associated with this: “Higher current density, degraded EM limits, higher grid impedance, increased gate density, elevated thermal coupling and impact, (from planar to finFET) 3D narrow fin structure and lower thermal conductivity in substrate causing self-heat trap, plus tighter gate pitch causing nominal temperature to rise and having higher EM impact.”
In addition to these issues, which make analysis difficult, there are some fundamental design decisions associated with the PDN, as well. Wingard provides one example about how to handle inrush currents. “Bigger transistors with lower resistance would appear to be the better choice for the power switches. That would allow me to use more local decoupling capacitance. However, if the domain is switched off for a long time, when switched back on, I have to charge all of the internal parasitic nodes and the well. The amount of current required is highest at that first moment and can be so high that it is higher than what the circuit would encounter during normal activity. So the PDN has to be designed to deliver that much or the inductance in the leads may mean that I encounter voltage droop. While this is not a problem intrinsically, there may be other circuits trying to run at full frequency, and this means a loss of power supply integrity. That can cause paths to slow down and miss deadlines, and that could lead to incorrect functionality.”
As with many design issues, each new node adds to the list of challenges. With the larger geometries, power is distributed across the higher layers of metal, where sizes are larger. However, “the resistivity of the metal layers has increased due to the narrowing of wires,” says Shanmugvel. “This translates to larger IR drop across the board. In addition, the lifetime limits for the metals and vias are smaller, leading to tougher EM closure.”
Add to that the complexities of new devices. “FinFETs have problems with self-heating, and the heat spreads up through the device to the metal layers,” says Zhao. “The wires can also generate heat due to the currents passing through them. When temperature rises, your EM limitation decreases and leakage will increase.”
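The link between temperature and EM limits that Zhao describes is commonly modeled with Black's equation, MTTF = A * J^-n * exp(Ea / (k * T)). The sketch below uses that model, which the article itself does not name, to show how lifetime and allowable current density shrink when a wire runs hotter. The activation energy (0.9 eV) and current exponent (n = 2) are assumed values for illustration; foundry rule decks supply the real ones.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def lifetime_ratio(t_ref_c, t_hot_c, activation_energy_ev=0.9):
    """MTTF at t_hot divided by MTTF at t_ref, at equal current density.

    From Black's equation, the ratio is exp(Ea/k * (1/T_hot - 1/T_ref)).
    The default activation energy is an assumption, not a foundry value.
    """
    t_ref_k = t_ref_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_hot_k - 1.0 / t_ref_k)
    return math.exp(exponent)


def current_density_derating(t_ref_c, t_hot_c, activation_energy_ev=0.9, n=2.0):
    """Factor by which J must drop at t_hot to keep the same MTTF as at t_ref."""
    return lifetime_ratio(t_ref_c, t_hot_c, activation_energy_ev) ** (1.0 / n)


# Hypothetical case: self-heating raises a wire from 105 C to 125 C.
ratio = lifetime_ratio(105.0, 125.0)
derate = current_density_derating(105.0, 125.0)
print(f"lifetime ratio: {ratio:.2f}, current density derating: {derate:.2f}")
```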
Vias also need to be carefully controlled. “You still need to get the power down to the local fabric,” says Wingard. “This means you have to decide how many vias are needed between this layer and that layer so that I don’t end up stressing any one of the vias.”
“With 20nm and below, the cell rails are now routed on the double patterned (DP) and self-aligned double patterning (SADP) layers,” says Mentor’s Jilla. “They can have special routing rules and lower current carrying capabilities that can impact getting power to cells on those specific rows.”
Zhao agrees. “10nm will be the first generation where, even for the power, they care about the color pattern.”
Jilla believes that this spreading of the PDN across layers may have some secondary benefits. “You end up with a more distributed network. Rather than just large stripes on the top two layers, there are now smaller grids on all layers. This helps to alleviate the large currents through stacked vias from the upper stripes down to the cell rows and can result in a more even distribution of the power currents.”
Other impacts are implicit. “At 10nm and 7nm, operating voltages have reduced to 500mV range where there is a thin margin of error between functionality and failure,” adds Shanmugvel. “This translates to a fine line between over-designing and under-designing from the designer perspective.”
Focus on capacitance
Design is about achieving the desired goals at the lowest possible cost. “The PDN must provide a low impedance return path for signal currents and maintain proper functionality,” says Hem Hingarh, vice president of engineering for Synapse Design. The problem is then switching noise. “To reduce switching noise, decoupling caps are usually added with placement distributed at various locations on the die. So in design you have to select the number of de-caps, their value to keep good performance for the power network, and where they should be placed.”
But even that can be a tricky design decision. “Decoupling can provide local charge that can help, but it can also cause the chip to slow down at times,” adds Zhao. “Decoupling capacitance optimization and power ramp-up also factor in where you are going to place the power switches and how big the switches should be in order to sustain the inrush currents. Inrush current can be 5X larger than operational current.”
That could easily lead to overdesign. Wingard lays out a couple of strategies that can be used to control this. “You don’t want to hit the PDN with a large voltage difference and a low resistance at the same time. You can either turn it on in phases, so you break a domain into 10 pieces and at any time there is only 1/10 of the capacitance I am trying to bring up at the same time. Now the peak charge at that moment is smaller and then I turn on the next one and cascade them. That reduces the peak current. The second option is to use weighted transistor widths, so first you turn on some less beefy transistors that have a higher resistance. Now, instead of playing with capacitance I am playing with resistance. The peak current would be lower, and when it has gone a reasonable fraction of the way I may turn on a second set of power transistors that have a lower on resistance. The voltage difference is smaller.”
While this overcomes the stress placed upon the PDN, the implied cost is that it now takes longer to bring up the part of the design that is needed and so slows down the overall operation of the chip.
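Wingard's two strategies, and the wake-up cost that follows from them, can be sketched with a first-order RC model in which a discharged domain of capacitance C charges through power switches of on-resistance R. All names and values below are hypothetical, and the model assumes the phases are switched strictly one after another and that each settles before the next turns on.

```python
def peak_all_at_once(vdd_v, r_on_ohm):
    """Peak inrush when every switch closes onto a fully discharged domain."""
    return vdd_v / r_on_ohm


def peak_phased(vdd_v, r_on_ohm, n_phases):
    """Peak inrush when the domain is split into n equal pieces.

    Each piece owns 1/n of the switches, so its on-resistance is n times
    higher and its peak current is n times lower.
    """
    return vdd_v / (r_on_ohm * n_phases)


def peak_weighted(vdd_v, r_weak_ohm, r_strong_ohm, handoff_fraction):
    """Peak inrush when weak switches turn on first and strong ones later.

    The strong switches close once the domain has reached handoff_fraction
    of Vdd, so they only see the remaining voltage difference.
    """
    first_peak = vdd_v / r_weak_ohm
    r_both = r_weak_ohm * r_strong_ohm / (r_weak_ohm + r_strong_ohm)
    second_peak = vdd_v * (1.0 - handoff_fraction) / r_both
    return max(first_peak, second_peak)


def phased_wake_time_s(r_on_ohm, c_domain_f, n_phases, settle_time_constants=5.0):
    """Wake-up time for back-to-back phases.

    Each phase has time constant (n * R) * (C / n) = R * C, so n phases
    take roughly n times as long as a single-shot power-up.
    """
    return n_phases * settle_time_constants * r_on_ohm * c_domain_f


# Hypothetical domain: 0.8 V supply, 50 mOhm of total switch resistance, 20 nF.
VDD, R_ON, C_DOMAIN = 0.8, 0.05, 20e-9

single = peak_all_at_once(VDD, R_ON)
phased = peak_phased(VDD, R_ON, n_phases=10)
weighted = peak_weighted(VDD, r_weak_ohm=0.5, r_strong_ohm=0.05, handoff_fraction=0.9)
t_single = phased_wake_time_s(R_ON, C_DOMAIN, n_phases=1)
t_phased = phased_wake_time_s(R_ON, C_DOMAIN, n_phases=10)

print(f"peak current: single {single:.1f} A, phased {phased:.1f} A, weighted {weighted:.2f} A")
print(f"wake-up time: single {t_single * 1e9:.1f} ns, phased {t_phased * 1e9:.1f} ns")
```

In this model both strategies cut the peak current by roughly an order of magnitude, while the phased power-up takes about ten times longer, which is the tradeoff described above.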
Performance is only one cost. There are many other direct and indirect costs. “If a grid is not properly designed the die-size could quickly get out of hand, impacting both cost and time to market,” says Shanmugvel.
And it doesn’t stop there. “Dense PDN in the double patterning layers brings complicated placement and routing requirements which results in additional area and cost penalties,” points out Ming Ting, product marketing manager for the IC Implementation Division of Mentor. “Due to restricted IR/EM requirements, the PDN is usually evenly distributed across the whole chip and this can take 10% to 20% of valuable routing resources.”
Nobody can design an energy distribution system that is 100% efficient. The power grid into our homes is estimated to be less than 50% efficient. Power is wasted in generation, distribution and conversion, and a lot of energy is wasted in performing functions that serve no useful purpose.
Nobody has managed to calculate the total efficiency of an SoC. Some power will be lost in the interconnect and in the power switches, and not all switching activity is useful. This is in addition to leakage power that serves no useful purpose. The only person who would even hazard a guess at any part of this is Alin Florea, senior manager of package engineering at eSilicon. “We try to design the PDN to consume under 5% of the total power budget,” he says.
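To put a budget like Florea's 5% into numbers, one simple approach is to treat the PDN as a single lumped DC resistance and compare its I^2 * R loss with the power drawn from the supply. This is a rough sketch under that assumption. It ignores regulator losses, power switch losses and any frequency-dependent effects, and the values are hypothetical.

```python
def pdn_loss_fraction(supply_v, load_current_a, pdn_resistance_ohm):
    """Fraction of supplied power dissipated in the PDN.

    Loss is I^2 * R and supplied power is V * I, so the fraction is I * R / V.
    """
    return load_current_a * pdn_resistance_ohm / supply_v


def max_pdn_resistance_ohm(supply_v, load_current_a, loss_budget=0.05):
    """Largest lumped PDN resistance that keeps the loss within the budget."""
    return loss_budget * supply_v / load_current_a


# Hypothetical operating point: 0.8 V supply, 20 A load, 1.5 mOhm PDN.
fraction = pdn_loss_fraction(supply_v=0.8, load_current_a=20.0, pdn_resistance_ohm=1.5e-3)
r_max = max_pdn_resistance_ohm(supply_v=0.8, load_current_a=20.0)
print(f"PDN loss: {fraction * 100:.2f}% of supplied power, limit {r_max * 1e3:.2f} mOhm")
```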
As with many aspects of chip design, there are tradeoffs. “Accurate models and design tools help designers walk the fine line between costly over-design and unreliable under-design,” says Herb Reiter, president of eda2ASIC Consulting and the chair of the ESD Alliance’s System Scaling Committee. Unfortunately there are very few design tools to help with this problem, although the analysis tools are becoming a lot more sophisticated.
Part two of this article examines additional complications that face those thinking about 2.5D and 3D designs and serves up advice from the experts about how to handle some of the PDN tradeoffs.