The Interconnected Web Of Power

Why is power estimation so hard? Until it can be estimated accurately enough it is difficult to control, and that means continued over-design.


Tradeoffs between area and timing used to follow fairly simple rules. You could improve timing by adding area, and occasionally find an architectural solution that would decrease both at the same time. With physical synthesis the relationship became a little more complicated because an increase in area, say to make a driver larger or add another buffer, might upset the layout. That, in turn, could result in a non-intuitive worsening of performance, possibly somewhere else in the design. But these things were still manageable, and convergence has not proven to be too big a problem.

With power come complications. Activity draws power and the expenditure of energy creates heat, which affects timing. Attempts to reduce unnecessary power consumption add area and create complications such as surge currents that create noise, and can lead to voltage drop (IR drop), electromigration (EM) and a host of other less savory outcomes. If verification is described as an art, then this aspect of design is most certainly an art today as well. The first stage of turning it into a science comes with understanding, estimation and analysis, and we clearly have work to do there.

Norman Chang, vice president and senior product strategist at Ansys, starts by outlining some of the basic connections. He tells us that as temperature increases, three components will be affected. The first is power consumption. “Leakage is an exponential of temperature. As temperature increases, leakage increases, which in turn could increase temperature. And if you don’t have a sound thermal dissipation channel you may run into a thermal runaway issue.”
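
The feedback loop Chang describes can be sketched as a simple fixed-point iteration. This is a minimal illustration, not a model taken from any vendor tool: it assumes leakage doubles every `doubling_c` degrees and that junction temperature equals ambient plus a lumped thermal resistance `theta_ja` times total power. All names and numbers are hypothetical.

```python
def settle_temperature(dynamic_w, leak_ref_w, theta_ja, ambient_c=25.0,
                       ref_c=25.0, doubling_c=20.0, limit_c=150.0, max_iter=200):
    """Iterate the leakage/temperature loop until it settles or runs away.

    Assumptions (illustrative only):
      - leakage is leak_ref_w at ref_c and doubles every doubling_c degrees
      - junction temperature = ambient + theta_ja * (dynamic + leakage)
    Returns (temperature_c, converged).
    """
    temp = ambient_c
    for _ in range(max_iter):
        leak = leak_ref_w * 2.0 ** ((temp - ref_c) / doubling_c)
        new_temp = ambient_c + theta_ja * (dynamic_w + leak)
        if new_temp > limit_c:
            return new_temp, False   # heat cannot escape fast enough: runaway
        if abs(new_temp - temp) < 1e-3:
            return new_temp, True    # leakage and temperature have balanced
        temp = new_temp
    return temp, False


# Same chip, two cooling paths (thermal resistance in degrees C per watt).
good_cooling = settle_temperature(1.0, 0.1, theta_ja=10.0)
poor_cooling = settle_temperature(1.0, 0.1, theta_ja=60.0)
```

With the low thermal resistance the loop settles a few degrees above its first estimate. With the high one each pass adds more leakage than the package can remove, and the iteration crosses the temperature limit instead of converging.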

The second component is resistance, according to Chang. “This increases with temperature, and an increase in resistance slows down the signals. This can be very detrimental to DRAM. Consider a four-bank DRAM. If one bank has a thermal hotspot 15 to 20 degrees higher than the other banks, it will cause an unbalanced refresh rate among the 4 banks and this will eventually cause the DRAM to fail. This can be in a DRAM or a 3D IC stack.”

The third component is the electromigration limit. “When temperature increases, the EM limit can decrease significantly. You may have a thermal gradient that is 10 or 15 degrees for bipolar and for Gallium Nitride you may see even larger gradients up to 40 degrees. The problem is that the EM limit will be different for hot spots around the chip,” he says.
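
A common first-order model for electromigration lifetime is Black's equation, MTTF = A * J^(-n) * exp(Ea / (k * T)). Holding the lifetime constant gives the factor by which the allowed current density J changes between two temperatures. The sketch below assumes that model; the activation energy and current exponent are illustrative defaults, not values from Chang or from any foundry.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def em_limit_scale(t_ref_c, t_hot_c, ea_ev=0.9, n=2.0):
    """Ratio of allowed current density at t_hot_c versus t_ref_c.

    Derived from Black's equation at constant lifetime:
        J_hot / J_ref = exp(Ea / (n * k) * (1/T_hot - 1/T_ref))
    ea_ev and n are assumed, illustrative parameters.
    """
    t_ref_k = t_ref_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = ea_ev / (n * BOLTZMANN_EV_PER_K) * (1.0 / t_hot_k - 1.0 / t_ref_k)
    return math.exp(exponent)


# A hotspot 15 degrees above a 105 C reference region.
hotspot_scale = em_limit_scale(105.0, 120.0)
```

Under these assumed parameters the hotspot's allowed current density falls well below the reference value, which is why a single chip-wide EM limit ends up either pessimistic for cool regions or optimistic for hot ones.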

When we add in power mitigation techniques, the problems multiply. “The presence of multiple power domains introduces additional challenges like ensuring the presence of ESD protection circuitry between all possible domain pairs,” points out Nayan Chandak, senior area technical manager at Ansys. “Similarly, the introduction of power gating greatly reduces the off-state leakage, but introduces additional risk of high rush current and noise coupling during wake-up operation.” Chandak adds a third example. “The move to finFETs brings lower leakage and faster performance, but also lower noise margins (higher switching current density coupled with lower operating voltages), and hence serious power integrity and reliability (EM/ESD) challenges.”

Lest we think that all of the problems are happening down at the micro level, there are other issues at the macro level. “Power is multi-faceted and the scope you have to look at is the package,” says Steve Carlson, vice president of marketing for low power and mixed-signal solutions at Cadence. “It may be meaningless not to have some model of the board, as well. There used to be a handful of standard package types that were very well characterized, but today people are using custom approaches to form factors and there are not a lot of standards in place. You need cross-fabric analysis engineering that can delve into and understand the IR drops inside the chip as well as looking at noise and integrity issues.”

Today, almost all power design is performed at the RT level. “Early power numbers from RTL can be used for early power grid integrity checking and to enable an early chip/power model that can be used to analyze the package,” says Preeti Gupta, director for RTL product management at Ansys. “Assume you have a large block with clock gating. When the block is activated there will be a large surge in current as all of the clocks start changing. This can cause a large transient current which, when coupled to the package inductance, can lead to a voltage drop which in turn can lead to a timing failure.”
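
The mechanism Gupta describes follows the inductor relation V = L * di/dt. A minimal sketch, assuming the package can be treated as a single lumped inductance (a strong simplification) and using hypothetical numbers:

```python
def inductive_droop_v(package_l_h, delta_i_a, ramp_s):
    """Voltage droop across a lumped package inductance: V = L * di/dt.

    package_l_h : effective package inductance in henries (assumed lumped)
    delta_i_a   : current step in amperes when the block wakes up
    ramp_s      : time over which the current ramps, in seconds
    """
    if ramp_s <= 0:
        raise ValueError("ramp_s must be positive")
    return package_l_h * delta_i_a / ramp_s


# Hypothetical block: 2 A step through 50 pH of package inductance.
abrupt = inductive_droop_v(50e-12, 2.0, 1e-9)      # clocks all start within 1 ns
staggered = inductive_droop_v(50e-12, 2.0, 10e-9)  # enable spread over 10 ns
```

Spreading the same current step over a ramp ten times longer cuts the droop by the same factor, which is the reasoning behind staggering clock enables on large blocks.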

Estimation accuracy
If we understand the physics of what is going on, it would seem as though we should be able to estimate power, thermal and their knock-on impacts. But we want the answers fast and that creates a tradeoff. “Designers at the RT level want estimates that are within 20% of final silicon,” says Chang. “Some companies claim that their tools are more accurate than this, but that is a wish. For the general case, they cannot be this accurate. Usually, 15% is already a very good result.”

Carlson agrees: “We would like to be able to get power estimates that are within 15% of silicon. This seems to be what will make people happy. If we can magically do that, we will be okay.” Carlson points to the reason why accuracy is so difficult. “Accuracy is dependent on switching stimulus, so you may hit the number for one use-case but miss wildly on something else. Consider that you have a heterogeneous multi-core design with cache hierarchy and a bunch of DDR planes connected to high-speed memory plus some hardware accelerators for video and audio. Now factor in what is allowed to be turned on at the same time and which process corners I am considering. I can’t just use worst-case numbers because that will price me out of the market. It is complicated.”

Chang adds another reason why it gets complicated. “Thermal has a much larger time constant compared to electrical effects. Thermal changes in the millisecond range. The average peak power is computed over 1ms of activity.”
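
Because thermal response is slow, what matters for heating is power averaged over a window rather than any single-cycle spike. A minimal sketch, assuming a per-cycle power trace as input (a hypothetical format) and a window expressed in samples:

```python
from itertools import accumulate


def peak_windowed_average(power_w, window):
    """Return (peak_average_w, start_index) over all windows of `window` samples.

    power_w is an assumed per-cycle power trace in watts.
    """
    if window <= 0 or window > len(power_w):
        raise ValueError("window must be between 1 and len(power_w)")
    # Prefix sums make each window average an O(1) subtraction.
    prefix = [0.0] + list(accumulate(power_w))
    best_avg, best_start = float("-inf"), 0
    for start in range(len(power_w) - window + 1):
        avg = (prefix[start + window] - prefix[start]) / window
        if avg > best_avg:
            best_avg, best_start = avg, start
    return best_avg, best_start


# Toy trace; a real one would hold millions of cycles.
trace = [1, 1, 5, 5, 1, 1]
peak, start = peak_windowed_average(trace, window=2)
```

For scale, at an assumed 1GHz clock a 1ms window spans 1,000,000 cycles, so the trace feeding a calculation like this has to be long.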

There are several ways in which estimation can be brought under control. “By running analysis over millions of cycles of simulation data, we can zero in on the important areas and then pass that to transient tools for a deeper analysis,” points out Gupta. “That way we can find the important cycles that the designers need to care about. Transient analysis can only work for microseconds, so these need to be applied only to the interesting periods of time.”
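
One way to picture the filtering step is to rank cycles by how sharply power rises, then hand only the worst offenders to the slower transient tool. The sketch below is an illustration of that idea, not a description of how any commercial tool selects cycles; the per-cycle power trace is an assumed input.

```python
import heapq


def top_power_steps(power_w, k=3):
    """Return the k cycle indices with the largest cycle-to-cycle power increase.

    A large rise is used here as a crude proxy for high di/dt; ties are
    broken in favor of the later cycle.
    """
    steps = [(power_w[i] - power_w[i - 1], i) for i in range(1, len(power_w))]
    return [index for _, index in heapq.nlargest(k, steps)]


# Toy trace: the sharpest rises are at cycles 5 and 2.
candidates = top_power_steps([1, 1, 4, 4, 2, 9, 9], k=2)
```

Each returned index marks the center of a short window worth simulating in detail, which keeps the transient analysis within the microseconds it can handle.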

Another way is to look at the problem differently. “There are two kinds of accuracy,” says Anand Iyer, director of product marketing at Calypto. “Absolute accuracy and relative accuracy. Absolute accuracy is when we say that something will consume 1W and when it actually runs in silicon it consumes 1W. Relative accuracy is when you say there is an opportunity to save 100mW, and after we go through the whole development process we may or may not meet that absolute number. It needs to be realistic. Many of the tools work okay for the older nodes, but once we got to 28nm and smaller there were additional factors that affected power, and the accuracy is beginning to suffer. To get back to the desired accuracy, they need a lot of tuning.”

Raising the power abstraction
It has been said that while power can be saved at the RT level and below, 80% of the power budget is affected by high-level decisions. Choosing the right architecture allows activity, power and heat to be optimized from the start. “The enablement of low-power or energy-aware design needs to be able to abstract the critical power characteristics of a piece of IP and make those available in the virtual prototyping world,” says Alan Gibbons, power architect within Synopsys. “Once they are available we can use it to make intelligent architectural decisions in both hardware and software about energy efficiency. What is missing today is an interoperable way to do this, and that is what we are starting to see in the standards organizations (IEEE 1801, IEEE P2415, IEEE P2416). These will standardize the abstraction of power characteristics for a piece of IP and how we use them.”

The use of virtual prototypes is also seen as an important step in getting the necessary use-cases defined. “Architects are shifting to use more dynamic simulation, so the virtual prototype can realistically reproduce the application workload and show the dynamic effects on different parts of the system in terms of performance and power,” Gibbons says. “This enables them to see where they are making mistakes in the architecture. That could result in performance issues that could lead to under-design, or power issues that could lead to overdesign. It is exciting for them to be able to see these together.”

While many of the solutions may still be nascent, the impact is being felt today. “We believe that the thermal envelope has become the limiting factor in performance across a number of products,” says Alan Aronoff, director of strategic business development at Imagination Technologies. “Power consumption is also critical in cost-sensitive devices where elimination of heat will result in increased end product costs.”

Without tools, overdesign is the usual outcome. “Looking at these challenges in an isolated manner is proving to be increasingly inadequate,” says Chandak. “In most IC design houses, methodology experts come up with a global pessimistic power grid spec to meet worst-case power and EM/IR requirements. These are replicated by chip designers in their physical design flow. However, this overdesigned grid is leading to serious routing congestion and timing closure challenges for finFET based designs.”

How is the gap going to be closed? As an example, Chandak calls for a new holistic solution to create an optimized, design-dependent power delivery network (PDN). “The PDN should be robust enough to handle high power/performance areas and relaxed in other areas to meet the overall power and routing targets. The PDN should be optimized not only for EM/IR and routing congestion, but also for all related design aspects such as power-gate planning, ESD planning, thermal integrity, and chip-package interaction. Also, PDN creation should be tightly coupled with sign-off analysis to ensure optimum results and productivity. It should have strong prototyping or implementation capabilities, as well as very good modeling techniques to mimic silicon behavior early in the design stage, such as early dynamic drop and early rush-current.”

Many such solutions are needed for all aspects of power-aware design, but without consistent and accurate enough estimates, everything is guesswork. “Early estimation is important for schedule and cost reasons,” says Gupta. “If you can save a metal layer, this can have a significant impact on profit margins.”
