Rethinking Power

The growing emphasis on battery life and energy costs is prompting tools vendors to rethink what they offer, how it gets used, and by whom.

Power typically has been the last factor to be considered in the PPA equation, and it usually was somebody else’s problem. Increasingly it’s everyone’s problem, and EDA companies are beginning to look at power differently than in the past.

While the driving forces vary by market and by process node, the need to save energy at every node and in almost all designs is pervasive. In the server market this is simple economics—many cores do not have to be on all the time, and powering and cooling them is expensive. In the mobile market, leakage was a problem up until 28nm, and at 14/16nm the increase in dynamic power density is a problem that will get worse at each new node. But even at established nodes—40nm to 90nm and higher—the Internet of Things and the need for more time between battery charges, or lower energy usage for always-on devices, has pushed power issues front and center for many chipmakers and systems houses.

That leaves a number of big challenges for EDA vendors:

• Integrating incompatible data from multiple sources, as has been done in other parts of the flow, such as physical design through verification.
• Making the tools simple enough to use so that companies don’t require a team of power experts.
• Raising the level of training on power so that engineers are capable of using the tools and aware of power-related issues.
• Creating tools that can drill down to the RTL level, then jump to a higher level of abstraction to be able to make tradeoffs to optimize for power.

Until the IoT moved beyond the PowerPoint stage, and until finFET-based designs began reaching tapeout, there were fewer compelling reasons for EDA vendors to create new tools or to link those capabilities with other tools in a flow. For the most part, power was the part of the PPA equation that was most likely to miss a deadline in the first version of a chip, with improvements to be made at a later date either with a software patch or a future hardware release. But the urgency has increased for staying within a tight, often shrinking power budget, and so has the attention being paid to power exploration, modeling, debug and verification.

“There is no such thing as a push-button solution for power exploration,” said Guillaume Boillet, technical marketing manager at Atrenta. “The way to address this is to integrate information from timing and think about how you are going to build a bridge between that information and power. The vast majority of people care about power, but what they’re doing now is power profiling, not power optimization. They determine how good a job they’re doing by clock gating, but there are things that can be done to make clock gating more efficient.”
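The gap between profiling and optimization that Boillet describes comes down to the standard CMOS switching-power model, P_dyn = α·C·V²·f: clock gating works by cutting the effective activity factor α. The sketch below is purely illustrative—the capacitance, voltage, frequency, and duty-cycle figures are hypothetical, not drawn from any real design.

```python
# Illustrative sketch of why clock-gating efficiency matters, using the
# standard CMOS dynamic power model P_dyn = alpha * C * V^2 * f.
# All numbers below are hypothetical placeholders.

def dynamic_power(alpha, c_farads, v_volts, f_hz):
    """Switching power of a net: activity factor * capacitance * V^2 * f."""
    return alpha * c_farads * v_volts**2 * f_hz

# A register bank doing useful work only 20% of the time, but clocked always.
ungated = dynamic_power(alpha=0.5, c_farads=2e-12, v_volts=0.8, f_hz=1e9)

# With clock gating, the flops toggle only during the 20% active window,
# so the effective activity factor drops by 5x.
gated = dynamic_power(alpha=0.5 * 0.2, c_farads=2e-12, v_volts=0.8, f_hz=1e9)

savings = 1 - gated / ungated
print(f"ungated: {ungated*1e3:.3f} mW, gated: {gated*1e3:.3f} mW, saved: {savings:.0%}")
```

Profiling tells a team the ungated number; optimization is finding the idle windows that let gating shrink α in the first place.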

Power exploration at this point is largely iterative and very time consuming, particularly for a complex SoC. Just meeting timing in designs is a feat, given the amount of dark silicon and potential electrical and physical conflicts that can arise during the design process. Throw in analog circuitry and multiple power domains and power very quickly can slip to the bottom of the must-fix-immediately stack.

From power expert to system expert
This is made far worse by the fact that many engineers don’t work directly with power. Making every engineer a power expert is unrealistic, but there needs to be at least a base level of understanding about power at all levels of the design process. Mobile chip and device makers have become quite adept at this, but the knowledge needs to be passed on. Currently, the expertise resides with small teams inside the very largest chipmakers. It needs to be understood much more broadly, and applied much earlier in the design process.

“The complexity of the systems we are designing today demands an earlier look at power,” said Alan Gibbons, a power architect at Synopsys. “Expecting to wait and solve system power issues during physical implementation of a very high performance complex SoC with dozens of IP blocks supporting a significant layered software component is simply not tenable. In many cases the challenge is a result of power and performance duality. Power (or energy) has become the dominant design constraint and so we now strive for the best entitled performance we can get. In other words, how do we deliver the best performance for our platform within the power and energy budgets that are available?”

Tools certainly can help in that regard, but only to a point. As history in low-power design has shown, just because something is capable of saving power doesn’t mean it will be used — and that applies both to capabilities that are built into chips and into tools.

“There is one case where a company was developing a chip and they were trying to figure out what was causing noise,” said Vic Kulkarni, senior vice president and general manager of the RTL power business unit at Ansys. “They traced the problem to the floating point arithmetic, so they changed it to fixed point to reduce noise. When the front end was connected to the back end they found the noise decreased by 20dB and there were no more spikes. So there is real value to having people around with this kind of knowledge.”

But there also is value in being able to solve less-complex problems without hiring power specialists, and that’s where automation has always played a major role. In formal verification, for example, after years of trying to get verification engineers to learn how to write assertions, big EDA companies changed direction and automated much of that process.

“Verification vectors are for verification, not for power,” said Kulkarni. “And OS boot up, if it’s not properly designed, may be inefficient. What needs to happen now is that we need to connect the dots into the back-end tools for thermal effects to create a real system-level solution. If you look at the flow at some big companies, they’re already using RTL power models and extracting that at the system level to run in the Docea Power framework. For substrate noise, you can extract that without bothering the designers.”

FinFET power issues
To be clear, power concerns above 28nm and below 28nm are very different. With finFETs (16/14nm, using a 20nm back-end-of-line process), the big concern is dynamic power density. At 28nm and above, the big concern is leakage. And with 28nm FD-SOI technology, the focus is on body biasing and improving energy efficiency, because the leakage has been controlled.

These are different views into the same problem, though, which is how to deal with power more effectively. At the most advanced nodes, techniques such as dynamic voltage and frequency scaling and multiple power domains, along with effects such as electromigration and RC delay, are well understood by the companies competing there. Less well understood is how to optimize the tools and methodologies.

“One big change is that in the past you would break things down to the block level to analyze it and test it, and then you would stitch it back together,” said Rahul Deokar, product management director for digital and signoff at Cadence. “The problem is that at 16/14nm you have 100 million to 200 million instances, and when you stitch together 100 to 150 blocks into an SoC, area, performance and power are all impacted. So now you need to do it in 5 million to 10 million instance blocks rather than 1 million-instance blocks.”

That also plays well for subsystems, which can be up to 25 million instances. Deokar noted that other advanced techniques are being employed at the most advanced nodes, as well, including clock and power meshes. But what’s key is that all of these techniques are being employed more frequently because design teams need to find ways to squeeze more power out of every design, no matter where they are on the process node curve.

“At older nodes, designs used to be guard-banded,” said Arvind Narayanan, group marketing director for the IC Implementation Division at Mentor Graphics. “Libraries got guard-banded to reach signoff. But that’s not sufficient as you move to smaller nodes. Guard-banding no longer works. On top of that, you have to look at the impact of thermal on IR drop. That’s beginning to come into play, too.”

Planar concerns
Above 28nm, tools have existed for some time to handle basic power problems. Their use has been spotty, however, because companies haven’t focused as much on power, and their inclusion in flows, let alone within the engineering curriculum of universities, has not been a high priority. That is changing, in some markets faster than others, but progress is reported across a swath of vertical markets, and schools have begun building power into engineering curricula as one more thing future engineers need to learn.

Now the challenge is to change the mindset of engineers who have ignored power for so long. “Power methodology has taken a backseat to die closure in many designs,” said Sudhakar Jilla, group director of marketing for place and route at Mentor. “First you make sure the timing works. Then you worry about power.”

He said mobile and wireless chip companies have been focusing on power all along, but the growing importance of the IoT has now made the concern much more widespread. “In addition to tools and power experts, there’s a need for methodology changes. Training on EDA tools needs to increase and these tools need to go mainstream.”

In some cases, tools already being used for other purposes can be utilized for power. Emulation can be used for hardware and software verification, but it also can be used for power analysis and power verification, for example. The same is true of high-level synthesis.

“You can run an RTL power analysis, take those power analysis numbers and use them to try out different versions with different constraints,” said Mark Milligan, vice president of marketing at Calypto. “So your first pass is for RTL power accuracy. You want to make that as accurate as possible, which is plus or minus 15%. Then, once you pick which direction you’re going, you need even higher levels of accuracy. But it allows you to do trial placement and routing and make adjustments to RTL based upon power. Power analysis is just starting to happen for many of these companies.”

Having a higher level of abstraction is critical because the amount of data that needs to be digested is enormous.

“We are doing this work at higher levels of abstraction, and this is necessary in order to be able to deal with the complexity and to provide the simulation speed we require,” said Synopsys’ Gibbons. “It is important to note that the power problems we are trying to solve during system-level design are not the same power problems we attempt to solve in the latter stages of physical implementation. We want to use power data to make the big decisions early in the design flow — choice of heterogeneous processing architecture and configurations, which functions exist in hardware versus software, memory size and topology, interconnect configurations, system power management strategies, algorithms and policies. Typically, these decisions don’t need absolute accuracy of power consumption data and decent estimates will be more than adequate. Accuracy of power state residency is just as important as power consumption. In order to gain sufficient information of the energy behavior of our systems, we need visibility of both power consumption per power state and the power state residency. For a given scenario, understanding how long a block stays in a certain power state is important. Only by simulating the platform under representative workloads can we gain this understanding.”
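Gibbons’ point about power-state residency reduces to simple arithmetic: energy over a scenario is the sum, across power states, of the power in each state times the time spent there. The minimal sketch below assumes two hypothetical blocks with made-up state powers and residencies; it is not based on any real simulation data.

```python
# Hedged sketch: energy from power-state residency.
# Energy = sum over states of (power in that state * time resident in it).
# The blocks, states, and numbers below are hypothetical.

def energy_uj(residency):
    """residency: list of (power_mW, time_ms) pairs -> energy in microjoules."""
    return sum(p_mw * t_ms for p_mw, t_ms in residency)  # mW * ms = uJ

# Block A: lower active power, but poor power management (rarely sleeps).
block_a = [(120.0, 40.0), (15.0, 60.0)]   # (P_mW, t_ms): active, idle
# Block B: higher active power, but mostly in retention for this workload.
block_b = [(150.0, 25.0), (2.0, 75.0)]    # active, retention

print(f"A: {energy_uj(block_a):.0f} uJ, B: {energy_uj(block_b):.0f} uJ")
```

Note that the block with the higher active power comes out ahead on energy here—which is exactly why residency under representative workloads matters as much as the per-state power numbers themselves.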

More changes ahead
What is unknown at this point is what happens as more engineers begin tackling the power issue. For example, what happens when architecture becomes the next choice for saving power rather than optimizing existing approaches? A whole portion of the industry is preparing for the possibility of stacked die—both 2.5D and 3D IC—which can have significant impacts on the amount of energy required to drive signals because of both shorter distances and lower resistance and capacitance.

“Looking at HBM, one of the things about that standard is that you’re now going through a silicon interposer in the more traditional configurations so your memory channel now is much shorter and you’re not driving a long signal across a PCB or into a DIMM socket,” said Frank Ferro, senior director of product marketing at Rambus. “You’re driving a very short channel through a silicon interposer. That gives the opportunity to make that interface more power efficient on a per bit basis. While the overall power might be higher because of the number of bits, the power per bit should be a bit more efficient.”
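Ferro’s distinction between per-bit efficiency and total interface power can be sketched with back-of-the-envelope numbers: total power is energy-per-bit times aggregate bit rate. The energy-per-bit, bus-width, and data-rate figures below are rough illustrative guesses, not vendor or JEDEC specifications.

```python
# Rough sketch of the power-per-bit tradeoff described above: a short
# interposer channel lowers energy per bit, but a much wider HBM bus can
# still raise total interface power. All figures are illustrative guesses.

def interface_power_w(energy_pj_per_bit, bits_wide, gbps_per_pin):
    """Total interface power = energy/bit * aggregate bit rate."""
    return energy_pj_per_bit * 1e-12 * bits_wide * gbps_per_pin * 1e9

# DDR over a PCB/DIMM channel: higher energy per bit, narrow interface.
ddr = interface_power_w(energy_pj_per_bit=15.0, bits_wide=64, gbps_per_pin=3.2)
# HBM through a silicon interposer: lower energy per bit, 1024-bit bus.
hbm = interface_power_w(energy_pj_per_bit=4.0, bits_wide=1024, gbps_per_pin=2.0)

print(f"DDR: {ddr:.2f} W total, HBM: {hbm:.2f} W total")
```

Under these assumed numbers, HBM burns more total watts while spending far fewer picojoules per bit moved, which is the "overall power might be higher ... power per bit should be more efficient" tradeoff in the quote.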

That’s just one of the options under consideration, too. “We are looking forward to what we can do for DDR5 and even some of the HBM interfaces, where the goal is to start to bring the power down through things like better signaling technology for DDR5, and to start going to lower-swing signaling to save power on the interface,” Ferro said, noting that could include moving some of the control back on the PHY, so that the memory can be more of a slave.

More innovation will certainly come out of this. Engineers like to solve problems, and the more they focus on power the more innovative approaches will result. EDA tools are one facet of this, and they will both drive and react to changes as they reach critical mass.