Second of two parts: Where and why designs go awry and how to avoid some increasingly common problems.
Reducing power has emerged as the most pressing issue in the history of technology. On one hand, it’s the biggest opportunity the electronics industry has ever seen. On the other, the abuse of cheap power has been linked to global warming, human catastrophe, and geopolitical strife. Semiconductors increasingly find themselves at the center of all of this, and making chips more energy-efficient has become a top priority everywhere.
From the coolest consumer electronics to the most mundane sensors, efficiency is now a priority. The focus on energy efficiency and power, the rate at which that energy is used, has become a top consideration in all aspects of chip development, regardless of process node, transistor shape, how many IP blocks are being integrated or how many functions are on a chip. In some markets, such as wearables or mobile electronics, battery life is the primary defining factor.
“The main issue in wearables is power consumption,” said Eran Briman, vice president of marketing at CEVA. “Power consumption is a very important constraint. The size of the battery is small, so there is a weight limitation, but the device is always on, listening or watching. And you have to make all of this cost-sensitive.”
At the other end of the spectrum are devices with a plug, such as a dishwasher, where energy efficiency is listed on tags that indicate average annual operating costs rather than how long a battery will last or how many miles a vehicle can run between charges. But even there energy efficiency is becoming more important.
“Some industries are more demanding about power,” said Lawrence Loh, vice president of engineering at Jasper Design Automation. “That can affect which IP to choose, how the system architecture works and the software, too. But there are some areas that have a lower bar, such as appliances, where you’d like them to be low power but it’s not that strict. In all of these devices you need to verify everything works, but for power that’s way too late to do something about it.”
Where designs go wrong
What’s less obvious is that the window for actually affecting power is moving earlier in the design cycle. Just as with cars and mileage, IP power consumption may vary by user. Blocks don’t always behave as expected from an overall power budget standpoint, particularly when power domains are turned on and off irregularly. With the Internet of Things, for example, some devices are always on, while others may be turned on only for brief periods—use models that are well outside of typical SoC designs.
“You need very clear partitioning because the IP needs to meet the power quota,” said Loh. “System people have to decide how many power domains there are and the spec for the software, and then the software people need to do the right thing to utilize the hardware. With careful planning and communication, you hope it all works together. But the reality is that the system and hardware engineers hardly talk to the software engineers, and the software guys probably underutilize the hardware. And even if you do utilize all the cores in a processor, you probably are going to utilize them in unexpected ways.”
He’s not alone in seeing this effect. Jon McDonald, technical marketing engineer for the design and creation business at Mentor Graphics, said it’s relatively straightforward to define what needs to be done with IP using a high-level analysis. The devil is in the details—and in the case of complex SoCs, there are an awful lot of details.
“IP can be used in many ways, and even if you design something within a power budget many embedded systems will not work correctly if everything is on at the same time,” said McDonald. “You can model all the use cases and determine how much of the system will likely be on at the same time or at all times, but you also need to make sure you set the appropriate controls for what is really critical to deliver the power system.”
The same is true for software, he said. The interaction between the software and the system can quickly spiral out of control because there are so many possible permutations.
“With power optimization, you literally can have hundreds or thousands of power states,” he noted. “The challenge is understanding which power state something should be in, so what you do and what you need to do become much more complex. The software has to be a lot smarter about how to utilize these various power states.”
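The tradeoff McDonald describes, deciding which of many power states software should put a block into, can be sketched as a simple selection policy. The states, power figures, wake latencies, and the rule of thumb that an idle period should be at least twice the wake latency to be worth the transition are all illustrative assumptions, not taken from any real design:

```python
# Hypothetical sketch of software-managed power-state selection.
# States, power numbers, and wake latencies are illustrative only.

POWER_STATES = [
    # (name, power_mW, wake_latency_us), ordered shallow-to-deep
    ("active",      150.0,     0),
    ("clock_gated",  40.0,    10),
    ("retention",     5.0,   200),
    ("off",           0.5,  5000),
]

def pick_state(idle_time_us: float, latency_budget_us: float) -> str:
    """Choose the deepest state whose wake latency fits the latency
    budget and whose entry cost is amortized by the expected idle time."""
    best = "active"
    for name, power_mw, wake_us in POWER_STATES:
        # Only enter a state if we can wake in time and will stay idle
        # long enough to amortize the transition (a common rule of thumb:
        # idle time at least twice the wake latency).
        if wake_us <= latency_budget_us and idle_time_us >= 2 * wake_us:
            best = name  # list is ordered shallow-to-deep, so keep deepest
    return best

print(pick_state(100_000, 10_000))  # long idle, loose budget -> "off"
print(pick_state(500, 50))          # tight budget -> "clock_gated"
```

Real power managers weigh transition energy and history as well, but even this sketch shows why the state count explodes: every block multiplies the combinations the software must reason about.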
Building the right methodology
Unexpected use models are only part of the problem. In devices with a limited power budget, planning needs to start at the pre-architectural level. It has to be verified at multiple stages during the design flow and even post-silicon. And it has to be signed off during multiple stages throughout the process.
“Power has to be handled top-down for planning and bottom-up for validation,” said Bernard Murphy, chief technology officer at Atrenta. “Planning must be done on a virtual model or by extrapolating from earlier revs of the design. There are no power format standards or reliable modeling methods here, so this is largely a paper/spreadsheet exercise — mapping anticipated use cases to estimated power in each phase of a use case. What is important to note is that there is little connection (today) to RTL or gate-level power models. The most accurate methods work on incremental refinements of a design where you can extrapolate from silicon measurements for a previous rev, adjusting for process, frequency and voltage changes.”
Applying this methodology depends heavily on a very stable architecture. It doesn’t work well for new or rapidly changing architectures, where judgment and characterization experiments are necessary for new IP to cover the holes.
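The top-down planning Murphy describes, mapping anticipated use cases to estimated power in each phase, amounts to a duration-weighted average, which is why it can live in a spreadsheet. A minimal sketch, with made-up phase durations and power estimates:

```python
# Spreadsheet-style power planning sketch. Phase names, durations,
# and per-phase power estimates are illustrative, not from any design.

def average_power_mw(phases):
    """phases: list of (duration_s, power_mW) tuples for one use case.
    Returns the duration-weighted average power in mW."""
    total_time_s = sum(d for d, _ in phases)
    energy_mj = sum(d * p for d, p in phases)  # mW * s = mJ
    return energy_mj / total_time_s

# Hypothetical 10-second "voice call" use case:
voice_call = [
    (0.5, 300.0),   # radio + DSP fully active
    (1.5,  80.0),   # listening, partial clock gating
    (8.0,   3.0),   # deep sleep between activity bursts
]
print(round(average_power_mw(voice_call), 1))  # -> 29.4
```

The value of the exercise is in the phase breakdown itself: it exposes which phase dominates the energy budget and therefore where optimization effort should go.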
“Bottom-up validation has two components — Is the power intent I defined (in UPF) consistent with the RTL/gate-level/post implementation gate-level, and does estimated power modeled at this level correspond to my earlier planning experiments,” said Murphy. “The first problem is reasonably handled today. It is possible to check power intent consistency with the design from RTL down to the pg-netlist level, making this a true signoff step. Checking estimated power is more challenging, simply because you need to combine detailed-level power modeling accuracy (RTL/gate-level) with system-level activity. One first-order check is to compare IP-level power estimates with the values used in use-case modeling. These should at least be similar, though this is hardly a signoff check. A somewhat more accurate approach is to run at least some software on an emulator-based model of the design and use the trace dump to estimate power. There are methods available today to support this approach, but trace dumping all changes makes emulation very slow, limiting the amount of software you can power-characterize, again making this hardly a signoff check.”
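The emulation-based check Murphy mentions reduces to activity-based estimation: dynamic power scales with switched capacitance, supply voltage squared, and toggle rate recovered from the trace. A minimal sketch, with an assumed trace format (toggle counts per net over a time window) and illustrative capacitance values:

```python
# Activity-based dynamic power estimate from an emulation trace dump.
# The trace format, capacitances, and supply voltage are assumptions
# for illustration; real flows read these from Liberty/parasitic data.

def dynamic_power_mw(toggle_counts, cap_pf, vdd=0.9, window_s=1e-3):
    """toggle_counts: {net: toggles observed in the trace window}
    cap_pf: {net: switched capacitance in pF}
    Each toggle (one transition) dissipates roughly 0.5 * C * V^2."""
    p_w = 0.0
    for net, toggles in toggle_counts.items():
        c_farads = cap_pf.get(net, 0.0) * 1e-12
        toggle_rate = toggles / window_s      # transitions per second
        p_w += 0.5 * c_farads * vdd**2 * toggle_rate
    return p_w * 1e3  # watts -> milliwatts

# One net, 100 pF total switched cap, a million toggles in 1 ms:
print(dynamic_power_mw({"bus_a": 1_000_000}, {"bus_a": 100.0}))
```

The cost Murphy flags is visible here: the estimate needs per-net toggle counts, and dumping every transition is exactly what makes emulation slow.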
David Hsu, director of marketing for low power static and formal verification at Synopsys, had a similar take. “In our view, from a silicon perspective, you want to drive the power features of the silicon. That means power management techniques, and the interactions of all of those, and how they relate to each other. When you sign off each IP block, how do they interact? You have to do that from the very beginning. And from the bottom up, if you do have something that’s wrong, then signoff is not complete. If these problems ripple up, then certain modes may not be available to system designers, which means that if a wakeup mode didn’t work, it will not ship with the chip.”
Where to start
There are different opinions about exactly where to begin the power signoff process, but there is agreement that all of them are moving earlier in the design flow. Power needs to be considered early on, and the more complex and power-sensitive the design, the earlier the starting point. Moreover, power characterization and constraints need to be communicated across multiple engineering teams in their own languages—sometimes literally, if it is a multi-country development process.
“There’s one company developing an infotainment chip that has six different chips integrated into one,” said Aveek Sarkar, vice president of product engineering and support at Ansys-Apache. “It includes software-controlled radios, so they have to worry about the noise of one radio to another, and it goes through a silicon substrate to analog radios, so you have to do substrate analysis, as well. And then you have software guys who work with the package and chip guys, so it’s definitely important to worry about whether the IP will work and whether it will work in context. For example, you have to make sure that when you ship IP you embed rules that you cannot drop the supply voltage by more than 10% from what it was designed for.”
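A rule like the one Sarkar mentions can be expressed as a simple margin check that flags any rail sagging more than 10% below the voltage the IP was characterized for. The rail names, voltages, and the 10% droop limit below are illustrative assumptions:

```python
# Sketch of an IP integration rule check: flag supplies that droop
# more than 10% below the voltage the IP was designed for.
# Rail names and voltages are made up for illustration.

def check_supply_margins(measured_v, nominal_v, max_droop=0.10):
    """Return the rails whose measured voltage violates the droop limit.
    A missing measurement also counts as a violation."""
    violations = []
    for rail, nominal in nominal_v.items():
        measured = measured_v.get(rail)
        if measured is None or measured < nominal * (1.0 - max_droop):
            violations.append(rail)
    return violations

# vdd_core sags to 0.79 V against a 0.90 V nominal (limit: 0.81 V):
print(check_supply_margins(
    {"vdd_core": 0.79, "vdd_io": 1.75},
    {"vdd_core": 0.90, "vdd_io": 1.80}))  # -> ['vdd_core']
```

Shipping the limits with the IP, rather than burying them in a datasheet, is what lets an integration flow enforce them automatically.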
The interaction between different IP blocks is another challenge. The more IP, and the more possible ways to use that IP, the greater the challenge.
“There is a lot of energy being spent around the characterization of the timing budget,” said Frank Ferro, senior director of product marketing at Rambus. “So in the DFI (DDR PHY) space, there’s a big issue about whether one IP block is playing nicely with another IP block. To guarantee that happens requires some effort between our company, for example, and the one providing memory controllers. Really what you’re dealing with here is characterization of the interface and glue logic.”
Getting that wrong can have a profound impact on power, but understanding it well enough to be able to actually sign off on the power aspect of this is extremely difficult—particularly with multiple power domains being switched on and off across a chip.
While advanced chipmakers such as Broadcom, Qualcomm, STMicroelectronics and Freescale have been wrestling with power issues for many generations—and have actually become quite good at it—the shift from 65nm as a mainstream node to 40nm and 28nm also has turned power signoff into a mainstream problem.
“There is a class of customers that is expert in power,” said Mark Milligan, vice president of marketing at Calypto. “The big changes we see are in the middle of the market. There is no deep expertise, and they’re starting to do power analysis. For all of those folks, power is suddenly very important and they need to learn this all very quickly.”
He said the real problems come when these companies make power modifications or the IP providers make dynamic changes for them. That throws off power budgeting, making power signoff much more difficult.
But even on the leading edge, that expertise doesn’t always overcome the challenges of multiple power domains and skyrocketing complexity.
“At the leading edge, they may have dozens of power domains,” said Jerry Zhao, product director at Cadence. “You have to analyze current, which is very sophisticated. And those same customers may not understand all the sequences. They may understand which ones could do the most damage to the chip, but there may be 25 power domains and they only analyze some of those. There’s a reason why they do this. In one power domain you may have a block with 5 million instances and 200,000 switches, and you need optimization to make sure the ramp up is not too bad.”
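The ramp-up problem Zhao describes, waking a domain full of power switches without drawing too much rush current at once, is commonly handled by staggering the switch enables. A back-of-the-envelope sketch, where the per-switch inrush current and the current budget are hypothetical numbers:

```python
# Back-of-the-envelope sketch of staggering power-switch enables so a
# domain's rush current stays under budget. The per-switch inrush
# current and the budget are hypothetical values.

def stagger_schedule(n_switches, i_per_switch_ma, i_budget_ma):
    """Return (switches enabled per stage, number of stages) needed to
    wake the whole domain without exceeding the current budget."""
    per_stage = max(1, int(i_budget_ma / i_per_switch_ma))
    stages = -(-n_switches // per_stage)  # ceiling division
    return per_stage, stages

# A domain with 200,000 switches, 0.5 mA inrush each, 500 mA budget:
print(stagger_schedule(200_000, 0.5, 500.0))  # -> (1000, 200)
```

Even this crude model shows the optimization tension Zhao points to: fewer switches per stage means less rush current but a longer wake-up, which eats into the latency budget the software planned around.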
And while power signoff needs to be localized, for just this reason, it also needs to be done at the end of the design process at a full-chip level. “The boundary conditions might be different if it’s not a full chip,” said Zhao. And that’s a problem no one wants to wrestle with.