Energy Vs. Power: Energy, Power Optimization Is A System-Level Challenge

Last of three parts: Power issues are so pervasive that they touch the entire design team – and must be carefully managed as high in the design process as possible.

By Ann Steffora Mutschler
Power issues today, whether they are related to low power in a smart phone or highly efficient power for data center applications, are so pervasive that they touch the entire design team—and must be carefully prioritized at the system or architectural level.

As discussed in Part 1 of this series, energy and power are different quantities and must be understood separately. Only then can engineering teams apply design techniques to optimize one or the other in a system. Part 2 of this series addressed those techniques, including how the issues differ when optimizing for power versus energy efficiency.

For most engineering teams, the easiest place to start looking at either energy or power efficiency and prioritizing is on the hardware design side, because that is most familiar.

According to Chris Rowen, founder and CTO of Tensilica, engineering teams start the process of prioritizing by understanding what their constraints are. “Are they really interested in maximizing battery life or are they really, for example, working to stay within a thermal envelope? That will determine a lot of what they choose to do.”

Second, and increasingly well recognized by people designing battery-operated devices, is the central importance of looking at scenarios. “They say ‘I want to be able to play a full length video on a battery charge,’ or ‘I want to be able to listen for 100 hours to MP3s,’ or ‘I want to have this much talk time,’” he said. “Those all use, quite often, very different subsets of the hardware and run very different algorithms so you can’t lump them all together and say, ‘My chip dissipates 3 watts,’ or even, ‘My chip dissipates .3 mW per megahertz,’ because those are too coarse of a characterization. You have to say, when doing this task (decoding video, decoding audio, running the voice stack and wireless protocol), you have to characterize each of those things individually and therefore go through a fairly complete assessment.”

The assessment needs to determine which sections of the chip are active and what the flows are at the chip interface including the activity in the memories, and how much the power amplifiers are turned on. “All of those things will play a central role in assessing what the power is for the different scenarios. Anybody in the cell phone business or anybody who has battery life as a central concern is working on different scenarios and needs fairly detailed information,” Rowen noted.
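
A scenario-by-scenario assessment of this kind can be sketched as a small table of per-block power figures. The sketch below is illustrative only: the block names, the milliwatt figures and the battery capacity are hypothetical placeholders, not measurements from any real chip, and a real assessment would draw them from simulation or silicon.

```python
# Hypothetical average power per active block (milliwatts) for each scenario.
# Every number here is invented for illustration.
SCENARIOS_MW = {
    "video_playback": {"cpu": 120.0, "video_decoder": 200.0,
                       "memory": 150.0, "display": 400.0},
    "mp3_playback": {"cpu": 5.0, "audio_dsp": 10.0, "memory": 8.0},
    "voice_call": {"cpu": 40.0, "modem": 150.0,
                   "memory": 30.0, "power_amplifier": 300.0},
}

BATTERY_MWH = 5000.0  # assumed usable battery energy, in milliwatt-hours


def scenario_power_mw(blocks):
    """Sum the average power of every block that is active in a scenario."""
    return sum(blocks.values())


def battery_life_hours(blocks, battery_mwh=BATTERY_MWH):
    """Hours of operation = stored energy / average power in that scenario."""
    return battery_mwh / scenario_power_mw(blocks)


def report(scenarios=SCENARIOS_MW):
    """Return a battery-life estimate, in hours, for each scenario."""
    return {name: battery_life_hours(blocks) for name, blocks in scenarios.items()}


if __name__ == "__main__":
    for name, hours in report().items():
        print(f"{name}: {hours:.1f} h")
```

Because each scenario lights up a different subset of blocks, the estimates differ widely, which is why a single chip-level wattage figure is too coarse.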

These issues must be dealt with in both hardware and software, with a broad suite of techniques. From bottom to top, the first determinant is process technology—what the transistors are like, how leaky the process technology is, what the fundamental drive versus power is for the transistor family in question.

“Moore’s Law silicon scaling has helped enormously because we’ve been able to scale down voltage and scale down capacitance and scale up speed, so power and energy characteristics have been just terrific,” Rowen said. “That of course has slowed down. In particular, the leakage characteristics of these process technologies—that is, how much power is dissipated even when the transistors are off—becomes a lot worse. People have to work pretty hard on that question, and it has led to much more interest in, for example, power gating in these designs. When your scenario says that when you’re not using a particular subsystem within a chip that you not only stop doing work on it, you not only turn off all the clocks to it, but you actually remove the power to it, so it really does dissipate no power. It does mean that you have to plan ahead a little bit because restoring power to a subsystem typically takes more cycles than just turning the clocks back on. In the past you might have gone into standby. Now you go into hibernation.”
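
The planning-ahead tradeoff Rowen describes can be sketched as a break-even calculation. This is a minimal model under stated assumptions: it supposes that restoring power costs a fixed amount of energy (`restore_uj`) and a fixed wake latency, and all parameter names and figures are hypothetical.

```python
def idle_energy_clock_gated_uj(leakage_mw, idle_ms):
    """Clocks stopped but power still applied: the block keeps leaking.
    mW * ms = microjoules."""
    return leakage_mw * idle_ms


def idle_energy_power_gated_uj(restore_uj, idle_ms, residual_mw=0.0):
    """Power removed: (near) zero leakage, plus an assumed fixed restore cost."""
    return restore_uj + residual_mw * idle_ms


def break_even_idle_ms(leakage_mw, restore_uj, residual_mw=0.0):
    """Idle time beyond which power gating uses less energy than clock gating."""
    saved_per_ms = leakage_mw - residual_mw
    if saved_per_ms <= 0:
        return float("inf")  # gating never pays off
    return restore_uj / saved_per_ms


def should_power_gate(expected_idle_ms, leakage_mw, restore_uj,
                      wake_latency_ms, latency_budget_ms, residual_mw=0.0):
    """Gate only if the wake-up is fast enough and the idle period is long enough."""
    if wake_latency_ms > latency_budget_ms:
        return False
    return expected_idle_ms > break_even_idle_ms(leakage_mw, restore_uj, residual_mw)


if __name__ == "__main__":
    # Hypothetical block: 2 mW leakage, 50 uJ to restore power and state.
    print(break_even_idle_ms(2.0, 50.0))
    print(should_power_gate(100.0, 2.0, 50.0, wake_latency_ms=1.0, latency_budget_ms=5.0))
```

In this simplified model, short idle periods favor stopping the clocks only, while long ones favor removing power, provided the scenario can tolerate the slower wake-up.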

After process technology, the next level up in gross terms is the logic design and microarchitecture: what is going on cycle by cycle in a given block. Processors are particularly well understood in terms of their power characteristics, and they represent a significant fraction of the power. Processors and memory together tend to account for much of the power in many chips. Here, the goal is to get the work done with the smallest number of gates and the shortest wires in the design.

Following logic design, the key tradeoffs at the next level up—architecture—include the sequence of operations and how algorithms will be mapped into the cycle-by-cycle computation.

“You definitely want architectures that can cope with a multitude of things,” noted Pete Hardee, director of solutions marketing at Cadence. “We are starting to see those kinds of differences in the applications that we have to run. The macro architecture decisions that drives are typically that a multiprocessing architecture is needed and often it’s heterogeneous multiprocessing. I need different kinds of processing engines to be able to cope with those two different things. So I’ve got a range of stuff and I may well have more processors in the platform than I need, but that’s okay because I can shut them down when they’re not needed so they are not burning energy. I need a range of processors, a heterogeneous multiprocessor macro architecture, which enables me to cope efficiently with these different processing tasks I have to do. That multi-processing architecture gives me some different configurations where I can process really fast or I can use parallelism to do that or I can take my time in different circumstances—do it all on one processing engine and take a little longer.”
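
The choice Hardee describes, between processing fast, processing in parallel, or taking longer on one engine, can be sketched as picking the lowest-energy configuration that still meets a deadline. The configurations, throughputs and power figures below are hypothetical, and engines that are not in use are assumed to be shut down and to draw nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    ops_per_ms: float  # hypothetical throughput of the engines in use
    power_mw: float    # hypothetical active power of the engines in use


# Invented figures for three ways of running the same task.
CONFIGS = [
    Config("one small core", 100.0, 20.0),
    Config("one big core", 400.0, 200.0),
    Config("four small cores in parallel", 360.0, 80.0),
]


def run_time_ms(cfg, ops):
    return ops / cfg.ops_per_ms


def energy_uj(cfg, ops):
    """Energy = power * time (mW * ms = microjoules)."""
    return cfg.power_mw * run_time_ms(cfg, ops)


def pick_config(ops, deadline_ms, configs=CONFIGS):
    """Return the lowest-energy configuration that meets the deadline, or None."""
    feasible = [c for c in configs if run_time_ms(c, ops) <= deadline_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda c: energy_uj(c, ops))


if __name__ == "__main__":
    for deadline in (1000.0, 150.0, 95.0):
        choice = pick_config(36000, deadline)
        print(deadline, choice.name if choice else "no configuration fits")
```

With these made-up numbers, a relaxed deadline selects the single small core, a tighter one selects the parallel configuration, and only the tightest requires the big core.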

This comes to bear on the memory architecture, as well, in terms of caching, he said. “Can I predict what algorithm I’m going to need next? Or can I predict what data I need next, and where is that coming from? This affects decisions like preemption and also the caching algorithm that I need. So those are very much processing and memory architecture decisions that affect how efficiently the software is going to run in those different cases.”

While the hardware side of the system is challenging, the software aspect of today’s complex systems is equally so.

Mark Mitchell, director of embedded tools at Mentor Graphics, related an example. “I was talking to one of our internal engineering teams. They had a customer that had tracked a problem down in a satellite. Because of a software problem on the satellite they were using the memory hierarchy inefficiently and that was generating so much heat it was causing faults on the satellite.”

This is where system-level design and integration become quite obvious—particularly when things go wrong. “There was a hardware platform that was within specs, but when you have this sort of thing that on the ground in a controlled lab environment you would have been just fine,” Mitchell said. “Out there in space you’ve got a physical effect that was no good. The problem was that because of the way the system was configured, the cache wasn’t being used efficiently. As a result, the main processor had to go out to get data from the external RAM a lot more often than it should have. So the program is working, everything is operating correctly, but you’re generating all this traffic. One of the interesting things is that memory uses up a lot of power, so by getting the data from the external memory all of a sudden the power consumption and heat levels on the thing were going up significantly.”
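
The effect Mitchell describes can be approximated with a simple per-access energy model. The sketch assumes every access touches the cache and every miss additionally goes to external RAM; the per-access energy figures are hypothetical, chosen only to reflect that an off-chip access costs far more than a cache hit.

```python
def memory_energy_uj(accesses, miss_rate, cache_nj=0.1, external_nj=10.0):
    """Estimate memory energy in microjoules for a number of accesses.

    cache_nj and external_nj are assumed energies per access in nanojoules;
    they are placeholders, not data for any real memory system.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    cache_energy_nj = accesses * cache_nj
    external_energy_nj = accesses * miss_rate * external_nj
    return (cache_energy_nj + external_energy_nj) / 1000.0


if __name__ == "__main__":
    accesses = 1_000_000
    well_used = memory_energy_uj(accesses, miss_rate=0.02)
    poorly_used = memory_energy_uj(accesses, miss_rate=0.40)
    print(f"cache used well:   {well_used:.0f} uJ")
    print(f"cache used poorly: {poorly_used:.0f} uJ")
    print(f"ratio: {poorly_used / well_used:.1f}x")
```

The program computes the same results in both cases. Only the memory traffic, and with it the energy and heat, changes.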

Knowing how to prevent future problems by anticipating them from the very start might sound like a good idea. But complexity can make it much harder to practice.

“I’m not sure if anybody had ever said to the team designing this particular piece of hardware, ‘Hey here’s what your thermal limits are: this is the maximum temperature you can reach.’ It’s possible no one even thought to ask that question,” Mitchell observed. “If you do get a requirement like that and pushing it down, and to me the interesting thing is how it gets into the software, pushing it down to the software guys is really hard because software engineers don’t think about heat. They think about ones and zeros.”

He believes there are two problems here. One is raising awareness among software engineers that they have to think about these things. The second is getting visibility into the system, because you can’t see where the heat is coming from or why it is hotter than expected. These aren’t easy things to get intuition around.

To solve the satellite problem, Mentor’s Vista group in Israel built a virtual prototype of the board for the customer so they could get visibility into the system that isn’t possible on a satellite that’s actually flying. “Even on a ground model you would have a hard time because some of these things that you want to look at like the cache transactions aren’t exposed from real running hardware. You can’t see what’s going on, either in software or even with a physical probe connected to the device. So they were building software models that they could run that could communicate more information,” he said.

Still, at the end of the day the question of whether it is better to optimize for power and energy efficiency in software or hardware is not easily answered.

“The questions get very complicated and they really are application dependent. You can’t get to the right answer without understanding the application, the demands on the system, the overall performance requirements of the system both from timing performance and power performance to know what the right way is to do it,” said Jon McDonald, technical marketing engineer in Mentor’s design creation synthesis group. “A lot of people pick software just because they think software is simpler: ‘It’s easier; I can change it.’ But it’s going to take longer generally to do it in software and depending on what else the processor is doing. It may actually take more power to do it in software.”

To be sure, the whiteboard, block diagram and spreadsheet approach no longer works. Engineering teams today need a dynamic execution model that can run software together with hardware, give quantitative feedback on the system’s performance and power (and, from those, its energy), and support architectural decisions before implementation begins.

“It’s better to do it before you’ve made the decision between software and hardware. It’s better to do it at the system architecture level with abstract representations of the functions so that you can model things before you decided if an algorithm is hardware or software. At the transaction level I can take an algorithm, a C or C++ function and I can compile that to a target processor or I can wrap that with a transaction-level interface. I don’t have to change that function at all and I can create a model that represents that function and accurately predicts the power and performance of that function running on a particular ISS or running as a hardware accelerator interacting with the rest of the system. By doing the analysis at that level, it’s not a hardware problem. It’s not a software problem. It’s a system problem,” McDonald concluded.
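
The idea of leaving a function untouched and wrapping it with different target models can be illustrated loosely in Python. This is an analogy only, not a transaction-level modeling API or any Mentor tool: the `Target` cost models, their timing and power figures, and the `checksum` function are all hypothetical.

```python
from dataclasses import dataclass


def checksum(data):
    """The algorithm under study. It is never modified."""
    total = 0
    for byte in data:
        total = (total + byte) % 65536
    return total


@dataclass(frozen=True)
class Target:
    name: str
    ns_per_item: float  # assumed processing time per input item
    power_mw: float     # assumed active power while processing


@dataclass(frozen=True)
class Result:
    value: int
    time_us: float
    energy_nj: float


def run_on(target, func, data):
    """Run the unmodified function and attach the target's cost estimate."""
    value = func(data)
    time_us = len(data) * target.ns_per_item / 1000.0
    energy_nj = target.power_mw * time_us  # mW * us = nanojoules
    return Result(value, time_us, energy_nj)


# Invented cost models for the two implementation choices.
SOFTWARE = Target("software on processor model", ns_per_item=20.0, power_mw=50.0)
ACCELERATOR = Target("hardware accelerator model", ns_per_item=2.0, power_mw=30.0)

if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    for target in (SOFTWARE, ACCELERATOR):
        r = run_on(target, checksum, payload)
        print(f"{target.name}: value={r.value} time={r.time_us:.1f} us "
              f"energy={r.energy_nj:.0f} nJ")
```

Both wrappers return the same functional result, so the comparison isolates time and energy. Which target wins depends entirely on the cost figures plugged in, which is why the analysis has to be application specific.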