A Balancing Act

Fine-grain power management needs to be weighed against cost, competitiveness and the sheer difficulty of getting it right.


By Ann Steffora Mutschler
If you stay current on data center trends, you are well aware that Intel reported last June that energy proportionality has effectively doubled server efficiency and workload scaling beyond what Moore’s Law predicted.

What does this have to do with power management of SoCs? Cary Chin, director of marketing for low-power solutions at Synopsys, said that if you look at a very macro example of a data center it’s pretty clear what you are talking about: a 1,000X compute problem should take 1,000 times as much energy as a 1X problem, and zero throughput should take zero energy. When it comes to low-power SoC design, it’s clear that big strides have been made in the last 5 to 8 years, but what does it take to get to the idea of energy proportionality in an SoC?
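The proportionality Chin describes can be illustrated with a small numerical sketch. This is only a toy model, not anything from Intel or Synopsys: the function names, the unit energy cost and the fixed idle cost of 500 units are all hypothetical values chosen for illustration. It compares an ideal energy-proportional system with one that pays a fixed overhead no matter how much work it does.

```python
def proportional_energy(work_units, energy_per_unit=1.0):
    """Ideal case: energy scales linearly with work and is zero when idle."""
    return work_units * energy_per_unit


def fixed_overhead_energy(work_units, energy_per_unit=1.0, idle_energy=500.0):
    """Non-proportional case: a fixed idle cost (hypothetical value) is paid
    regardless of how much work is done."""
    return idle_energy + work_units * energy_per_unit


def energy_ratio(energy_fn, big=1000, small=1):
    """How many times more energy the big job uses than the small one."""
    return energy_fn(big) / energy_fn(small)


if __name__ == "__main__":
    # Proportional system: the 1,000X job costs 1,000 times the energy.
    print(energy_ratio(proportional_energy))
    # Fixed-overhead system: the ratio collapses because idle cost dominates.
    print(energy_ratio(fixed_overhead_energy))
```

In the proportional model the 1,000X job uses exactly 1,000 times the energy of the 1X job, and doing no work costs nothing. With the assumed fixed overhead, the same 1,000X job uses less than three times the energy of the 1X job, which is another way of saying that most of the energy spent on small workloads is wasted.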

“One of the things everyone involved in low power today would say is we’ve gotten pretty good at doing low-power in these big chunks,” he said. “The clearest example of what low power used to mean 10 years ago was that you would turn off the chip. That was really the one switch you had. You could apply power or you could not apply power, and that’s actually nice and proportional it turns out, but it doesn’t allow you to achieve anything beyond the whole chip is ‘on’ or ‘I’m not doing anything.’”

This approach—following the story about how full a jar is, first with rocks, then pebbles, sand and water—is one of the big rocks.

Peter McGuinness, director of marketing for multimedia products at Imagination Technologies, understands this keenly. “If you are filling a jar with materials then you can fill it with big rocks and you can then fill up all the gaps. But you could also go ahead and just fill it up with sand. However, in power engineering, and specifically with SoCs, that doesn’t work. You can’t achieve the maximum power efficiency just by going for fine-grain power management. You have to start with the big rocks, which are the architectural issues and the design issues that start before you’ve even begun the physical design of the chip.”

In an SoC environment, the scarce resources are memory bandwidth and the thermal envelope that you’re working within—those are the two big things that need to be addressed first, he said. “Managing the memory bandwidth utilization is the place that we generally start when architecting something.”

For example, in Imagination’s GPU design, a technique called tile-based rendering is used, which minimizes memory bandwidth utilization by bringing the majority of the rendering process to on-chip memory. That is the big rock, par excellence, because that cuts the power consumption enormously—both on die and also the power dissipated in the memory pins, McGuinness explained.

After that, “make sure that the arithmetic operations that you perform on chip are A) as efficiently used as possible. That means if you have a compute resource you make sure that it has the fewest possible idle cycles. And B) make sure that the arithmetic operations that you’re performing are actually appropriate to the task in hand,” he said.

Pete Hardee, low-power design solution marketing director at Cadence, agreed that the big rocks must be dealt with first. “It’s been a learning experience that you need to deal with those big rocks first because traditionally people haven’t done that. They’ve been relying on power optimization in the implementation tools, and then various gate sizing and other techniques come in. Multi-voltage threshold has to be able to use the minimum number of low-voltage threshold gates to be able to save on power. Then people started to look at clock gating and it became normal for synthesis tools to do a reasonable job of clock gating. If you look at those as the equivalent to maybe some of the water and the sand, clock gating is probably like the fine gravel. It’s only recently that we’ve really been thinking we need to do more than that. We need power domains with power islands with power gating and maybe some areas where we can apply dynamic voltage and frequency scaling. Those would be the bigger rocks. And once you reach the conclusion that you need those bigger rocks, you actually have to deal with those big rocks first. You can’t optimize those and decide that to do that you’ve got to design the architecture from the outset to be able to deal with that.”

Choosing the right technique
How do engineering teams choose the right fine-grain technique to use?

Chin said the best you can do is look at scenarios, run through verification and lots of simulation, and do as much power analysis as you can during these scenarios to try and figure it out. “It’s a hard problem. Gathering all the data, even in the case of relatively static behavior, is hard. And then when you throw on top of that the complexity of all the other stuff that’s going on, it’s a really tough problem.”

He said this is exactly where design engineers are struggling today. “Even if we could implement a finer-grain architecture, it’s not that we would know what to do. We have to kick around for a little while, gather some data and then say what are the issues that are going to come out that we need to avoid.”

Another consideration in using fine-grain power management is balancing cost and complexity against the real benefit, observed Gene Matter, senior applications manager at Docea Power. “It’s just like doing performance enhancement. Doing anything to a product has some associated cost. Part of the cost is material design cost in terms of resources, and usually if you are up against a schedule budget you have resource pressure and you can’t completely verify a design. It’s not that the design itself is complex. It is the verification that will be in your critical path.”

So using fine-grain power techniques comes down to benefit and cost. “If the benefit is not significant enough, or maybe it is significant but you can’t charge more or it doesn’t give you a boost in some other area, there are a lot of things that cost besides material cost. One is you may do it just to be competitive. You may do it because it gives you a competitive, differentiated value. It’s something you can do and none of your customers can, which you can then build into a marketing story or technology leadership story. Then you have cost associated with headcount: How long is it going to take to design this, what are the incremental resources for verification throughout the product lifecycle? Most times what squashes most fine-grain power techniques is, ‘We will design them in, we won’t guarantee them because we can’t verify them, and we will do verification afterwards,’” he asserted.

Looking ahead
New techniques promise even more, Cadence’s Hardee said, such as the concept of voltage islands, also called multi-supply voltage (MSV), where power domains with different voltage supplies are used. “Our recommendation to people using that technique is pick a reasonable chunk of design that you can apply that to and don’t make things overly complex in terms of the number of power domains you are using because it increases the need for level shifters.”

But what if you could actually optimize at a much finer grain?

“You could basically have power domains everywhere and you could lower the voltage for each piece of circuitry, maybe down to each critical path,” Hardee said. “You could run only critical paths at a higher voltage and on less critical paths you could lower the voltage. This would break our rules for reduction of complexity. You’d keep the power domains to a nice separable piece of circuitry, a nice well-defined block in your design, apply these power domains concepts to that. If you could go finer grain with that then you would of course have a tradeoff between the power savings versus the extra components you’d need in there, i.e., more level shifters. But because at the end of the day dynamic power is proportional to V², any reduction in the voltage we can get is a good thing. Yes, they do increase complexity, they are going to give pains for both verification and implementation, but when you’ve got customers who’ve tried all of the techniques, what do they do next to wring the next few microwatts out of their power budget?”
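Hardee’s tradeoff can be put into rough numbers. The sketch below assumes the standard first-order model for dynamic power, P = a * C * V² * f (activity factor, switched capacitance, supply voltage, clock frequency), and a deliberately simplistic fixed power cost per level shifter. The function names and the level-shifter figures are hypothetical, intended only to show the shape of the tradeoff, not to model any real process or tool flow.

```python
def dynamic_power(voltage_v, frequency_hz, capacitance_f, activity=1.0):
    """First-order dynamic power model: P = a * C * V^2 * f (watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def relative_saving(v_old, v_new):
    """Fraction of dynamic power saved by lowering the supply from v_old
    to v_new, with frequency, capacitance and activity held constant."""
    return 1.0 - (v_new / v_old) ** 2


def net_saving_w(domain_power_w, v_old, v_new, n_level_shifters, shifter_power_w):
    """Saving from the lower voltage minus an assumed fixed power cost per
    level shifter (hypothetical overhead model). Negative means a net loss."""
    saved = domain_power_w * relative_saving(v_old, v_new)
    overhead = n_level_shifters * shifter_power_w
    return saved - overhead


if __name__ == "__main__":
    # Dropping a domain from 1.0 V to 0.9 V saves about 19% of dynamic power.
    print(relative_saving(1.0, 0.9))
    # A large domain can absorb the level-shifter overhead...
    print(net_saving_w(1.0, 1.0, 0.9, n_level_shifters=100, shifter_power_w=0.001))
    # ...while a small one with the same number of crossings cannot.
    print(net_saving_w(0.1, 1.0, 0.9, n_level_shifters=100, shifter_power_w=0.001))
```

Because power scales with the square of voltage, a 10 percent voltage reduction yields roughly a 19 percent dynamic power saving. Under the assumed overhead figures, that saving survives in a large domain but turns negative in a small one with the same number of domain crossings, which is why the granularity of power domains has to be weighed against the level shifters it requires.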

Mark Dunn, VP engineering for IMGworks, the SoC development group at Imagination Technologies, agreed. “The reason that people are prepared to go to these lengths is because the ultimate performance of a mobile device is now completely dominated by the power envelope. So the main way to get a gain is to make the design more power-efficient so more performance can be gained from the same thermal limits.”

That seems to be a common viewpoint.

“When we look at where we are going to head from where we are today, we need to think long-term about where we want to go and then understand where we are, and introduce the right little pieces in the right spots. Today we are heading in the direction of more IP—the idea of better hierarchical handling of these things. Those are all great ways of extending low power to more and more complex designs, which is what we want to do,” Synopsys’ Chin concluded.