Trading Off Power And Performance


By Ann Steffora Mutschler
There is no shortage of opinions when it comes to the topic of performance and power tradeoffs. From abstracting the task from engineers to process considerations, engineering teams have a number of tools and approaches at their disposal to make the optimal design choices for their application.

Take the MCU application space, for instance. Ken Dwyer, director of applications for the MCU business line at NXP, explained that for its new family of ARM Cortex-M0+ MCUs (LPC800), there is a target to reduce the active current from generation to generation on these parts. “We set targets that are somewhat realistic and aggressive and do a lot of work to get the microamps/megahertz down below 100. Our next target is 70 microamps/megahertz. It’s not just the core, it’s the analog, the SRAM, the flash and all that, and how they interact with each other on the system. There are always innovative ways to reduce it and we are always trying new techniques to make that happen.”

He pointed out the customer doesn’t necessarily see that in the user’s manual or in the register set that allows them to achieve this, because it is actually part of the nature of the silicon. “Some people might say, ‘You can use this register and reduce power even more.’ That is the case in some situations, but there’s also stuff done inside you cannot control that’s automated.”

For example, in the case of clock gating, which is well understood, NXP takes it up a notch, Dwyer explained. “When a peripheral is being used on an ARM system there is a thing called an AHB or APB [the Advanced High-performance Bus and the Advanced Peripheral Bus], and interconnect to them is well understood and well defined. Just turning off a clock to the peripheral is one way of saving power, but if you have a peripheral that you can actively enable and disable while you are using it, you can save even more power.”
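The peripheral-level gating Dwyer describes can be sketched in miniature. The register and bit names below are invented for illustration and are not taken from any actual NXP part; the model simply shows the pattern of ungating a peripheral's bus clock only for the duration it is actually in use.

```python
# Illustrative sketch of peripheral clock gating against a simulated
# clock-control register. All names and bit positions are hypothetical.

UART0_CLK_BIT = 1 << 3  # invented bit position for a UART clock enable


class ClockController:
    """Models a system clock-control register (an AHB/APB clock gate)."""

    def __init__(self):
        self.sysahbclkctrl = 0  # all peripheral clocks gated off at reset

    def enable(self, bit):
        self.sysahbclkctrl |= bit    # ungate the peripheral's bus clock

    def disable(self, bit):
        self.sysahbclkctrl &= ~bit   # gate the clock again to save power

    def is_enabled(self, bit):
        return bool(self.sysahbclkctrl & bit)


def send_byte(clk, data):
    """Ungate the UART clock only while the transfer is in flight."""
    clk.enable(UART0_CLK_BIT)
    # ... write `data` to the (simulated) UART data register here ...
    clk.disable(UART0_CLK_BIT)


clk = ClockController()
send_byte(clk, 0x55)
print(clk.is_enabled(UART0_CLK_BIT))  # False: clock gated off again after use
```

The saving comes from the last step: the peripheral's clock tree only toggles while the transfer is active, rather than whenever the device is powered.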

He acknowledged how complicated it is to trade off power and performance. As such, NXP has created ROM code called Power Profiles to make these tradeoffs easier by allowing users to select what they want to achieve: lower current, highest performance, or a middle ground that makes the most efficient use of performance and power.

“We have APIs that allow the software engineer and the developer to take our parts and use these Power Profiles in that manner. One day in these products you might want the maximum performance for this device and to run at the highest MIPS performance, and there’s an API for that and it’s done automatically. Then if you want, on the same products, or even in the same application, to have the lowest power consumption, you can switch between performance (fastest code execution) and power, and you can switch back and forth between the two. The other way of doing that is to give a lengthy list of registers and try to let the customer go at it themselves.”
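A Power Profiles-style API might look roughly like the sketch below. The three profile names echo the choices described above, but the function, the settings it returns, and every number in it are hypothetical illustrations, not NXP's actual ROM API.

```python
# Hypothetical sketch of a profile-selection API in the spirit of the
# Power Profiles described above. All settings and values are invented.
from enum import Enum


class PowerProfile(Enum):
    LOW_CURRENT = "low-current"   # minimize active current
    EFFICIENT = "efficient"       # middle ground: best perf per watt
    PERFORMANCE = "performance"   # fastest code execution


def set_power_profile(profile):
    """Return the (invented) regulator and flash settings one such API
    might program for each profile in a single call."""
    settings = {
        PowerProfile.PERFORMANCE: {"vdd_mv": 1800, "flash_wait_states": 0},
        PowerProfile.EFFICIENT:   {"vdd_mv": 1500, "flash_wait_states": 1},
        PowerProfile.LOW_CURRENT: {"vdd_mv": 1200, "flash_wait_states": 2},
    }
    return settings[profile]


# Switching back and forth at runtime, as the quote describes:
fast = set_power_profile(PowerProfile.PERFORMANCE)
frugal = set_power_profile(PowerProfile.LOW_CURRENT)
```

The design point is that the caller names an intent and the ROM code coordinates the interacting knobs (supply, wait states, clocks), instead of the customer juggling a lengthy list of registers.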

On the process side of the equation, there are two aspects, noted David Jacquet, principal engineer at STMicroelectronics. “The first one is how much silicon area can I get. It’s clear that for any application you have a constraint of cost, so usually people have a silicon budget. And today, what we know from a CPU or a GPU point of view is that if we increase the number of CPUs, thanks to DVFS (dynamic voltage and frequency scaling) you will have better energy efficiency, but of course this increases the number of cores, meaning it adds silicon.”

As such, the first tradeoff people will do is a balancing act between the number of cores and what they target. “Once the decision is taken, and let’s say the decision is to go for quad core, your application environment is now very important to find the best tradeoff between leakage and dynamic power. You know that leakage is linked to temperature, but at the end what counts is the total power. You have to find a tradeoff between your leakage and the dynamic power. It means that for a given performance, you may say, ‘Okay, it’s good. I can reduce the supply voltage, but the implementation will have more leakage. But if I reduce my supply voltage, I reduce my dynamic power, so the total power might be lower.’ In other conditions, if the temperature is very high, your leakage might be significant, and you can be in a situation where trading higher leakage for a lower supply voltage, and thus lower dynamic power, is no longer interesting.”
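The balancing act Jacquet describes can be made concrete with a first-order model: dynamic power scales as C·Vdd²·f, while leakage grows steeply with temperature. All coefficients below are invented purely to illustrate how the winner can flip between a low-leakage/high-Vdd implementation and a leakier/low-Vdd one as temperature rises.

```python
# Toy model of the leakage-vs-dynamic-power tradeoff. The numbers are
# invented for illustration; this is not characterization data.

def dynamic_power_w(c_eff_f, vdd_v, freq_hz):
    """First-order dynamic power: P_dyn = C_eff * Vdd^2 * f."""
    return c_eff_f * vdd_v ** 2 * freq_hz


def leakage_power_w(i_leak_a, vdd_v, temp_c, ref_c=25.0, doubling_c=20.0):
    """Toy leakage model: reference leakage current times Vdd, roughly
    doubling every `doubling_c` degrees C (illustrative only)."""
    return i_leak_a * vdd_v * 2.0 ** ((temp_c - ref_c) / doubling_c)


# Two hypothetical implementations of the same core at 500 MHz:
#   A uses higher-Vt cells (low leakage) but needs a higher supply;
#   B uses lower-Vt cells (leakier) and meets timing at a lower supply.
C_EFF, FREQ = 1e-9, 500e6

def total_a(temp_c):
    return dynamic_power_w(C_EFF, 1.1, FREQ) + leakage_power_w(0.02, 1.1, temp_c)

def total_b(temp_c):
    return dynamic_power_w(C_EFF, 0.9, FREQ) + leakage_power_w(0.20, 0.9, temp_c)


# Cool conditions favor the lower supply; hot conditions favor low leakage.
print(total_a(25.0), total_b(25.0))    # B's total is lower when cool
print(total_a(105.0), total_b(105.0))  # A's total is lower when hot
```

This is the crux of the quote: the same supply-voltage reduction that lowers dynamic power can lose on total power once temperature drives leakage up.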

This is something that must be chosen at the beginning because it’s totally linked to the implementation…except in the case of an FD-SOI (fully depleted silicon-on-insulator) process. “In FD-SOI, you can play with the body bias, meaning you can play with the Vt of the transistor. It means that depending on the temperature conditions you have to handle (you may have a fan or no fan, etc.), you may have to do different tradeoffs between leakage and dynamic power. But with FD-SOI and body bias, you can really do that dynamically. It means that you can really take into account the real conditions your system has been designed for. Imagine you have a set-top box application. In some cases you may have a fan, so the temperature will be lower, so you can do more forward body bias and you can offer your customer more performance. On the other hand, if you do a low-cost application you may have no fan, so in this case you will probably do less forward body bias because it will be more efficient.”

Giorgio Cesana, marketing and communications director for ST’s R&D department, explained further, “With traditional bulk technology, you have to define precise specifications and the boundaries for the implementation up front. During the implementation and optimization of the cores, of course you have to account for the different margins that you need to be functional and with good reliability in all the conditions at the boundaries. With FD-SOI, we have this unique capability of playing with what we call body bias, which is the ability to modify the threshold voltage of the transistor dynamically. With that, and thanks to some sensors—we can have temperature sensors, voltage sensors, or process sensors—we can dynamically modify the threshold voltage of the transistor and play with these tradeoffs to find the optimal point between static and dynamic power consumption, so as to always be at the optimal operating point for the core.”
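The sensor-driven adjustment Cesana describes can be sketched as a simple control policy: read a die sensor, then pick how much forward body bias (FBB) to apply. The thresholds and bias values below are invented; a real FD-SOI power manager would derive them from silicon characterization.

```python
# Hedged sketch of dynamic body biasing driven by an on-die temperature
# sensor. Thresholds and bias magnitudes are invented for illustration.

def choose_body_bias_mv(temp_c, max_fbb_mv=300):
    """More forward body bias lowers Vt (faster but leakier), so apply
    more of it when the die is cool and back off as temperature rises."""
    if temp_c < 40.0:
        return max_fbb_mv        # cool (e.g. a fan is present): full FBB
    if temp_c < 85.0:
        return max_fbb_mv // 2   # warm: moderate FBB
    return 0                     # hot: no FBB, keep leakage in check


def bias_control_step(read_temp_sensor, apply_bias):
    """One iteration of the sensor-to-bias loop: sample, decide, apply."""
    apply_bias(choose_body_bias_mv(read_temp_sensor()))
```

In practice the same structure extends to voltage and process sensors, each feeding the decision of where to sit between static and dynamic power.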

And then there are those who don’t see much need for performance and power tradeoffs at all. Bernard Murphy, CTO at Atrenta, asserted, “There really isn’t that much tradeoff required usually. If you look at the mobile market—smartphones and things of that ilk—there is plenty of performance. Power and functionality are really the big concerns. Also, you have a very heterogeneous kind of compute environment in one of those devices. You have the compute [tasks] distributed between lots of different CPUs and accelerators, so the classical CPU power versus performance tradeoff really gets washed out. And again, you’re not, at least today, doing intensive gaming on a smartphone. You’re much more running things that aren’t very performance-intensive, so most of the focus is on power savings.”

The one caveat to his viewpoint concerns high-performance SoC designs, where power estimation is needed at the architectural level but is difficult to do accurately before implementation. “Traditional power estimation looks at a physical-free view of a design, and we are finding that that’s no longer a valid thing to do. Surprise, surprise—you have to look at the impact of placement and estimated interconnect even at the architectural level to get reasonable estimates of power. So that’s kind of a generic problem, independent of the domain you’re designing for.”

At the end of the day, no matter the perspective, there is still a lot of innovation and work happening on both the design and process sides to address the challenge of leakage at smaller geometries, balanced against the specific performance needs of specific applications.