Power Gating And Power-Centric Programming

Nothing is straightforward, and workarounds have power and performance penalties.


By Pallab Chatterjee
SoC design has a number of techniques for power management. One of the more prevalent methods is power gating, which turns blocks on and off based on the applications being run and on mode controls. Power gating, although supported by the two major EDA power design flows, UPF and CPF, still has some implementation challenges.

The flows have to make sure that the states of the logic at the interface to the blocks being turned off do not get corrupted by changes on the shared ground and supplies. Basic power gating is well known. Its use in systems with multiple power supplies or multiple logic thresholds is harder. Power gating requires the outputs of the switched gates to be isolated from the control signals on the inputs, and also requires that the outputs be clamped at some state: low, high or “last value.”
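The three clamp options can be pictured with a small behavioral model. This is only a sketch: the class and method names (Clamp, IsolationCell, output) are hypothetical and come from neither UPF nor CPF, and signals are reduced to the integers 0 and 1.

```python
from enum import Enum


class Clamp(Enum):
    """Clamp policies named in the text (identifiers are illustrative)."""
    LOW = "low"
    HIGH = "high"
    LAST = "last value"


class IsolationCell:
    """Behavioral stand-in for an isolation cell on a gated block's output."""

    def __init__(self, clamp):
        self.clamp = clamp
        self._last = 0  # most recent value seen while the block was powered

    def output(self, data_in, isolate):
        # While isolation is off, pass the data through and remember it.
        if not isolate:
            self._last = data_in
            return data_in
        # While isolation is on, ignore data_in and drive the clamp value.
        if self.clamp is Clamp.LOW:
            return 0
        if self.clamp is Clamp.HIGH:
            return 1
        return self._last


cell = IsolationCell(Clamp.LAST)
cell.output(1, isolate=False)            # block is on, value passes through
held = cell.output(0, isolate=True)      # block is off, last value is held
```

The point of the model is that the downstream, always-on logic never sees the floating output of the switched block, only the clamp value.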

The power gating function reduces the logic level swing because of the IR drop across the “on” device between the logic cell and the power supply or ground. Gate bias and level-shifting to a second set of power rails to drive the gate buffer control logic allow the power gating devices to have a reduced IR drop to the virtual supply (VVDD). Timing construction for this type of function, however, is transparent to the UPF/CPF design flows.
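The swing reduction follows directly from Ohm's law across the switch. The sketch below assumes a lumped on-resistance for the power gating device and treats a stronger gate bias simply as a lower on-resistance. The function names and all numbers are illustrative, not taken from any library or process.

```python
def virtual_supply(vdd, i_load, r_on):
    """Voltage on VVDD behind a header switch with on-resistance r_on (ohms)."""
    return vdd - i_load * r_on


def logic_swing(vdd, i_load, r_on_header, r_on_footer=0.0):
    """Swing left for the logic when header and footer switches both drop voltage."""
    return vdd - i_load * (r_on_header + r_on_footer)


# Illustrative values only: 1.0 V rail, 50 mA block current.
nominal_bias = virtual_supply(vdd=1.0, i_load=0.05, r_on=2.0)
boosted_bias = virtual_supply(vdd=1.0, i_load=0.05, r_on=0.5)

# A lower on-resistance leaves VVDD closer to the real rail.
improvement = boosted_bias - nominal_bias
```

With these assumed values the boosted gate recovers roughly 75 mV of swing, which is the effect the second set of rails is meant to buy.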

A workaround for the logic that has to interface with power-controlled blocks is to use state retention registers. This solution carries a significant area and performance penalty, as it requires a formal and powered-on register bank for each I/O-facing logic block in the sub-block. The gate count is expensive for full state coverage, and partial state coverage has validation issues. There is an additional cost in power and latency. The latency comes from loading and unloading the software state for save and restore.

To address these issues, designers can use an enhanced DFF with a connection to the always-on retention register power supply. This cell would have to support save, hold, restore and normal operation functions. UPF and CPF do not always work directly with these non-RTL states, which affects the validation flow. A further challenge is the functional planning and implementation of set and reset signals through the retention registers, and the impact of those signals on the data being held for the “off” blocks.
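The four functions of such a cell can be sketched as a behavioral model. This is an assumption-laden illustration, not a description of any vendor's cell: the class RetentionFlop and its methods are hypothetical, the main stage loses its value (modeled as None) when the switched supply is removed, and the shadow stage sits on the always-on supply.

```python
class RetentionFlop:
    """Toy model of a DFF with an always-on shadow latch."""

    def __init__(self):
        self.q = 0           # main stage, on the switched supply
        self.shadow = 0      # retention stage, on the always-on supply
        self.powered = True

    def clock(self, d):
        """Normal operation: capture d while the block is powered."""
        if self.powered:
            self.q = d

    def save(self):
        """Save: copy the main stage into the shadow latch before power-down."""
        if self.powered:
            self.shadow = self.q

    def power_down(self):
        """Hold: the main stage is lost, the shadow latch keeps its value."""
        self.powered = False
        self.q = None

    def power_up(self):
        """Supply returns, but q stays undefined until restore is called."""
        self.powered = True

    def restore(self):
        """Restore: copy the shadow latch back into the main stage."""
        if self.powered:
            self.q = self.shadow


flop = RetentionFlop()
flop.clock(1)
flop.save()
flop.power_down()
flop.power_up()
flop.restore()
```

The window between power_up and restore, where q is undefined, is the kind of non-RTL state that the validation flow has to account for.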

ARM, in the Cortex-M class products, has implemented low-cost state retention using sub-period clocks and secondary power supplies for the retention devices. These sub-period clocks allow Set and Reset functions to occur asynchronously to the system clock. The logic blocks are generally built using clocks from a DVFS control system.

The challenge in using these blocks is not only to integrate them into the timing flow of the circuits, but also to make sure that the retention registers can safely provide data, at the correct logic level, to the blocks that are on. As application programs gain control of the power gating function, simple state machine-based control for these registers is not sufficient. Programming optimizations at the high-level language level now interact more with the data flow of each block. As a result, environments such as OpenCL, which sends tasks to both distributed CPUs and GPUs through common and segmented memory controls, have a great deal of impact on when blocks are on or off. Normally, a compute task that has no output view is contained just in the CPU signal path, and the GPU can be powered down. Under OpenCL, it is possible to have this task sent to both the CPU and the many threads of the GPU and then combine the results in central memory. This has an impact on the power control, because to achieve the performance enhancement of the extra computation capability you cannot tolerate the latency of a turn-on, reset or restore, and then store and turn-off cycle of the GPU. This latency is typically longer than the compute cycle.
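The latency argument reduces to a simple comparison. The sketch below is a hypothetical helper, not part of OpenCL or any runtime, and the timing figures are invented for illustration. It only encodes the rule that sharing a task with a gated GPU pays off when the shared run time plus the power-cycle overhead is still shorter than running on the CPU alone.

```python
def offload_pays_off(cpu_only_us, shared_us, wake_us, sleep_us):
    """Return True if waking the GPU for this task beats CPU-only execution.

    cpu_only_us: task time on the CPU alone
    shared_us:   task time when split across CPU and GPU
    wake_us:     turn-on plus reset or restore latency of the GPU
    sleep_us:    store plus turn-off latency of the GPU
    """
    return shared_us + wake_us + sleep_us < cpu_only_us


# GPU is power gated: the power cycle dominates the compute time.
gated = offload_pays_off(cpu_only_us=2000, shared_us=500,
                         wake_us=3000, sleep_us=1000)

# GPU is already on: no power-cycle overhead to pay.
already_on = offload_pays_off(cpu_only_us=2000, shared_us=500,
                              wake_us=0, sleep_us=0)
```

Under these assumed figures the same task is worth offloading only when the GPU is already powered, which is why the on/off decision can no longer be left to a simple state machine.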

Design verification is still hampered by the fact that none of the logic verification environments can model these turn-on and turn-off state transitions as the power supplies change under application software control. The simulations are instead based on timing for the power supply control switch transitions, and on RC-load estimates of when a block is or is not available.
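An RC-load estimate of this kind can be sketched with a first-order charging model. Treating the whole block as one lumped resistance and capacitance is an assumption made here for illustration, as are the function names and the 90 percent readiness threshold.

```python
import math


def time_to_ready(r_switch, c_block, vdd, v_ready):
    """Time for the virtual rail to charge from 0 V to v_ready.

    First-order RC model: v(t) = vdd * (1 - exp(-t / (r_switch * c_block))).
    """
    if not 0 < v_ready < vdd:
        raise ValueError("v_ready must lie strictly between 0 and vdd")
    return -r_switch * c_block * math.log(1.0 - v_ready / vdd)


def is_available(t, r_switch, c_block, vdd, v_ready):
    """Simulator-style check: is the block usable t seconds after turn-on?"""
    return t >= time_to_ready(r_switch, c_block, vdd, v_ready)


# Illustrative values: 10 ohm switch network, 1 nF block, ready at 90% of VDD.
t_ready = time_to_ready(r_switch=10.0, c_block=1e-9, vdd=1.0, v_ready=0.9)
```

A fixed estimate like this is what the simulations fall back on. It cannot follow supply changes that depend on what the application software does at run time.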