Clean Your Clock

Clock gating is low-hanging fruit, but you need to pay attention to efficiency metrics.


Lowering power consumption seems to be on every designer’s mind these days. And yet when asked about applying low-power design techniques, many engineers respond, “Well, we do clock gating … and that’s about it.”

Clock gating is low-hanging fruit when it comes to low-power design. Clock gating is also well automated, as witnessed by capabilities in modern logic synthesis tools. These synthesis tools do a fine job inserting clock gating cells at registers with enable conditions, where timing and other constraints allow insertions to take place. But is this really all there is to clock gating? No. It turns out that clock gating efficiency (CGE) is really the metric that should be examined.

Clock gating synthesis tools take a register transfer level (RTL) description of the design and output a gate-level netlist, complete with clock gating cells that shut down register clocks as a function of enable conditions. However, “not every enable is a good enable”: a clock enable can be inefficient at gating the clock to its registers. Judging the quality of clock gating from the clock gating coverage, or “static CGE” (i.e., the percentage of registers in the design that have been clock gated), is not good enough. Dynamic CGE, the percentage of time that register clocks are actually shut off, is a much better indication of the quality of clock gating in the design. Even so, a single overall dynamic CGE number doesn’t tell designers where to focus their efforts on lowering the actual power consumption.
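To make the difference between the two metrics concrete, here is a minimal Python sketch. The `Register` record, its field names, and the activity numbers are hypothetical and do not come from any particular tool; the sketch also assumes that the overall dynamic CGE is the share of all register clock cycles that were shut off, which is only one reasonable way to aggregate it.

```python
from dataclasses import dataclass


@dataclass
class Register:
    name: str              # hypothetical register name
    gated: bool            # True if a clock gating cell drives its clock
    cycles_total: int      # clock cycles observed in simulation
    cycles_clock_off: int  # cycles in which its clock was shut off


def static_cge(registers):
    """Clock gating coverage: fraction of registers that are clock gated."""
    return sum(r.gated for r in registers) / len(registers)


def dynamic_cge(registers):
    """Fraction of all register clock cycles that were actually shut off
    (assumed aggregation: every register weighted equally per cycle)."""
    total = sum(r.cycles_total for r in registers)
    off = sum(r.cycles_clock_off for r in registers)
    return off / total


# Made-up activity data: three of four registers are gated,
# but only one of them spends most of its time with the clock off.
regs = [
    Register("ctrl_q", gated=True, cycles_total=1000, cycles_clock_off=900),
    Register("data_q", gated=True, cycles_total=1000, cycles_clock_off=60),
    Register("stat_q", gated=True, cycles_total=1000, cycles_clock_off=40),
    Register("free_q", gated=False, cycles_total=1000, cycles_clock_off=0),
]

print(f"static CGE:  {static_cge(regs):.0%}")
print(f"dynamic CGE: {dynamic_cge(regs):.0%}")
```

With this invented data the coverage looks healthy while the clocks are off only a quarter of the time, which is the gap between static and dynamic CGE described above.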

As shown in Figure 1, the dynamic enable efficiency is important, but so is the amount of clock power downstream of the clock gating cell.

 

Fig. 1

There is interplay between dynamic CGE and downstream clock power. If dynamic CGE for a given enable is low, but the downstream clock power is also low, the effort spent to improve the efficiency of that enable condition will not result in significant power savings. If, on the other hand, dynamic CGE is low but the downstream clock power is high, that enable condition becomes the prime candidate to focus the design effort.

The most accurate analysis of clock power consumption can be performed only after the clock tree synthesis (CTS) step in the physical design flow. But by that time, it is usually too late to go back to the RTL design description and make changes to reduce power. Running the full synthesis and CTS flow to measure the effect of each RTL change with gate-level power analysis significantly lengthens the design schedule. Therefore, reasonably accurate clock power analysis is needed early in the design process, at RTL, so that design engineers can be more productive and efficient in reducing power.

To achieve a high degree of accuracy, consistency, and predictability in clock power consumption analysis, RTL power tools must rely on physical models of the clock tree and wire capacitance. Equipped with accurate clock power consumption figures, designers can develop an actionable metric: the ratio of downstream clock power to CGE for each enable condition in the RTL that corresponds to a clock gating cell inserted during synthesis. The ratio is highest where a clock gating cell with an inefficient enable condition drives a large amount of downstream clock power, which is where the potential to reduce overall power consumption is greatest.

A comprehensive report, such as the one shown in Figure 2 in spreadsheet format, allows designers to view key power reduction opportunities based on the figure of merit defined above. Sorting the report by the figure of merit (highest value first) gives a prioritized list of these opportunities, complete with pointers to the RTL source file and line number.
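The ranking described above can be sketched in a few lines of Python. Everything here is illustrative: the `EnableReport` fields, the enable names, the `file:line` pointers, and the power figures are invented, and treating an enable with zero dynamic CGE as having an infinite figure of merit is an assumption made only to avoid dividing by zero.

```python
from dataclasses import dataclass


@dataclass
class EnableReport:
    enable: str           # enable condition in the RTL (hypothetical name)
    source: str           # RTL file:line pointer (hypothetical)
    dynamic_cge: float    # fraction of time the gated clock is off, 0..1
    downstream_mw: float  # clock power downstream of the gating cell, in mW

    @property
    def figure_of_merit(self):
        """Ratio of downstream clock power to dynamic CGE."""
        if self.dynamic_cge == 0.0:
            # Assumption: an enable that never gates ranks first.
            return float("inf")
        return self.downstream_mw / self.dynamic_cge


def prioritize(rows):
    """Sort enable conditions by figure of merit, highest value first."""
    return sorted(rows, key=lambda r: r.figure_of_merit, reverse=True)


rows = [
    EnableReport("cfg_update", "cfg.v:88", dynamic_cge=0.05, downstream_mw=0.1),
    EnableReport("dma_active", "dma.v:120", dynamic_cge=0.90, downstream_mw=5.0),
    EnableReport("fifo_wr_en", "fifo.v:42", dynamic_cge=0.10, downstream_mw=4.0),
    EnableReport("alu_valid", "alu.v:17", dynamic_cge=0.20, downstream_mw=2.0),
]

for r in prioritize(rows):
    print(f"{r.enable:<12} {r.source:<10} "
          f"CGE={r.dynamic_cge:.0%} power={r.downstream_mw} mW "
          f"FoM={r.figure_of_merit:.1f}")
```

In this made-up data set, `cfg_update` has the least efficient enable but lands at the bottom of the list because it drives very little downstream clock power, while `fifo_wr_en` combines a poor enable with high downstream power and comes out on top.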

 

Figure 2: Comprehensive Clock Gating Efficiency Report with Clock Power Consumption


In many modern designs, clock gating coverage already exceeds 95%, so adding enables to a few more registers brings only small power savings. A comprehensive dynamic CGE report, including downstream clock power consumption for each clock enable condition, allows designers to address inefficiencies in clock gating and drive down the overall power of designs quickly, enabling you to ‘clean your clock’.


