Power Domain Implementation Challenges Escalate

More power domains are adding to chip complexity. Doing more throughout the design flow can help to limit power and avoid debug issues.

The number of power domains is rising as chip architects build finer-grained control into chips and systems, adding significantly to the complexity of the overall design effort.

Separate power domains are an essential ingredient in partitioning different functions. This approach allows different chips in a package, and different blocks in an SoC, to continue running with just enough power to meet their needs, and to be turned off whenever possible. But it’s becoming more complex from the outset of the design process, starting with a determination of how many power domains are needed.

“It is harder than ever to determine how many power domains are needed in order to hit the power budget,” said Preeti Gupta, head of PowerArtist product management at Ansys. “For mobile handheld devices, where we need to be especially mindful of conserving power because they are operating out of a battery compared to a wall power device, there is a lot more complexity in terms of power domain implementation. Today’s leading mobile devices contain hundreds of power domains, so the overhead is by no means trivial. One vice president of engineering recently said it would be great if, early on in the design flow, there was a good mechanism to estimate how many power domains were really needed in order to hit the power budget.”

The primary objective of using multiple power domains is to save power. “You try to divide your design into different regions where some of the regions may have a different voltage supply level, or can individually be shut down, powered down, or powered up,” said Henry Chang, senior director of product management at Siemens EDA. “If some of the logic is not needed at a certain time, it can be shut down to save power. The first thing the physical designer needs to do to implement multiple power domains is, in addition to the logic design netlist, provide the power intent in CPF or UPF. For example, UPF is used to define which portion of the design should be in power domain one, another definition will indicate what will be in switch power domain two, with the rest of it in always-on power. Once that is defined, that must be reflected in the physical domain, as well. That’s the first step — mapping from the power intent into the physical domain.”

The next step is reserving space in each power domain for a power switch cell, which will receive a signal to turn it on, or to shut down the power. “The power domain cell cannot be just one cell, because one cell would not be enough to drive a large power domain,” Chang said. “There are several ways to design this, and it is up to the architect or designer to define what’s best for the design. They may decide to put all of the power domains at an edge, or they may place them like a checkerboard throughout the design. They need to make sure there are enough power switching cells to drive all the power when it is on.”
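As a rough illustration of the sizing question Chang raises, the sketch below estimates a lower bound on the number of switch cells from a domain's peak current. The function name, the per-cell current rating, and the margin parameter are hypothetical, and a real flow would also weigh IR drop across the switches, area, and placement style.

```python
import math


def min_switch_cells(domain_peak_current_ma: float,
                     per_switch_current_ma: float,
                     margin: float = 0.2) -> int:
    """Smallest switch-cell count that carries the peak current plus margin.

    All inputs are illustrative assumptions, not values from a real library.
    """
    if domain_peak_current_ma < 0 or per_switch_current_ma <= 0 or margin < 0:
        raise ValueError("currents and margin must be non-negative, "
                         "and per-switch current must be positive")
    required_ma = domain_peak_current_ma * (1.0 + margin)
    # Round up: a fractional cell is not available.
    return math.ceil(required_ma / per_switch_current_ma)


if __name__ == "__main__":
    # Hypothetical domain drawing 101 mA with 2 mA switch cells and no margin.
    print(min_switch_cells(101.0, 2.0, margin=0.0))
```

Whether those cells sit along an edge or in a checkerboard is a separate decision that this count does not capture.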

Once the power domains are created, a number of issues can occur. “Let’s say the power is turning on from being shut off,” said Ansys’ Gupta. “The power of the device was lower, and now a power domain is turning on. Suddenly, it’s pulling a lot of current from the supply, because all of these devices that are turning on are drawing current from the supply. When those scenarios happen, it can lead to a high surge of current called di/dt. Abstracted up, what happens is the chips are sitting within a package, and the packages are dominated by inductance. So the inductance surges of current combine to create a significant droop in the voltage supply, which then can lead to a timing failure. This is just one example of a scenario that needs to be accounted for in the design. Otherwise, if you had a regular design that did not have power domains, you still would have changes in the amount of current the device needs, depending on the activity of the design. But when there are these large power domains, it’s like a mega button that you’re turning on and off. It leads to a bigger step function. This also means that if there is a big current surge within a block, the voltage must stabilize before the block can be used reliably, which impacts performance when the block is switching on and off.”
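The droop Gupta describes can be approximated with the standard inductor relation. In the sketch below, the symbol names and the numbers are illustrative assumptions, not figures from Ansys: a package inductance of 100 pH and a current step of 1 A over 1 ns.

```latex
\Delta V \approx L_{\text{pkg}} \, \frac{\Delta I}{\Delta t}
\qquad \text{e.g.} \qquad
100\,\text{pH} \times \frac{1\,\text{A}}{1\,\text{ns}} = 0.1\,\text{V}
```

On a supply of roughly 1 V, a droop of that size would be a substantial fraction of the rail, which is why a large domain switching on behaves like a step function for the power delivery network.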


Fig. 1: Power integrity mapping in a complex chip. Source: Ansys

Making tradeoffs
Switching a domain on and off adds latency. That latency has to be managed through the power gating switches, including how they are sized and placed, and all of this must be sorted out during the physical design flow.

“If you are trying to switch on the entire power domain, and the power domain has 100 power switching cells, if all of these cells are switched on at the same time, they start to power up the entire region of the logic,” said Siemens EDA’s Chang. “This leads to inrush current, which means that instantly, all the current is coming into this small power domain. Then, because of IR drop, other logic may get disturbed. The most common method used to protect against this is daisy chaining, which means the power enabling signal is propagated sequentially through the power switching cells so they don’t turn on at the same time. There’s a bit of a gap so that you can keep the peak voltage drop to a minimum. What’s best for a particular design must be determined at the architectural level. Those are the power switching cells. Once you finish the power switching cells, then at the boundary of your power domain, you need to consider two things. One is, what if your power domain is 1.3 volts, but the rest of it is 1.5 volts? Then, to change the power voltage level, you will need to insert a voltage shifter. The 1.5 volts will go to the level shifter, which will convert it into 1.3 volts so that both sides can operate correctly.”
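The effect of daisy chaining on peak inrush current can be sketched with a toy model. It assumes each switch cell contributes a current pulse that starts at a fixed value and decays exponentially, and that switch k turns on k stagger intervals after the first. The function name, pulse shape, and default values are assumptions made for illustration only; real inrush behavior depends on the rail capacitance and switch resistance.

```python
import math


def peak_inrush_ma(num_switches: int,
                   stagger_ns: float,
                   i0_ma: float = 1.0,
                   tau_ns: float = 2.0) -> float:
    """Peak total current for a chain of switches enabled stagger_ns apart.

    Toy model: each switch draws i0_ma * exp(-t / tau_ns) after it turns on.
    Because every pulse only decays, the total is highest at the instant the
    last switch turns on, when pulse j has been decaying for j * stagger_ns.
    """
    if num_switches < 1 or stagger_ns < 0 or tau_ns <= 0:
        raise ValueError("need at least one switch, stagger >= 0, tau > 0")
    return sum(i0_ma * math.exp(-j * stagger_ns / tau_ns)
               for j in range(num_switches))


if __name__ == "__main__":
    simultaneous = peak_inrush_ma(100, stagger_ns=0.0)
    chained = peak_inrush_ma(100, stagger_ns=2.0)
    print(f"all at once: {simultaneous:.2f} mA, daisy chained: {chained:.2f} mA")
```

In this model the price of the lower peak is a longer wake-up time, since the last switch turns on only after the enable signal has rippled through the whole chain.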

When the power domain is shut down, the output signal of the power domain instantly becomes X, meaning it is undecided. It’s not ‘logic on’ or ‘logic off.’

“Basically, you don’t know, and that potentially will create a problem for the design,” Chang said. “Here, an isolation cell is needed, which is part of the power intent specification, part of the UPF. In the UPF, you need to tell the tool which cell, when it is shut down, which signal should be stuck to one, and which should be stuck to zero. Then you can provide this predictable logic level to the rest of the design.”
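The clamping behavior Chang describes can be modeled in a few lines. This is a behavioral sketch only, not UPF syntax: the function name is hypothetical, and `None` stands in for the unknown X value.

```python
from typing import Optional

X = None  # stand-in for an unknown logic value


def isolation_cell(signal: Optional[int],
                   domain_on: bool,
                   clamp_value: int) -> Optional[int]:
    """Pass the signal through while the source domain is powered;
    otherwise drive the clamp value chosen in the power intent."""
    if clamp_value not in (0, 1):
        raise ValueError("clamp_value must be 0 or 1")
    if domain_on:
        return signal
    # Domain is off: its output is X, so the always-on side sees the clamp.
    return clamp_value


if __name__ == "__main__":
    print(isolation_cell(1, domain_on=True, clamp_value=0))   # follows the signal
    print(isolation_cell(X, domain_on=False, clamp_value=0))  # clamped low
    print(isolation_cell(X, domain_on=False, clamp_value=1))  # clamped high
```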

Another important consideration in implementing multiple power domains is the use of always-on buffers and power islands. “The design architect needs to do a lot of experimenting to find the best way to implement these,” Chang said. “There are two ways to do it. One is creating power islands, and the other is to allow an always-on buffer inside the power domain to retain flops. Each approach has pros and cons, so that is something important for the designer to pay attention to. Generally, if the routing resource is limited, then you would probably use a power island. If your timing is more important, then you probably want to use an always-on buffer.”

“Multiple power domains also mean different blocks could be operating at different voltages and can be shut down, as well, such that if certain blocks are inactive in a particular mode of operation, they can be powered down completely,” said Godwin Maben, a Synopsys fellow. “Some of the questions designers and chip architects must ask include whether the voltages are fixed. Also, are these voltages dynamically changed, based on the mode through software and internal voltage regulator (DVFS)? Or are they changed through internal hardware via PVT monitors or functional monitors (AVFS)?”

The methodology that needs to be finalized for all of this takes into account a number of issues, according to Maben. Among them:

  • How to define the sign-off corner for each discrete voltage;
  • Whether there are libraries characterized at these voltages;
  • How many clock cycles there are for voltage change and voltage/PLL locking;
  • Is the design architected from a physical perspective to handle multiple voltage islands;
  • Are level shifters available to accommodate communications between blocks, and
  • How to build a PG grid to accommodate various physical aspects, such as isolation between domains, along with nwell isolation, nwell voltage, etc.
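One of the checks in this list, whether level shifters are needed between communicating blocks, lends itself to a simple early script. The sketch below assumes a hypothetical map of domain names to supply voltages and a list of signal crossings, and flags every crossing whose two sides run at different voltages. The names and data layout are illustrative and do not correspond to any tool's format.

```python
from typing import Dict, List, Tuple


def crossings_needing_level_shifters(
        domain_voltage: Dict[str, float],
        crossings: List[Tuple[str, str]],
        tolerance_v: float = 0.0) -> List[Tuple[str, str]]:
    """Return the (source, destination) crossings whose supply voltages
    differ by more than tolerance_v."""
    return [(src, dst) for src, dst in crossings
            if abs(domain_voltage[src] - domain_voltage[dst]) > tolerance_v]


if __name__ == "__main__":
    # Hypothetical domains, echoing the 1.3 V / 1.5 V example above.
    voltages = {"cpu": 1.3, "always_on": 1.5, "io": 1.5}
    paths = [("cpu", "always_on"), ("always_on", "io"), ("io", "cpu")]
    for src, dst in crossings_needing_level_shifters(voltages, paths):
        print(f"level shifter needed: {src} -> {dst}")
```

For domains whose voltage changes dynamically, the same check would have to be repeated for every legal combination of operating points.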

There are significant considerations to powering down blocks.

“Should you turn off power or ground? Typically, you’d turn off power for better power integrity and wakeup time,” Maben said. “Internally, turning off power/ground is done through NMOS/PMOS switches, or is externally power-gated.”

Here, it is important to determine the wake-up time requirement and whether there are multiple sleep modes, such as hibernate, snore, deep sleep, half asleep, etc. There may be as many as 10 sleep modes. Also, will the nwell/pwell be kept always on? If nwell is turned off, one needs to look at merged/split nwell physical implementation. This requires special cells like insulated buffer/inverter/LS, etc. One also needs to ask how to verify power-up/power-down sequences at RTL/Final GLS. Is LEC good enough, or is GLS good enough for signoff? What’s the offset in terms of leakage due to the added power switch, because power switches are very leaky?

Physically powering up always-on cells is very challenging, Maben noted. The physical methodology that needs to be finalized includes gas stations, AON buffers/inverters, and disjointed AON islands. Minimal IR drop must be ensured, which could include secondary PG grid design and routing. There may be physical handling of wakeup time requirements, whereby power switches are connected in daisy chain, fish bone, high-fanout, and low-latency configurations.

The physical methodology also must include rush-current analysis and methodology, with single, dual, or zero-pin retention registers.

Preeti Jain, product marketing manager for IC Compiler II at Synopsys, noted there are other considerations, as well, and designers should pay attention to design partitioning, critical path timing, multi-corner/multi-mode optimization with voltage scaling, design planning, power planning, and clock tree synthesis.

“Design partitioning can have a significant impact on the overall performance of the design, as new interfaces that are introduced during partitioning may contain isolation cells and level shifters,” Jain said. “By maintaining a strong correlation between the logical hierarchy and power domain structure, the number of signals that cross the power domain interfaces can be minimized. This minimizes the amount of logic to be introduced for maintaining electrical integrity, and as a result, minimizes QoR impact to the design.”

When it comes to critical path timing, investing time upfront to carefully architect power domains helps reduce the impact on physical design and system timing. UPF is a global standard for defining power intent specification. Ensuring that critical path timing is met, especially around power domain interfaces, is key, she said.

Next, designs need to be optimized and analyzed for each multiple corner and multiple mode scenario. “Customers are reporting that identification of worst-case corners is now an important and integral part of their implementation process. Today’s advanced implementation tools use the scenario specifications as constraint corners for optimization. A balanced approach to get good QoR while handling the runtime from multiple scenarios is important when using the MCMM techniques,” Jain said.

Jain noted that an increasing number of users are relying on automated tool flows because efficient SoC design planning is becoming a highly critical step for the implementation of a multiple-power domain design. She said several factors, such as added delay coming from level shifters and isolation cells, increased IR drop in the power mesh, and increased congestion due to switch cells and retention flops, need to be carefully accounted for because they can degrade QoR. Power intent needs to be overlaid on the physical design process to avoid limiting the resulting QoR or placing any restrictions on the placement optimization process.

Multiple power domains require careful and detailed power grid planning because the power grids can become very complex. “A separate power rail structure is required for each power domain,” she said. “These additional power rails introduce different levels of IR drop, putting a limit on the achievable power efficiency. Every power rail also requires on-chip power distribution that costs area and further complicates power planning and physical floor planning. Advanced EDA tools provide automated power network synthesis (PNS) capability for the distribution of power across a design to help with the implementation of a multi-voltage design.”

Additionally, a clock tree in a design plays a significant role in overall power consumption. “Clock gating and minimizing clock tree insertion delays mitigate the effects of clock tree power. Multi-voltage designs pose additional limitations on clock trees to meet both low-power and high-performance requirements. Skew management becomes very difficult if clock path buffering and data path buffering are not well balanced,” she said.

Holistic approach required
Implementing multiple power domains isn’t confined to physical implementation. It needs to be integrated throughout the design process.

“There needs to be a holistic approach,” said Aakash Jani, technical marketing manager at Movellus. “If you don’t take a holistic approach during the architectural phase, it does lead to redundancy in the clock networks. It could lead to redundancy, it could lead to increased power penalties, and it could lead to higher on-chip variation penalties, which gets compensated with slower clock speeds. That, in turn, means less performance or less usable clock period.”

As the architects start to envision their product, and start to decide on using multiple power states, Jani said they must understand how much of the design is going to reside in each of the different power states/power domains. “By understanding the design will spend 95% of the time in one power domain and only 15% in this other, the architect/designers can start to pull the different levers of having always-on blocks versus having replications in the clock trunks and clock gating that. There are different kinds of levers that can be pulled as the power state is defined, and as the clock network is co-designed with that.”
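The kind of residency reasoning Jani describes can be sketched as a weighted average. The example below assumes each power state is described by the fraction of time spent in it and its power draw, and it rejects inputs whose fractions do not add up to 100%. The state names and numbers are illustrative assumptions, not measurements.

```python
from typing import Dict, Tuple


def average_power_mw(states: Dict[str, Tuple[float, float]]) -> float:
    """Residency-weighted average power.

    states maps a state name to (fraction_of_time, power_mw).
    """
    total_residency = sum(fraction for fraction, _ in states.values())
    if abs(total_residency - 1.0) > 1e-9:
        raise ValueError(f"residencies sum to {total_residency}, expected 1.0")
    return sum(fraction * power for fraction, power in states.values())


if __name__ == "__main__":
    # Hypothetical block: active a quarter of the time, asleep otherwise.
    profile = {"active": (0.25, 100.0), "sleep": (0.75, 4.0)}
    print(average_power_mw(profile))
```

Running the same calculation with and without a gated state gives an early, coarse view of whether power gating a block is worth the added clocking and verification cost.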

For example, with different power domains there are different clock domains, he said. “As you have timing signals and data paths, crossing through each of these paths is going to create cross-domain clocking paths, which start to maximize the on-chip variation penalty. And as the on-chip variation increases, this starts to eat up the usable clock cycle and eats away at what can be done for performance, or what you do for dynamic power consumption. On top of that, if you have, say, five power domains, if you don’t co-design this in the architectural phase, the physical design engineer may be limited. And with five power domains, they also may create five clock domains, which creates a lot of redundancy as they start to deliver the clock network to each different power domain.”

The final piece of this is to look at the power penalty, Jani said. “If you figure out where you can use an always-on domain, versus having to power gate it, then depending on the actual power consumption and how much the user is actually spending time in that power state and power domain, you also could start to figure out how to plan the always-on domains and design the clocking structures around that. The whole idea is to take the clock network, and co-design it with the power domain in order to minimize the dynamic power from the clock consumption.”

Early analysis is vital, and UPF is a key element of that.

“UPF has been a great enabler in terms of specifying and allowing a ‘what if’ analysis in early power tools,” said Ansys’ Gupta. “With the UPF constraint file you can specify, ‘I want this block to use this control signal to turn on and turn off.’ So you’re able to create a prototype scenario, simulate that scenario, and come up with the power number that it would be. It gives you a tool to play around with and get an early idea of what the bang for the buck is when implementing all of these power domains.”

There is a lot going on here. Memories, for example, have their own separate power domain. “It’s getting really fine-grained power gating, particularly in the devices,” Gupta said. “A few years back, when it wasn’t as complex, chip architects would use their experience to determine how many power domains should be implemented and the blocks that needed to be power gated. But the complexity has become so great that you cannot do it without automation. There were tools that allowed designers to specify these constraints, but they had their own language. UPF is a universal language that enables designers to do the ‘what if’ prototype analysis up front.”

Frank Schirrmeister, senior group director for solutions and ecosystems at Cadence, agreed. “For chip designers and chip architects, the cost of multiple power domains extends well beyond physical implementation. With often hundreds of domains, teams need to carefully assess UPF verification in simulation and emulation first. The cost of these setups can be substantial. They also need to consider communication aspects between power domains and its verification.”

Conclusion
In combination with the effects teams see in dynamic power analysis, influenced by software controlling the various power domains, chip architects sometimes have to consider the ROI of adding power domains versus the required verification effort as part of balancing the overall power envelope, Schirrmeister said.

Siemens EDA’s Chang noted that the most time-consuming activity happens when the power spec is not complete. “If a mistake is made in the power intent file, at the end of the design it will take much longer to debug.”

At the end of the day, while implementing multiple power domains could be physically challenging, these problems are being addressed. “There are thousands of chips out there with working silicon since 90nm, and there is significantly mature and proven technology to address high-performance/low-power needs for today’s designs,” Maben said.


