SoC complexity is making it more difficult to combine functional performance with demands for lower power.
Designers of large, advanced-node SoCs are grappling with a number of pressures in the quest to achieve optimal performance and power in their designs. It has become a challenging balancing act: using less power, especially for consumer technologies, while delivering the same or greater performance and more functionality.
Power and performance always have been competing demands in chip design. In fact, it’s not unusual to see processors or IP delivered to market with lower power or better performance. But in the past this has been a tradeoff of one against the other, rather than both at the same time. As more features are added to chips, some focused on low power or bursts of power and others on minimum or maximum levels of performance, achieving that balance is getting much more complicated.
There are now multiple voltages on chips and complex power management schemes. Designs increasingly require tradeoffs involving everything from thermal issues to on-chip/off-chip throughput, signal integrity, and various types of noise. Even the package plays a growing and important role, as do the foundry process and the materials used for the substrate and the interconnects. Put simply, there are more constraints, more competing goals, and much more complexity at every level.
In this scenario, keeping track of voltages and making sure everything is working as planned—particularly with more complex and varied use cases—is becoming much more difficult.
“The dilemma for the designer is that to maximize performance, the optimization schemes used today are algorithmically treading an increasingly thin line between robust operation and having failing devices in the field,” explained Stephen Crosher, CEO of Moortec. “This is especially challenging when considering dynamic and static IR drop within complex devices.”
More specifically, from a technical perspective, supply voltages have been coming down faster than threshold voltages, which has led to less supply margin. “With interconnects becoming thinner and closer together, this is pushing up resistance as well as capacitance,” said Oliver King, Moortec’s CTO. “Compounding all of this is the usual increase in gate density seen with moving down the process nodes, which increases power per unit area.”
The effect of these issues is to shrink the margin between a design that works and one that doesn’t. The solution is to push layout further forward in the design flow, particularly for power grids, rather than dealing with it in verification at the tail end of the flow. That also requires more detailed analysis of power consumption and IR drop, he said.
This has driven the need for in-chip voltage supply monitoring to control the chip’s power scheme. “The speed of response for a dynamic voltage scaling (DVS) scheme needs to be such that PMIC control systems can react to supply droops and ‘events’ accordingly without data loss or corruption within the chip,” King said. “Further, there is more need than ever for high-accuracy monitoring to enable fine-tuning of DVS schemes, optimizing power consumption against particular performance profiles. Of course, such monitoring and control schemes must be robust, as overall power control of the chip is at stake.”
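Conceptually, such a loop is a periodic compare-and-adjust step between the on-chip monitor and the PMIC. The sketch below is only an illustration under assumed interfaces: read_supply_mv() and pmic_set_target_mv() are hypothetical placeholders, and the thresholds are made up rather than taken from any particular monitoring IP or PMIC.

# Minimal sketch of a droop-compensation (DVS-style) control step.
# read_supply_mv() and pmic_set_target_mv() are hypothetical hooks.

NOMINAL_MV = 800        # nominal core supply (illustrative)
DROOP_LIMIT_MV = 40     # droop beyond this triggers a corrective step
STEP_MV = 10            # PMIC adjustment granularity

def dvs_control_step(read_supply_mv, pmic_set_target_mv, target_mv):
    """One iteration of a simple droop-compensation loop."""
    measured = read_supply_mv()
    droop = target_mv - measured
    if droop > DROOP_LIMIT_MV:
        # Supply has drooped too far: raise the PMIC target to restore margin.
        target_mv = min(target_mv + STEP_MV, NOMINAL_MV + 50)
        pmic_set_target_mv(target_mv)
    elif droop < -DROOP_LIMIT_MV:
        # Supply is running high: lower the target to save power.
        target_mv = max(target_mv - STEP_MV, NOMINAL_MV - 50)
        pmic_set_target_mv(target_mv)
    return target_mv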
Fig. 1: Tracking noise based on power analysis. Source: NXP/ANSYS
Server voltage
In a server system, voltage monitoring is important for proper functioning—but the level of importance increases with usage and with density. So if a chip is going to be used in a densely packed server rack inside a data center, it may be subject to more extreme operating conditions than a single server sitting on a shelf that is used intermittently.
Here, voltage is monitored to make sure the system voltage requirements remain within spec. For DIMMs, that could be as little as plus or minus 2% from the spec.
“In DDR4, the spec is 1.2V and this cannot vary beyond plus or minus 24 millivolts,” said Victor Cai, director, product marketing, memory and interfaces division at Rambus. “Close monitoring is required to make sure the power system stays within spec. Also, in newer systems, NVDIMM allows backup of DRAM data to flash if it detects a possible power supply failure. So the system monitors the voltage and, in the event of a power failure, either initiates the backup sequence or switches over to backup power.”
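As a simple illustration of that kind of supervision, the following sketch checks a measured rail against the DDR4 window cited above (1.2V, plus or minus 24mV) and triggers a hypothetical NVDIMM-style backup path on an undervoltage event. The start_backup and switch_to_backup_power hooks are placeholders, not any real controller API.

# Illustrative DDR4 VDD window check (1.2 V +/- 24 mV, i.e. +/- 2%).
DDR4_VDD_MV = 1200
TOLERANCE_MV = 24

def vdd_in_spec(measured_mv):
    return abs(measured_mv - DDR4_VDD_MV) <= TOLERANCE_MV

def check_and_protect(measured_mv, start_backup, switch_to_backup_power):
    if measured_mv < DDR4_VDD_MV - TOLERANCE_MV:
        # Supply looks like it is failing: save DRAM contents to flash and
        # move to the backup power source (simplified NVDIMM-style behavior).
        start_backup()
        switch_to_backup_power()
        return False
    return vdd_in_spec(measured_mv)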
Keeping track
Interestingly, the power rail voltage or reference voltage supplied to an analog or digital circuit is a useful and simple indicator of the circuit’s health, according to Stephen Sunter, engineering director for mixed-signal DFT at Mentor, a Siemens Business. The reason is that there are many potential fault types that cause excess supply current to flow, which can decrease the power rail voltage.
There are other DC voltages of significance in circuits as well, such as bias voltages for current mirrors and mid-rail voltages used as analog ground, Sunter said.
Verifying it works
To account for this, there are commercial tools with voltage propagation technology that allow for a variety of voltage-dependent verification applications to be performed with relative ease, said Flint Yoder, technical marketing engineer for Calibre Design Solutions at Mentor.
“A few examples of the analysis that can be done with this technology are general electrical overstress (EOS), voltage-aware design rule checking (VA-DRC), and bias checking,” Yoder said. “These flows all get driven just by specifying the pin voltages on a design.” Certain commercial tools allow this to be done in early design stages on a source schematic, or in the later stages on a layout.
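The idea behind voltage propagation can be pictured as a worklist traversal that pushes the user-specified pin voltages through the design’s connectivity until every net carries a min/max voltage annotation. The toy sketch below is purely conceptual; it is not how Calibre or any other commercial tool implements the technology.

# Conceptual voltage propagation: spread min/max voltages from the pins
# through a simple connectivity graph. Not a real tool flow.
from collections import deque

def propagate_voltages(pin_voltages, connections):
    """pin_voltages: {net: volts} for the specified design pins.
    connections: {net: [nets it can drive or is tied to]}.
    Returns {net: (min_v, max_v)} for every reachable net."""
    annotations = {net: (v, v) for net, v in pin_voltages.items()}
    queue = deque(pin_voltages)
    while queue:
        net = queue.popleft()
        lo, hi = annotations[net]
        for nxt in connections.get(net, []):
            cur = annotations.get(nxt)
            new = (min(lo, cur[0]), max(hi, cur[1])) if cur else (lo, hi)
            if new != cur:
                annotations[nxt] = new
                queue.append(nxt)
    return annotations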
Still, the importance of traditional verification flows like LVS and DRC is well established, and having a failure in these categories can be catastrophic. Failures in verification applications based on monitoring voltage levels can have serious implications, as well.
“When monitoring voltages for EOS verification, it’s important to know that not just any device was used, but that the device has the proper ratings to handle the actual voltages in the design,” Yoder said. “The stress from improperly rated devices might not always cause immediate part failure, but it surely can lower the lifetime and overall quality of a product.”
It’s also important to extend the traditional DRC analysis by taking net voltages into account. “Without monitoring the voltage information it’s easy to be overly pessimistic with spacing requirements. However, when using a tool to include the context of what voltages are on each net, it’s possible to dynamically apply the correct spacing requirements between any two nets. This ensures that nets with a large voltage delta are given adequate spacing, while area is not lost due to stringent spacing constraints on nets with a lower voltage delta,” he added.
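A rough sketch of that idea: scale the required spacing with the worst-case voltage delta between two nets, using the kind of min/max annotations a voltage propagation step would produce. The base spacing and per-volt increment below are illustrative placeholders, not values from any real rule deck.

# Conceptual voltage-aware spacing check (illustrative numbers only).
BASE_SPACE_NM = 50      # minimum spacing at near-zero voltage delta
NM_PER_VOLT = 20        # extra spacing per volt of worst-case delta

def required_spacing_nm(net_a, net_b, annotations):
    """annotations: {net: (min_v, max_v)}, e.g. from voltage propagation."""
    a_lo, a_hi = annotations[net_a]
    b_lo, b_hi = annotations[net_b]
    worst_delta = max(abs(a_hi - b_lo), abs(b_hi - a_lo))
    return BASE_SPACE_NM + NM_PER_VOLT * worst_delta

def spacing_violation(net_a, net_b, actual_spacing_nm, annotations):
    return actual_spacing_nm < required_spacing_nm(net_a, net_b, annotations)

Nets with a large delta, such as an I/O rail routed next to a core net, get the wider spacing, while low-delta neighbors keep the minimum and save area.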
Voltage regulators
The most common voltage monitoring and management technique involves a voltage regulator, which is an increasingly sophisticated device.
“In the old days, those regulators could be a standard chip on the board that would tie to some of the pins of the SoC, then monitor the voltage and the supply, as well as monitor the voltage to the chip,” said Jerry Zhao, product management director at Cadence. “More advanced technologies put those voltage regulators on the same die, basically on the same design. These voltage regulators provide a stable voltage to the power grid.”
Of course the voltage supply is crucial for the chip function and timing, Zhao said. “If timing fails, the chip is going to fail, and the timing is closely related to the voltage supply because it varies. Different locations of the grid, different metals, will have different voltage drop, basically. Those voltages should be stable, and should not drop that much, and the monitoring technology has been developed to look at those to determine where on a grid the voltage drops. For instance, if a voltage drops more than 15%, it feeds back to the voltage regulator to tell the designer it needs to crank more so that the voltage will be sustained. That happens at the chip level.”
Accompanying the move to increasingly smaller technology nodes, the voltage is definitely dropping, Zhao said. “Ten or 20 years ago you could have 1.5 or 2.5 volts. Right now at 7/10nm, the supply voltage is getting down to sub-volts like 0.8 or 0.9 volt. At this level, the noise margin becomes very tight. It used to be, for example, that 100 millivolts was less than 10%. Now 100mV is huge out of 800mV. That’s why voltage monitors are so important, and why it is important that they are well designed so that voltage drops will not be very big. Also, when the voltage does drop, there are certain technologies to sustain the voltage, like decoupling capacitance, or if an analog circuit voltage regulator has been implemented that can somehow provide certain control to that.”
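A back-of-envelope calculation makes the margin point concrete: the same absolute droop consumes a much larger fraction of today’s sub-volt rails than it did of older supplies.

# How much of the supply a fixed 100 mV drop consumes at different rails.
def drop_fraction(drop_mv, supply_mv):
    return drop_mv / supply_mv

for supply_mv in (1500, 1200, 900, 800):
    pct = 100 * drop_fraction(100, supply_mv)
    print(f"100 mV drop on a {supply_mv} mV rail = {pct:.1f}% of the supply")

# Prints roughly 6.7%, 8.3%, 11.1%, and 12.5%, respectively.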
“No matter what, design is complex today because of the load on the grid, and sometimes how to design these voltage regulators is challenging, particularly if multiple regulators are needed. The voltage regulators may need to go in multiple different locations. There may be separate power domain grids so there can be better control of the power supply system,” Zhao explained.
Best practices
Making an SoC design robust against voltage issues is a big-picture design issue, with multiple variables to control. “In terms of the power grid, it’s all related to how much power will be consumed, and where the power will be consumed, and the peak current,” Zhao said. “Those must be considered from the early stage of RTL power analysis. At the architectural level, you’ll want to implement low power design from the outset. When you do the implementation, this is where the physical design comes in. And you need to consider the clock controls, because clocks are the most active signals as well as being power-hungry. How are you going to do the physical design so that the IR drop and power consumption is better distributed? You want to avoid hotspots.”
This is where voltage monitoring comes in, and it is why design architects include multiple voltage regulators on an SoC. That adds another dimension of challenge to the design, including on the power signoff side, because when the power is controlled by the voltage regulator, the power grids must be analyzed concurrently with the regulator itself.
The power grids by themselves are very large, but they’re also linear, because it’s just R and C to be extracted. There is no active device on them. However, the voltage regulator is a very sensitive analog circuit, which requires SPICE-level simulation to analyze. When you do the analysis on the power supply system, all of a sudden it becomes a concurrent analysis problem. It isn’t just a single metal grid, because it has a sensitive analog circuit on it, as well.
“When you do this, you want to analyze it with SPICE-level simulation on the voltage regulator, and you also need to run a very large-scale, digital type of matrix-solving R and C analysis of the grids. That’s the challenge design teams need to consider,” Zhao said.
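As a rough illustration of the linear half of that problem, the toy solve below treats the grid as a resistive network and collapses the regulator into a simple source with an output resistance at its tap point. This is only a sketch under those simplifying assumptions; a real signoff flow would co-simulate the actual regulator at SPICE level against the extracted grid.

# Toy static IR-drop solve: G * V = I nodal analysis of a resistive grid,
# with the regulator folded in as a Norton equivalent at its tap node.
import numpy as np

def solve_grid(G, load_currents, v_reg, r_reg, reg_node):
    """G: n x n conductance (Laplacian) matrix of the grid, in siemens.
    load_currents: per-node load currents (A), v_reg: regulator voltage (V),
    r_reg: regulator output resistance (ohms), reg_node: tap node index."""
    G = np.array(G, dtype=float)
    I = -np.array(load_currents, dtype=float)   # loads pull current out of nodes
    G[reg_node, reg_node] += 1.0 / r_reg        # regulator conductance to ground
    I[reg_node] += v_reg / r_reg                # regulator current injection
    return np.linalg.solve(G, I)

# Two-node example: 0.05-ohm strap between nodes, node 1 draws 1 A,
# 0.8 V regulator with 5 mOhm output resistance taps node 0.
G = [[20.0, -20.0],
     [-20.0, 20.0]]
print(solve_grid(G, [0.0, 1.0], 0.8, 0.005, 0))   # ~[0.795, 0.745] volts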
Voltages can be monitored by an ADC, as well. “For circuits without an ADC, it is common to monitor the output of one voltage reference using the output of a second reference,” said Mentor’s Sunter. “If the first reference is derived from an NPN bipolar transistor’s bandgap voltage, the second reference would be derived from a PNP transistor to reduce the likelihood of a failure common to both bandgap circuits. A basic comparator is commonly used to compare voltages, and its two inputs can be periodically reversed by a clock signal to check that the comparator’s digital output switches synchronously with the clock, to facilitate built-in test of the comparator.”
Alternatively, Moortec’s Crosher said the best way to deal with these complex issues is through finer-grained and more responsive in-chip supply and voltage monitoring. “The designer is able to develop higher-performance optimization schemes, whilst ensuring devices, and hence the product, remain operational within their application.”
Related Stories
Managing Voltage Drop At 10/7nm
Building a power delivery network with the low implementation overhead becomes more problematic at advanced nodes.
Lots Of Little Knobs For Power
A growing list of small changes will be required to keep power, heat, and noise under control at 10/7nm and beyond.
New Power Concerns At 10/7nm
Dynamic, thermal, packaging and electromagnetic effects grow, and so do the interactions between all of them.