Power Delivery Challenged By Data Center Architectures

More powerful servers are required to crunch more data, but getting power to those servers is becoming complicated.

Processor and data center architectures are changing in response to the higher power, and therefore higher voltage, needs of servers running AI and large language models (LLMs).

At one time, servers drew a few hundred watts. But over the past few decades that has changed drastically, due to a massive increase in the amount of data that needs to be processed and user demands to do it more quickly. NVIDIA’s Grace Blackwell chips consume 5 to 6 kilowatts, roughly 10X the power consumption of past servers.

Power is the product of voltage times current. “So if I need 5 kilowatts, I could do it at the standard voltage of 120 volts,” said Steven Woo, fellow and distinguished inventor at Rambus. “But I would need 40 amps of current, which is a lot of current.”

This is akin to the kind of wire you might buy at a hardware store. “There are many different diameters, and the super-high-current wire is really thick,” Woo said. “Back in the day, when everybody was thinking servers were maybe 1 or 2 kilowatts, at 120 volts all you had to do was build a supply that could deliver 10 amps. Now, with the power needs being much higher, if I kept the voltage at 120 volts I’d have to supply four times the current or even more, but the wires can’t handle that much current. They’ll melt.”

If raising the current is not an option, the other choice is to raise the voltage. “The product of the current times the voltage has to equal 5 kilowatts,” Woo noted. “These days 48 volts are coming into the server, whereas it used to be 12 volts. Now that NVIDIA is talking about 48 volts, they’re quadrupling the voltage, which allows them to quadruple the power while keeping the current the same.”
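The arithmetic behind these quotes is easy to check. The sketch below (Python, purely illustrative, using the article’s 5-kilowatt figure) computes the current a supply must deliver at each of the voltage levels mentioned; wire ampacity limits, which determine what current is actually safe, are not modeled.

```python
# Minimal sketch of I = P / V for the voltage levels discussed above.
# The 5 kW figure comes from the article; ampacity is not modeled.

POWER_W = 5_000  # per-server draw cited for Grace Blackwell-class systems

for voltage_v in (120, 12, 48):
    current_a = POWER_W / voltage_v
    print(f"{POWER_W} W at {voltage_v:>3} V -> {current_a:6.1f} A")

# 5000 W at 120 V ->   41.7 A
# 5000 W at  12 V ->  416.7 A  (far beyond what practical wiring can carry)
# 5000 W at  48 V ->  104.2 A  (4x the voltage of 12 V, one quarter the current)
```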

This change is reflected in the power supply. “We are seeing customers building large data centers as they pursue different paths to delivering the power they need to run their rack-based systems,” noted Rod Dudzinski, market development manager, embedded board systems division at Siemens EDA. “Some data center companies are borrowing ideas and concepts from high-performance power modules and related power electronics to accomplish this, spanning everything from efficient power conversion to thermal management to lifetime reliability. And with the projected 50% increase in the power footprint of a traditional data center by 2025, board-level power conversion efficiency and power density are of primary importance for a system architect to consider as a means to reduce losses in the power distribution network (PDN) for each PCB in the system.”

Similar changes are happening in EDA, and there is a parallel between power in the data center and the evolution of design tools, according to Lee Vick, vice president of strategic marketing at Movellus. “In the world of chip design, we went from individually crafting transistors to using EDA tools, but those were a series of different tools — placement tools, timing tools, routing tools. Eventually we had to move to a world where those tools are integrated, where the flows are integrated, and the data is integrated, to meet the performance demands of the modern world. Now even the EDA companies don’t stop at design, because you have to manage the lifecycle of that chip, from design through test and manufacture all the way out into the field, where they’re instrumenting devices and capturing telemetry data to feed back into the design process and improve testing. It’s an entire lifecycle. It’s a fully integrated vertical flow (even though it’s horizontal on the time frame), and that’s critical.”

A similar trend can be applied to power in the data center. “It used to be that when you were designing a chip, you had a power budget,” Vick said. “Or if you’re an engineer, and you were given a block to design, you had a power budget for that particular block and you didn’t dare stray outside of it. But that’s all you needed to care about — inputs and outputs. That’s no longer the case. In the data center, we’re seeing demands move well beyond just subsets or chips, to the board, to the rack, to the data center level. When you’re talking about demands for energy that are meaningful at a global scale, it’s time to take all of that into account.”

The ripple effects are important here, and it’s not just about having to minimize power. “Everyone has to minimize power,” he said. “There are constraints, there are demands, there are changes that are happening, and you have to be able to respond to them. Another critical thing is that we’ve moved well beyond the hypothetical, beyond the hyperbolic of, ‘This is something that’s way off in the future.’ At the most recent DAC, we had a panel, and it was all about managing kilowatt power budgets. We had industry experts from IC design and EDA and IP and system design. All of those pieces come into play. This is not a problem that the IP providers or the chip designers or the EDA companies can solve [individually]. It’s going to take everyone to solve. Similarly, in the data center we have to improve the distribution and cooling, which just adds energy consumption at the macro level. But that’s just exacerbated by the scale of the enormous number of chips and compute elements inside chips and chiplets of a modern data center.”

Ashutosh Srivastava, principal application engineer at Ansys, sees this going both ways. Chip design is causing a surge in power consumption because the latest AI chips (including GPUs) consume more energy for much larger and faster computations, in some cases more than 2 kilowatts per server. “At the same time, chip architects also are looking to design a chip to optimize power consumption without compromising performance, since they will be costlier to operate — not just from the power cost, but the cooling infrastructure.”

In addition, the upstream power distribution in data centers is changing to accommodate the larger power needs, which includes raising the distributed bus voltages in the racks from the old 12V to 48V. “Increasing this voltage by 4X reduces the current by 4X and the conductive losses by 16X,” Srivastava said. “Each converter in the rack is also being redesigned for higher efficiency. The power loss associated with the direct power to the chips is optimized by the converter’s placement. Stacking the power supply for the chip directly on top helps to reduce this power loss.”
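A quick sketch makes the scaling concrete. Conductive loss in a distribution path is I²R, so at a fixed power draw, quadrupling the bus voltage cuts the current by 4X and the resistive loss by 16X. In the illustrative Python below, the power figure matches the article but the conductor resistance is an arbitrary placeholder, not a measured bus-bar value:

```python
# Illustrative check of the 4X-voltage / 16X-loss scaling.
POWER_W = 5_000
PATH_RESISTANCE_OHM = 0.01  # hypothetical distribution-path resistance

for bus_voltage_v in (12, 48):
    current_a = POWER_W / bus_voltage_v
    loss_w = current_a ** 2 * PATH_RESISTANCE_OHM
    print(f"{bus_voltage_v:>2} V bus: I = {current_a:6.1f} A, "
          f"I^2*R loss = {loss_w:7.1f} W")

# 12 V bus: I =  416.7 A, I^2*R loss =  1736.1 W
# 48 V bus: I =  104.2 A, I^2*R loss =   108.5 W  (a 16X reduction)
```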

New data center considerations
Another important consideration in a data center’s design is its location. “Usually these are urban areas, where competing with the power requirements of the population can limit a data center’s capacity,” Srivastava said. “Because of this, some areas have prohibited the building of new data centers, and if push comes to shove, the data centers will be required to reduce their power loading to allow power for other vital areas in the communities. This means either creating compute hardware that is energy-efficient, or looking for an alternative power supply. That is leading to another trend, in which large data centers are now considering building their own power plants to provide the required power, especially from sustainable and reliable sources. This could take the shape of traditional solar or wind with storage, or even small modular nuclear reactors (SMRs), which are still under development.”

Managing power in data centers is an evolving challenge. “IT loads can fluctuate considerably throughout the day, influenced by the demands of various applications,” said Mark Fenton, product engineering director at Cadence. “A cabinet’s power is a complex set of changing variables — its current power usage, budgeted capacity for future projects, and maximum design limits. In turn, power distribution and capacity can be shared among multiple data centers.”

In a co-location environment, for example, users continuously adjust their demands on a shared system, with little insight into what IT is installed or coming. “New GPU workloads exhibit distinct power behaviors, often resulting in substantial and nearly instantaneous power spikes,” Fenton said. “These fluctuations pose a significant risk of failure for data center power infrastructure, which is a major concern. To optimize efficiency and maximize available power, utilizing three-phase power is beneficial. But it’s crucial to balance the phases to prevent inefficiencies.”
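As a rough illustration of the balancing problem Fenton describes, the sketch below checks per-phase loading on a hypothetical three-phase feed. The load values and the 5% alarm threshold are invented for the example, not drawn from any real facility:

```python
# Hypothetical per-phase loads (kW) on a three-phase rack feed.
phase_loads_kw = {"L1": 32.0, "L2": 28.5, "L3": 35.5}

avg_kw = sum(phase_loads_kw.values()) / len(phase_loads_kw)
worst_dev_kw = max(abs(load - avg_kw) for load in phase_loads_kw.values())
imbalance_pct = 100 * worst_dev_kw / avg_kw

print(f"average per-phase load: {avg_kw:.1f} kW")
print(f"worst-case imbalance:   {imbalance_pct:.1f}%")
if imbalance_pct > 5:  # arbitrary threshold chosen for the example
    print("rebalance loads across phases")
```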

Power losses in voltage conversion
Converting voltages in data centers involves multiple stages of transformation and conditioning, which can lead to significant power losses. “If I’ve now got 48 volts coming into my server, the problem is the chips themselves still want to operate on 12 volts or 5 volts, or even 1 volt,” said Rambus’ Woo. “This means the voltage has to be stepped down. But every time you step down you lose some power, so the efficiency starts to go down. This is because it takes power to convert voltage levels, so that’s a big concern. There’s a lot of power being spent converting different voltages.”

This means data center infrastructure must convert building utility supplies into single- or three-phase power supplies at the rack level. “The voltage may reduce from 13.8 kV (medium voltage) to either 480 V or 208 V (low voltage), and subsequently down to 240 V or 120 V,” Fenton said. “Inefficiencies at partial load tend to be greater, and since most power is supplied with 2N redundancy, a significant portion of the system operates under these partial load conditions.”
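The partial-load effect is worth spelling out. With 2N redundancy, each of the two power paths carries at most half the load, which pushes converters into the less efficient region of their curve. The efficiency curve below is invented purely to illustrate the shape of the problem; it is not taken from any product datasheet:

```python
# Toy UPS efficiency model: peaks near full load, sags at light load.
# The coefficients are illustrative assumptions, not vendor data.

def ups_efficiency(load_fraction: float) -> float:
    return 0.95 - 0.10 * (1.0 - load_fraction) ** 2

for load in (1.0, 0.5, 0.25):
    print(f"load {load:4.0%}: efficiency {ups_efficiency(load):.1%}")

# load 100%: efficiency 95.0%
# load  50%: efficiency 92.5%  <- typical operating point under 2N redundancy
# load  25%: efficiency 89.4%
```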

Steve Chwirka, senior application engineer at Ansys, noted that the losses begin with the large transformers that step the utility power down to the 480V AC level. “This new lower AC voltage is distributed through multiple types of cabling and through PDUs (power distribution units), which are basically very large bus bars. All of this contributes to conductive losses in the system. There are several levels of power conversion that have associated power losses, as well. These include the uninterruptible power supply (UPS), which supplies power to the racks under fault conditions just long enough for backup generators to kick in. The major conversions happen at the racks, where the AC voltage is converted to a high-voltage DC, then to lower DC voltages via a power supply unit (PSU). This DC voltage then goes through a couple more levels of conversion until it gets to the actual chips.”

At each level, the amount of power lost is different. Going from utility input to chip, Chwirka gave some estimates. “Power transformers are very efficient machines, with only 1% to 2% loss. UPS systems can vary in efficiency based on their design and load conditions. Online UPS systems, which provide the highest level of protection, typically have efficiencies between 90% and 95%. Consequently, they can lose between 5% and 10% of the power they handle. PDUs also have some inherent losses. These can account for about 1% to 2% additional losses. Modern PSUs typically have efficiencies ranging from 80% to 95%. This means that 5% to 20% of the power can be lost during conversion from AC to DC. Additional converters, sometimes called intermediate bus converters (IBCs), convert the rack’s 48V DC down to 8 to 12V DC, and can have very high efficiencies, in the range of 98%. Final conversion to the low voltage required by the chips is somewhat less efficient than the IBC due to size limitations.”
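Chaining these estimates together shows why the losses add up. The sketch below multiplies a midpoint efficiency from each range Chwirka quotes; the final chip-level conversion stage is an assumed figure, since he gives no exact number beyond “somewhat less efficient than the IBC”:

```python
# Rough end-to-end tally of the per-stage losses described above.
# Midpoint efficiencies are taken from the quoted ranges; the final
# stage is an assumption.

stages = [
    ("transformer", 0.985),         # 1% to 2% loss
    ("online UPS", 0.925),          # 90% to 95% efficient
    ("PDU", 0.985),                 # 1% to 2% loss
    ("PSU (AC to DC)", 0.875),      # 80% to 95% efficient
    ("IBC (48 V to ~12 V)", 0.98),  # ~98% efficient
    ("final chip-level stage", 0.92),  # assumed value
]

overall = 1.0
for name, efficiency in stages:
    overall *= efficiency
    print(f"after {name:22s}: {overall:.1%} of utility power remains")

# Ends near 71%, i.e. roughly 30% of utility power is lost before the chips.
```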

What to know about power delivery
With so many considerations to take into account when designing for data center environments, one of the most important things to watch is the infrastructure around higher voltages. “If higher voltages are coming into the system, you’re looking at how that voltage is going to get stepped down to whatever it is you need,” Woo noted. “It’s probably going to be some external circuitry that’s doing the step-down. There are on-die ways to do some voltage management across small ranges of voltages. The most important thing is to really understand how much power your chip is going to consume, and where that power is coming from. This is typically more of a system-level problem. There are also questions about aging, because when the chip heats up it expands. The different materials that make up a chip all expand at different rates, and that can lead to cracking and other reliability issues if you’re thermal cycling, which means going between a high temperature and a low temperature very frequently.”

There are also architectural impacts. Norman Chang, a fellow at Ansys, explained that with 3D-ICs getting larger and larger, chip architects need to consider distributing the power supply vertically to the chiplets, as in the design of the power supply system in the Tesla D1 Dojo chip. “Architects also need to consider the thermal profile, given the placement of tens of chiplets in 3D-ICs, through system technology co-optimization,” he said. “Analog/mixed-signal designs in the 3D-IC need to be placed in locations that are less sensitive to thermal/stress variations from peak computational workloads.”

In the end, the challenges with data center power delivery will fall into the purview of chip and system architects. “As a computer architect by background, I was very digital-, very processor-focused,” Movellus’ Vick said. “Then I started working for hard IP companies, and they would say things like, ‘How many bumps do you have for supplying power?’ And I would say, ‘I don’t know. Power is just there. It’s always clean, and you don’t have to worry about it.’ But things like implementation and integration matter — how clean your supply is, and how you route power. One of the things that we’re seeing at the architectural level is that when you’re integrating analog portions of circuitry, be it power regulation or sensors or clocking, the simple fact that you have to run analog voltages into areas that are traditionally digital can wreak havoc with your design. Let’s say I have a large block of digital logic that’s consuming a lot of energy. I want to see what’s happening on the grid. I want to see if I’m experiencing droops. But you want me to jam an analog sensor right in the middle of all this digital sea of gates. That’s very difficult to do.”

Migrating analog functions to digital implementations gives designers more freedom to add instrumentation and understand what’s happening. “That’s an example of something that is beyond the functionality of your block,” Vick said. “Sure, it has a lot to do with the implementation, so we are moving from the esoteric into the real world, and real-world implementation matters. It’s not about whether I can architect this thing, or whether I can get the best TOPS/watt number. Can I physically implement it in a real design? Can I deal with noisy power supplies? Can I deal with a power grid that is no longer designed to be robust enough to take anything I throw at it? Because if you design it that way, you’re not going to be competitive. The amount of margin and over-design required says I can’t afford to design that way anymore, which means that now my power grid itself is under the same kind of design constraints as my logic. It’s riding that ragged edge, and there are going to be times when it has excursions and struggles, and I have to think about that from a hardware and software perspective, as opposed to assuming infinite amounts of clean power.”

Related Reading
New Approaches Needed For Power Management
Limited power budgets, thermal issues, and increased demand to process more data faster are driving some novel solutions.
Data Center Thermal Management Improves
CFD, multiphysics, and digital twins play increasing role in addressing heat within and between server racks.


