Efficiency is improving significantly, but the amount of data is growing faster.
Data centers have become significant consumers of energy. In order to deal with the proliferation of data centers and the servers within them, there is a big push to reduce the energy consumption of all data center components.
With all that effort, will data center power really come down? The answer is no, despite huge improvements in energy efficiency.
“Keeping data center power consumption flat in the 2020s will be more difficult than ever before,” said Dermot O’Driscoll, vice president of product solutions for the infrastructure line of business at Arm. “Some predictions have stated data center power consumption could grow by 2X to 7X without substantial innovation.”
Workloads will be increasing dramatically, driven by new infrastructure and overall global development. “The demand for digital services is being fueled by AI, 5G, and IoT, all while emerging economies are building out their digital infrastructure to help close the digital divide,” O’Driscoll said.
The expected efficiency gains are likely to keep power from increasing as quickly as the amount of work being done within data centers. That means more efficient processing and power that stays under control, but not an outright reduction. That raises the question of where the additional energy will come from, and whether it can be de-carbonized.
This all comes in the context of dramatically increasing data. “Half of the data that’s been created in the world has been created in the past two years,” said Rich Goldman, director in the Electronics and Semiconductors business unit at Ansys. Presumably, we’re gathering all of that data so that we can do some work with it.
There is an enormous amount of effort going into ways to reduce the energy consumed by data centers. That puzzle has many pieces, and all of them are getting attention. Roughly speaking, the most prominent contributors include the servers themselves, the interconnect within and between data centers, and the cooling required to keep things from overheating.
Exactly how much each of these pieces contributes depends on the specific workloads being performed. Machine-learning training and bitcoin mining are two examples of particularly energy-intensive computing.
Given that efforts are underway to reduce all of these over the long run, one would think that the ongoing increases in data-center energy consumption could be arrested and then brought down. In the aggregate, that’s not likely to happen.
Improving energy efficiency
All of the energy-saving efforts serve one purpose — do more work for a given amount of energy. That’s obvious for computing. As long as the amount of work done for a unit of energy goes up, the efficiency goes up.
Computing efficiency definitely has increased. The challenge is that even though we can do the same work for less energy now, the net amount of work to be done is growing tremendously.
But the efficiency improvements implemented so far have been very effective. While there are opinions suggesting that data center energy consumption is increasing at an exponential rate, that may not be the case.
“Between 2010 and 2018, data center workloads increased more than sixfold,” said Frank Schirrmeister, senior group director, solutions and ecosystem at Cadence. “Internet traffic increased tenfold, and storage capacity rose by 25 times, while data-center energy consumption changes were minor, growing only 6% to 205 TWh.”
Overall, roughly 2% of global electricity goes to data centers. “Total data center electricity use in 2020 is about 300 to 440 TWh,” said Priyank Shukla, staff product marketing manager at Synopsys. “That could power the whole country of Iran, and it does not include bitcoin mining.”
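As a rough sanity check on those figures (a sketch, not anything from the sources quoted here), annual consumption in TWh can be converted into an average continuous power draw and compared against total global electricity generation, assuming roughly 26,000 TWh generated worldwide in 2020.

```python
# Rough sanity check: convert annual data-center consumption (TWh) into an
# average power draw and a share of global electricity. The 26,000 TWh global
# generation figure is an illustrative assumption, not from the sources above.
HOURS_PER_YEAR = 8760
GLOBAL_ELECTRICITY_TWH = 26_000  # assumed global generation for 2020 (approx.)

for annual_twh in (300, 440):  # range quoted by Shukla
    avg_power_gw = annual_twh * 1e12 / HOURS_PER_YEAR / 1e9  # TWh -> Wh -> W -> GW
    share = annual_twh / GLOBAL_ELECTRICITY_TWH
    print(f"{annual_twh} TWh/yr ≈ {avg_power_gw:.0f} GW average draw, "
          f"about {share:.1%} of assumed global electricity")
```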
Schirrmeister said energy consumption needs to be viewed in context. “We actually need a different metric to define what we’re talking about, because the growth of communication, compute, and storage is much steeper than that of energy consumption.”
One metric used today is the power usage effectiveness (PUE), a gross measure of how much energy goes to overhead. It’s the total energy used divided by the amount used for computing. An ideal value of 1 means no overhead at all.
“Today, data centers are running at 1.08 to 1.12 PUE, meaning the power burned in doing data processing is just 8% less than the total power burned,” said Shukla.
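As a minimal illustration of the metric, a PUE of 1.08 to 1.12 implies the IT equipment accounts for roughly 89% to 93% of everything drawn from the grid. The facility and IT figures in the sketch below are made-up values chosen only to show the arithmetic.

```python
# PUE = total facility energy / IT (computing) energy. The facility and IT
# figures below are hypothetical values used only to illustrate the arithmetic.
def pue(total_kwh: float, it_kwh: float) -> float:
    return total_kwh / it_kwh

for total, it in ((1080, 1000), (1120, 1000)):  # kWh, hypothetical
    p = pue(total, it)
    overhead_share = 1 - it / total
    print(f"PUE {p:.2f}: IT load is {it / total:.1%} of total; "
          f"overhead (cooling, distribution losses, lighting) is {overhead_share:.1%}")
```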
Exactly what gets counted as overhead is somewhat murky. One element it covers is any loss in the power distribution system between the wires entering the building and the electricity flowing into a server.
“What it translates to us is we have to build more efficient power supplies,” said Rakesh Renganathan, director of marketing, power management, power and sensors solutions business unit at Infineon. “There is still innovation that can happen to get more efficient power closer to the processor.”
And yet, as more power is delivered to each server, we can’t increase the size of the power infrastructure. “Power supplies in data centers continue to push for higher power density, meaning more power in the same physical size,” said Anuraag Mohan, vice president of systems and applications engineering at Crocus. Higher-performance sensors are needed to manage the higher density and to ensure more efficient use of energy.
Cooling and lighting are also included as PUE overhead, although the cooling portion includes only overall building cooling, not server fans.
Interconnect
When measuring real work done, it’s computing that should dominate. Everything else happens in the service of computing. And one of the major power contributors is the cost of moving data around.
“Interconnect takes about 27% of total power, and processing takes about 20%,” said Shukla.
Others point to similar trends. “What we’re watching happen with this big power budget is more and more that goes to moving data all over the place,” observed Steven Woo, fellow and distinguished inventor at Rambus. “On-chip movement of data, which people don’t always think about, is consuming more power, as well.”
One way to reduce this power is to use networking switches with more ports — so-called higher-radix switches. That allows more switching within a single box, reducing the number of hops required to reach a more distant destination. Newer switches with 256 ports should help with this.
Fig. 1: Networking in the data center. By increasing the number of ports in switches, more connections can be made within a single box, meaning fewer hops up to the spine and back down again. Source: Synopsys
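To see why radix matters, a standard three-tier fat tree is a useful model (an assumption here for illustration, since no specific topology is named): a fabric built from k-port switches can host k³/4 servers, with any two servers at most five switch hops apart, and a higher radix pulls far more destinations within one or three hops.

```python
# Sketch of how switch radix affects reach in a standard three-tier fat tree.
# The fat-tree model is an illustrative assumption; real fabrics vary.
def fat_tree_reach(k: int) -> dict:
    """For k-port switches: hosts reachable within 1, 3, and 5 switch hops."""
    return {
        "ports": k,
        "same edge switch (1 hop)": k // 2,      # hosts sharing one edge switch
        "same pod (3 hops)": (k // 2) ** 2,      # hosts within one pod
        "whole fabric (5 hops)": k ** 3 // 4,    # total hosts the fabric supports
    }

for radix in (64, 128, 256):
    print(fat_tree_reach(radix))
```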
With all of these connections, SerDes power in particular has grown many times over. SerDes are part of two fundamental interconnect technologies. First, they’re intrinsic to PCIe. The physical wires that carry PCIe traffic use SerDes technology, and that aspect can benefit from the latest generation of PCIe. “If you move to PCIe 6, the industry expects you will save 30% of your power,” said Shukla.
SerDes technology is also needed for optical connections when using pluggable modules. These modules plug into the outside of a server, and the optical signals are converted to electrical signals there for final delivery to the CPU. That electrical “last foot” uses SerDes technology.
One way of reducing interconnect power is to move from PCIe electrical interconnect to more optical interconnect. While optical is typical for long-haul and is also growing for medium distances – such as between data centers or within a campus – it’s not yet being leveraged within buildings for relatively short distances.
But saving power that way may take more steps. The move to coherent modulation may counteract the electrical power savings because of the additional digital signal processing required. While power goes down for longer connections, it’s not yet clear whether the same holds for short hops.
In addition, power could be further reduced by running fiber all the way into the server using co-packaged optics, eliminating the SerDes leg of the connection.
Cooling
The other major power overhead in a data center is cooling. For bookkeeping purposes, cooling is split into two buckets. Facilities cooling, such as overall building air conditioning, goes into the non-computing portion of the PUE. Servers, however, sit in the computing portion, and they have local fans or other infrastructure for cooling.
Cooling accounts for a huge portion of power, but it’s for something that does no work. Cooling simply keeps the computing hardware from burning up. “The percentage of all data center power that goes to cooling can vary, but I would say the average percentage is in the 40% to 45% range,” said Larry Kosch, director of product marketing at GRC.
In general, it’s easier to cool more widely distributed power than it is to cool a single blazing-hot chip. “It’s one thing to try to cool a 200-watt chip where the power is concentrated in a very small area,” said Woo. “It’s another thing if you could somehow distribute those 200 watts onto multiple chips. By distributing it, that makes it an easier problem from a cooling standpoint because the power density is much lower.”
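Woo’s point can be shown with simple arithmetic. Assuming a roughly 4 cm² die (an illustrative size, not from the article), 200 watts on one chip versus the same 200 watts spread across four such chips cuts the heat flux per unit of silicon area by 4X, even though the total heat is unchanged.

```python
# Illustrative power-density arithmetic for Woo's example. The 4 cm^2 die
# area is an assumed, representative value, not from the article.
TOTAL_POWER_W = 200
DIE_AREA_CM2 = 4.0

for num_chips in (1, 2, 4):
    density = TOTAL_POWER_W / (num_chips * DIE_AREA_CM2)  # W per cm^2 of silicon
    print(f"{num_chips} chip(s): {density:.1f} W/cm^2 to remove from each die")
```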
Several ideas are playing out for reducing cooling, some of which involve liquid. “There are more chips now that are using liquid-based cooling, where they have a heatsink, for example, that’s touching the silicon, and that heatsink is hollow,” Woo continued. “What they do is to flow a liquid through that heatsink, and that will wick away the heat.” This requires additional infrastructure within a rack to deliver and recycle that liquid.
“There’s also more interest in what’s called immersion cooling, where you literally take your electronic boards, and you submerge them in an electrically inert fluid,” he added.
Liquids can be much more effective at moving heat. “Compute density per rack (100+kW per rack) with immersion easily exceeds that of the air-cooled racks (~15kW per rack) due to its ability to cool the information technology equipment (ITE) with high efficiency and minimal or no water losses,” said Kosch.
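Using Kosch’s per-rack figures, the difference shows up directly in floor space. For a hypothetical 9 MW of IT load, air-cooled racks at roughly 15 kW each would need about 600 racks, while 100 kW immersion racks would need 90.

```python
# Rack-count comparison using the per-rack densities quoted by Kosch.
# The 9 MW total IT load is a hypothetical figure for illustration.
import math

IT_LOAD_KW = 9_000
for label, kw_per_rack in (("air-cooled", 15), ("immersion-cooled", 100)):
    racks = math.ceil(IT_LOAD_KW / kw_per_rack)
    print(f"{label}: {racks} racks for {IT_LOAD_KW / 1000:.0f} MW of IT load")
```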
One immersion approach is referred to as single-phase immersion, relying solely on the liquid phase of the coolant. That coolant is circulated to facilitate cooling.
Another approach is called two-phase immersion, so named because the cooling happens both in the liquid and vapor phases of the coolant. Servers are immersed in the liquid, but the liquid boils at the surface of the chips, with the rising vapor acting as the heat transfer mechanism. The natural convection process is claimed to reduce the need for pumps and other infrastructure.
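The appeal of the phase change is how much heat the boiling itself carries away. Assuming a latent heat of vaporization around 100 kJ/kg, a representative value for engineered dielectric fluids (an assumed figure, not from the article), removing 100 kW of heat means boiling off roughly 1 kg of coolant per second, which then condenses and returns to the tank.

```python
# Back-of-the-envelope heat removal via boiling in two-phase immersion.
# The latent heat is an assumed, representative figure for engineered
# dielectric coolants; actual fluids vary.
LATENT_HEAT_J_PER_KG = 100_000   # ~100 kJ/kg, assumed
HEAT_LOAD_W = 100_000            # 100 kW rack, hypothetical

boil_rate_kg_s = HEAT_LOAD_W / LATENT_HEAT_J_PER_KG
print(f"Coolant boiled (and recondensed) per second: {boil_rate_kg_s:.1f} kg")
```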
It’s counterintuitive to surround electronics with liquid, but if the liquid is a dielectric, there will be no shorting. Given the right chemistry, there should be no corrosion or degradation, either.
“Although it is possible to immerse ITE designed to be air-cooled (with minor modifications such as turning off fans, adjusting firmware), the rapidly increasing rate of adoption has generated enough demand where ITE OEMs have already brought immersion-ready servers to the marketplace,” said Kosch. “It is well known that Microsoft has adopted two-phase immersion cooling, and Amazon is relying on direct-to-chip cooling. There is a place for single-phase immersion cooling in this market.”
Fig. 2: Single-phase immersion cooling racks. Server blades are inserted vertically into a dielectric coolant. Here they’re shown with mobile data-center racks. Source: GRC
More prosaically, location also matters when optimizing cooling. Facebook located one data center in Prineville, Oregon, partly because of the cooler climate. “They could duct in outside air that was at a lower temperature, which made it easier for them to cool because they didn’t have to chill the air as much,” said Woo.
At the extreme, Microsoft had a demonstrator project named Natick, in which servers were placed in a hermetically sealed container that was submerged in the ocean for two years. “They submerged a small data center in the ocean to have a very well thermally controlled environment,” explained Woo. “You’re surrounded by constant-temperature water that has a tremendous capacity for wicking away heat.”
Not only did the servers survive, but the more effective cooling resulted in higher reliability — one eighth the server failure rate of equivalent land-based servers.
More efficient computing
While part of the goal of these efforts is to allocate a bigger share of the energy used to actual computing, that’s not really the end game. The real goal is to do more work, and that requires not just more energy for computing, but also more efficient computing.
This can happen with many little improvements here and there, but there are three larger-scale improvements that will have more impact.
One is simply the ability to invent new CPU architectures that are inherently more efficient. This isn’t just about lower-power circuit design, but rather rethinking how the computing is done – and how data can be moved less often as that computing is performed.
“There are new processor architectures playing out now that will save processing power,” said Shukla.
Machine learning in particular is rapidly becoming a big energy user. But there are two aspects to machine learning. The one that will occur most often is inference – using an already-trained model to make predictions on new data. But this can be done quite efficiently – so much so that it is being done at the edge, even at microwatt levels for some applications.
Analog circuitry is helping some of these architectures, using ideas like in-memory compute. “Multiply and accumulate can be done in analog,” said Shukla, alluding to the way memories can be used for analog computation.
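A minimal numerical sketch of that idea, under the common in-memory-compute assumption that weights are stored as cell conductances and inputs are applied as voltages: each bit line sums its cell currents, so the measured current is the multiply-accumulate result, computed without moving the weights at all. The values below are arbitrary and purely illustrative.

```python
# Minimal numerical sketch of an analog in-memory multiply-accumulate:
# weights live in the array as cell conductances (G, in siemens), inputs are
# applied as word-line voltages (V, in volts), and each bit line sums its
# cell currents, so I = G @ V is the MAC result. Values are arbitrary.
import numpy as np

G = np.array([[1.0e-6, 2.0e-6, 0.5e-6],   # one row of conductances per bit line
              [3.0e-6, 1.5e-6, 2.5e-6]])
V = np.array([0.2, 0.4, 0.1])             # input voltages

bit_line_currents = G @ V                 # Ohm's law per cell, Kirchhoff sum per line
print(bit_line_currents)                  # [1.05e-06 1.45e-06] amps
```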
The real machine-learning energy hog is training, and that is done almost exclusively in data centers. Continued effort is needed to reduce the energy consumed when training new machine-learning models.
Finally, farther in the future, photonic computing holds promise. The computing done by photonic “circuits” is nominally lossless. The energy expended is largely in the generation of the laser light. What happens to the light can be managed at little energy cost.
“Lightmatter is claiming a 30X power reduction,” said Shukla, referring to one of the companies developing photonic computing.
Coupled with possible architecture changes for photonics as well as the increased use of photonics for interconnect, this also can help to bring energy usage down per unit of computing work done.
Practical energy limitations
While it’s easy to paint a picture of endlessly increasing energy usage, there are practical limits to that growth for a given data center.
One aspect is regulatory. When a new data center is permitted, it is allocated a certain amount of energy from the grid – in accordance with the energy available to feed both the data center and the other needs of the municipality.
“Municipalities or places that you’re willing to put a data center are obviously letting the data centers be bigger and bigger,” said Woo. “But there’s got to be some limit, and that limit is going to be set by the government of the place that you’re going to be putting the data center.”
The other limitation is the capacity that’s built into the building when constructed. Forward-thinking architects will always provision for growth – even in the expectation that the permitted energy usage might increase. But once that building is complete, increasing the power distribution capacity may not be trivial.
That places something of a limit on the energy that can be consumed by an existing data center. Further growth then comes not so much from individual buildings using more energy as from the construction of new data centers.
Sources of energy
While we’re getting more bang for our energy buck these days, the fact remains that data-center power is still rising – just at a more measured pace. And given the amount of additional data processing expected over the next decade, no one believes net energy consumption actually will come down.
As more and more devices connect to the internet – and, in particular, as electrified and autonomous vehicles connect – data transport is expected to grow dramatically. At the very least, that means moving and storing all of that data.
But that data is of no value if it undergoes no computing. Analytics and other operations, many involving machine learning, will then have that much more data on which to operate. This means that the total amount of work to be done will grow dramatically, likely outpacing gains in efficiency.
Where will this energy come from? There are two challenges. The first is generating more energy – preferably near the data centers. The second is that this demand arrives at a time when we’re also trying to de-carbonize energy generation, relying on renewable sources and weaning ourselves off of fossil fuels.
That is a far larger challenge, and exactly how we accomplish it is not yet clear. While the data center share of global power usage remains below 10%, it is a visible portion. And unlike many other segments, which may indeed reduce their net energy consumption, data centers will keep growing, so their share of the total will keep rising.
“People are ready to do things like placing their data centers close to hydro and renewable energy, where they can get a lot of energy from something that’s close by,” said Goldman.
New solutions are likely to unfold as more focus is devoted to this aspect of the data-center power story. “Do I have a clear path on how we de-carbonize?” asked Schirrmeister. “Not yet, but smart engineers will figure this one out.”
Conclusion
Ultimately, there’s so much work to be done that there is no indication that energy usage will come down any time in the foreseeable future. We may find friendlier sources of that energy, but the appetite will not be easily appeased.
Given a choice between lowering energy usage overall and doing more work with it, the choice will be obvious. It’s the same choice that chip designers make every day. “With each new process, we have a choice — make it more powerful or decrease power for the same capabilities,” said Goldman. “We never go for decreased power. We go for more powerful capabilities. And with the data centers, we’re seeing something analogous to that.”
Woo summarized the bottom line for semiconductor developers: “The actionable item for people like us in the semiconductor industry is to make individual computation more power efficient – no matter what.” Data center designers will then do as much work as possible with it.