There are numerous ways to remove heat from chips, and more are on the way.
All electronics generate heat, and that heat must be removed to ensure those electronics don’t overheat. Moving air has been the predominant approach for decades, with liquid cooling limited to particularly intense computing workloads, largely in the supercomputing domain.
With the rise in AI, data-center power density has grown to the point where liquid cooling is now seeing a larger buildout. Single-phase cooling predominates currently, but two-phase and immersion cooling are also growing as options.
This is a relatively recent phenomenon brought about by the sudden uptick in power density due to AI computing. “Computer processors in the 2000s or 2010s were at a couple of hundred watts,” noted Bernie Malouin, founder and CEO at JetCool. “The overall power level largely stayed the same until the last couple of years.”
“It used to be an exception to have water cooling,” said Marc Swinnen, director of product marketing at Synopsys. “It was an extreme thing. But now I’m surprised at how standard it is. Almost every high-performance system has water cooling.”
Liquid cooling comes in many forms, and there’s no single best solution. Developers can’t just stipulate liquid cooling. They must determine what type of cooling is best. That decision has a strong impact on data center infrastructure, and mixing and matching different cooling approaches isn’t a practical option.
“The supercomputing industry has been a liquid-cooling pioneer, where you’ll have a metal plate that will sit on top of your processor and, in many cases, the memory as well if it’s HBM or something like that,” said Steven Woo, fellow and distinguished inventor at Rambus. “It’s hollow on the inside, and there are rubber tubes that feed in and out. There’s an intake and an outlet, and you’re flowing liquid in a continuous circuit. Now people are laying the groundwork for doing immersion.”
Any cooling approach must be identified early, when the architecture, performance, and power are undergoing early simulation. “You must focus at the architecture level, at the beginning itself, to determine the power numbers, the heat flux, and the cooling methodology,” said Satya Karimajji, senior engineer for SoC engineering at Synopsys.
To be clear, although power density is driving the move to liquid cooling, it’s not about reducing power. “It’s about trying to get more signals in the same footprint more than reducing the power of the data center,” observed Rob Kruger, product management director at Synopsys.
The problem with air cooling
Most data centers and other computing sites rely on air cooling. Air enters the building, is chilled, and is then blown under the raised floor, from where it keeps the room cool. Individual server fans push that air across the chips, where it heats up before being blown back out into the atmosphere.
That process does involve a liquid, the refrigerant used to chill the air, but it’s far from the chips being cooled. It’s also an open-loop system that brings new air in and blows heated air out. It works, but only to a point. Above a certain level of heating, it can’t remove the heat fast enough without pushing the fans harder than is practical, resulting in unsafe noise levels and other issues.
“With air, you have fans to blow the air around, and the faster the air flows, the more effectively heat is taken away,” said Robin Bornoff, innovation roadmap manager at Siemens EDA. “But there is a limit. The bigger the fans are, the bigger the server must be, reducing compute density.”
Water is a more effective coolant than air, although it’s not typically employed alone. “It’s around 1,000× denser than air,” said Bornoff. “Its thermal conductivity is 20× greater than that of air. You can extract so much more heat using water compared to air.”
That results in a greater ability to pull heat away, and pumping the liquid to a heat exchanger moves the heat away from the circuits. Eventually, that heat warms up the air, but that happens somewhere else, away from the server racks.
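To put rough numbers on that advantage, the sketch below estimates how much air versus water must flow past a heat source to carry away the same power. It uses the standard relation that power equals density times volumetric flow times specific heat times temperature rise. The fluid properties are approximate room-temperature textbook values, and the 1,000 W load and 10 K coolant temperature rise are assumptions chosen purely for illustration, not figures from the sources quoted here.

```python
# Approximate room-temperature properties (textbook values, not from the article).
AIR = {"density": 1.18, "cp": 1005.0}     # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "cp": 4180.0}  # kg/m^3, J/(kg*K)


def volumetric_flow_m3_per_s(power_w, delta_t_k, fluid):
    """Flow needed so the coolant warms by delta_t_k while absorbing power_w."""
    return power_w / (fluid["density"] * fluid["cp"] * delta_t_k)


# Hypothetical operating point: 1,000 W removed with a 10 K coolant temperature rise.
power_w = 1000.0
delta_t_k = 10.0

air_flow = volumetric_flow_m3_per_s(power_w, delta_t_k, AIR)
water_flow = volumetric_flow_m3_per_s(power_w, delta_t_k, WATER)

print(f"Air:   {air_flow * 1000:.1f} L/s")
print(f"Water: {water_flow * 1000 * 60:.2f} L/min")
print(f"Air needs roughly {air_flow / water_flow:.0f}x the volume flow of water")
```

Under these assumptions, air needs a few thousand times the volumetric flow of water to move the same heat, which is why fans run out of headroom long before pumps do.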
Three ways to cool chips
Liquid cooling comes in three possible forms. The most common one today is single-phase liquid cooling, so called because the coolant remains in the liquid phase the entire time. This approach relies on the liquid’s greater thermal conductivity and heat capacity to do a more effective job than air can do.
Less common today but heavily researched is two-phase cooling. The intent is to take advantage of the immense amount of latent heat necessary to change from liquid to gas. “The phase change actually absorbs more heat than the temperature change from 0 to 100, so it’s very efficient at removing heat,” said Swinnen. Unlike single-phase, here the coolant literally boils, removing far more heat than is possible with single-phase.
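Swinnen’s comparison can be checked with textbook values for water at atmospheric pressure. The sketch below compares the heat one kilogram of water absorbs when warmed from 0°C to 100°C with the heat it absorbs when boiled at 100°C. The property values are standard approximations rather than numbers from the article, and other coolants will have different values.

```python
# Approximate textbook properties of water at 1 atmosphere (assumed values).
CP_WATER_KJ_PER_KG_K = 4.186   # average specific heat of liquid water
LATENT_HEAT_KJ_PER_KG = 2257   # heat of vaporization at 100 degrees C

# Sensible heat: warming 1 kg of liquid water from 0 to 100 degrees C.
sensible_kj = CP_WATER_KJ_PER_KG_K * (100 - 0)

# Latent heat: boiling that same 1 kg at a constant 100 degrees C.
latent_kj = LATENT_HEAT_KJ_PER_KG

ratio = latent_kj / sensible_kj

print(f"Warming 0 to 100 C: {sensible_kj:.0f} kJ/kg")
print(f"Boiling at 100 C:   {latent_kj:.0f} kJ/kg")
print(f"Boiling absorbs about {ratio:.1f}x as much heat")
```

With these values, the phase change absorbs roughly five times the heat of the full 0-to-100 temperature swing.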

Fig. 1: Conceptual drawing of single-phase vs. two-phase cooling in a cold plate. With single-phase, the coolant remains in liquid form to be cooled in the CDU. In two-phase cooling, the coolant boils, the vapor is removed and recondensed to evacuate much more heat. Source: Bryon Moyer/Semiconductor Engineering
“Boiling is brilliant,” said Bornoff. “It’s a very resilient method for heat removal, but it, too, has its limit.”
Even as the liquid boils, it’s important that it remain in contact with the heated surface. “You end up with a thin, microns-thick layer of water over the surface,” Bornoff said. “The heat is taken into that small bit of liquid, which is then passed to the air bubbles. The air bubbles disappear, replaced by new liquid. As long as you have bubbles forming with some liquid between them, that’s the maximum heat-transfer efficiency.”
If the heat flux is too high (that is, if too much heat is being emitted per unit area, as measured in W/mm²) and the system can’t keep up, then the thin layer of water on the heated surface will evaporate, too. In that situation, there’s no liquid in contact with the heat source. Instead, there is water vapor, which is a gas, so the result is essentially air cooling again. Cooling falls off a cliff at that point. This level of heat flux is called the critical heat flux (CHF).
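A simple way to reason about that cliff is to compare a device’s heat flux with the CHF of the cooling setup. The sketch below does only that arithmetic. The function names, the power and area figures, and the 2.0 W/mm² CHF are all hypothetical values picked for illustration. Real CHF depends on the coolant, the surface, and the flow conditions.

```python
def heat_flux_w_per_mm2(power_w, area_mm2):
    """Heat flux is power divided by the area it passes through."""
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    return power_w / area_mm2


def chf_margin(power_w, area_mm2, chf_w_per_mm2):
    """Fraction of the CHF still unused; a negative value means CHF is exceeded."""
    return 1.0 - heat_flux_w_per_mm2(power_w, area_mm2) / chf_w_per_mm2


ASSUMED_CHF = 2.0  # W/mm^2, hypothetical limit for this illustration

# Hypothetical whole die: 1,000 W spread over 800 mm^2.
die_margin = chf_margin(1000.0, 800.0, ASSUMED_CHF)

# Hypothetical hot spot: 300 W concentrated in 100 mm^2.
hotspot_margin = chf_margin(300.0, 100.0, ASSUMED_CHF)

print(f"Whole die margin: {die_margin:+.3f}")
print(f"Hot spot margin:  {hotspot_margin:+.3f}")
```

In this made-up case the die as a whole sits comfortably below the limit, while the hot spot exceeds it. That illustrates why average power alone says little about whether boiling will hold up.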
A third liquid cooling approach is full immersion. It involves a tank full of liquid in which entire servers are immersed. This liquid must be dielectric so that it causes no shorts. It must also be non-corrosive to keep the electronics pristine. Immersion can operate as a single- or two-phase cooling system.
In this case, the liquid is still pumped out for cooling. Originally, a single pump in the building distributed the coolant throughout, but that turned out to be inefficient because of piping losses. Now, tanks have recirculators located closer to them, sometimes called economizers because they’re more efficient to operate.
Immersion cooling removes heat from all components, but it does so more slowly than the other techniques. “If the criterion is percentage of heat capture, immersion does a really good job of that,” said Malouin. “You might capture nearly 100% of that server’s heat into the fluid. But it’s really hard to cool several-thousand-watt devices with single-phase immersion because of the thermal properties of the fluid itself.”
Different ways to apply liquids
Immersion cooling works more or less one way, but the other approaches have some variants. The most common implementation today is as cold plates, which attach to a chip package, replacing what would previously have been a heat sink for air cooling.
“The most popular thing I see walking around show floors is some kind of plate with liquid flowing into it,” observed Rambus’ Woo. “The plate contacts the important semiconductors, and there are usually grooves that direct the flow of the liquid.”
The advantage of a cold plate is that it’s a self-contained unit that can be attached to the package at assembly time. It doesn’t affect the die, chiplets, or other components inside the package.
The downside of a cold plate is that the coolant is separated from the chip by the package top, interface materials, and the bottom of the cold plate. Primary heat transfer is either down through the PCB or up through the cold plate. Aside from solder and metal lines, the intervening materials aren’t chosen for their thermal conductivity, leaving a barrier between the cold plate and the package contents.
Beyond cold plates lies what’s sometimes called direct impingement, or direct liquid cooling (DLC), meaning that coolant literally touches the die being cooled. Coolant can flow or be sprayed onto the silicon backside. Because the coolant directly touches the die, it has immediate access and can pull heat out much more readily.
The challenge is that the coolant must remain isolated from the rest of the package, and that’s not a completely solved problem at this point. Advanced packaging with multiple dies presents another challenge. If one die is the main heat source, then the cooling can focus only on that die. But if there are, for example, multiple high-power compute chiplets, each will need to be individually cooled. Plenty of research is ongoing, but high-volume usage is still nascent.
Coolants matter
It’s easy to think of water as the obvious coolant, but more commonly the coolant is a mix (often 50/50) of water and propylene glycol, a combination abbreviated PGW. Propylene glycol resembles ethylene glycol, which is used as automotive antifreeze. Like antifreeze, it extends the temperature range over which the coolant remains liquid. The name “antifreeze” describes only the cold end of that range. However, automotive-type coolant is highly toxic and is typically employed only where humans aren’t likely to accidentally ingest any.
Propylene glycol is less toxic, but it also has a lower boiling point of approximately 188°C versus 197°C for ethylene glycol (at 1 atmosphere). A 50% mixture with water takes those limits down to around 105°C and 108°C, respectively. Both are higher than the boiling point of water, but not by much.
For immersion cooling, the dielectric coolant is designed both to be effective and to play nicely with humans and the electronics. Older liquids may be more toxic, but modern ones are chosen to be non-toxic, non-corrosive, non-flammable, and biodegradable. And modern coolants are more expensive than PGW.
“There are interesting liquids used for immersion,” said Woo. “They’re electrically inert. I put my hand in (a company let me do that) and I didn’t realize that, because they’re non-reactive. They also don’t react with your skin, so they don’t feel right.”
Heated coolant brings one additional unexpected potential benefit. Unlike heated air, which is lost to the atmosphere, liquids operate in a closed system. “[The liquid] flows into and out of the chassis, and then goes to a heat exchanger where it exchanges its heat, cools down, and gets recycled into the server,” said Woo.
That means the heat inside the coolant could be put to use elsewhere. One idea that has seen some preliminary research is to pipe the coolant out of the data center to generate hot water for nearby residents. That allows some of the energy consumed by computing to be recovered and reused.
“The advantage of liquid is that energy is dumped into the liquid, and that’s a very efficient way of storing what would otherwise be lost energy,” said Bornoff. “Why not pump that into a local, domestic hot water circuit to feed the hot water requirements of residences in close proximity?”
Environmental considerations also matter. “You want to make sure the liquids don’t contain ‘forever chemicals,’” Woo noted.
Infrastructure changes
Moving from air to liquid involves more than changes at the chip, server, and rack levels. With few exceptions, an entire data center must be outfitted to handle liquid.
“You’ll need pumps and hoses,” said Woo. “You’ll need serviceability. There are high-reliability, low-leakage valves that you can snap on and off the server. And they have a heat exchanger. For immersion, you’re talking about tanks directly in circulation systems.”
If an entire rack, or row of racks, is liquid-cooled, then the raised floor is no longer necessary. Replacing that infrastructure, however, is the piping and handling of the liquid, which is typically pumped through a coolant distribution unit (CDU).
“The plumbing inside of some of these data centers is pretty imaginative,” Synopsys’ Kruger noted.
Managing these systems is different from existing air-cooling approaches. “Two metrics are important: low pressure drop and low thermal resistance,” said Ali Forsyth, co-founder and CEO of Alloy Enterprises. “It allows data centers to circulate higher temperature water while still meeting the thermal requirements of the components in the racks. That means not needing refrigeration or lifted HVAC, which is a huge energy savings.”
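The link between those two metrics and warmer water can be sketched with a lumped model. In it, the component temperature equals the coolant inlet temperature plus thermal resistance times power, and the ideal pumping power equals pressure drop times volumetric flow. The model ignores the coolant’s own temperature rise and pump efficiency. The function names and every number below (an 85°C limit, 1,000 W, the two resistances, and the pressure and flow figures) are hypothetical values for illustration, not vendor data.

```python
def max_inlet_temp_c(case_limit_c, power_w, r_th_k_per_w):
    """Warmest coolant that still holds the part at its limit: T_in = T_limit - R_th * P."""
    return case_limit_c - r_th_k_per_w * power_w


def hydraulic_pump_power_w(pressure_drop_pa, flow_l_per_min):
    """Ideal pumping power = pressure drop * volumetric flow (pump efficiency ignored)."""
    flow_m3_per_s = flow_l_per_min / 1000.0 / 60.0
    return pressure_drop_pa * flow_m3_per_s


# Hypothetical 1,000 W device with an 85 degree C temperature limit.
baseline_inlet = max_inlet_temp_c(85.0, 1000.0, 0.05)  # higher-resistance cold plate
improved_inlet = max_inlet_temp_c(85.0, 1000.0, 0.03)  # lower-resistance cold plate

# Hypothetical loops at 1.5 L/min with different pressure drops.
low_drop_pump = hydraulic_pump_power_w(20_000, 1.5)
high_drop_pump = hydraulic_pump_power_w(60_000, 1.5)

print(f"Max inlet temperature: {baseline_inlet:.0f} C vs. {improved_inlet:.0f} C")
print(f"Ideal pump power: {low_drop_pump:.2f} W vs. {high_drop_pump:.2f} W per plate")
```

In this illustration, lowering the thermal resistance raises the allowable water temperature by 20°C, the kind of shift that can make chilled water unnecessary. A lower pressure drop reduces the pumping energy needed at every cold plate.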
Each cooling approach has its own infrastructure. As a result, a rack — or even an entire data center — will typically commit to one type of cooling. “You wouldn’t typically see one tray liquid-cooled and the others air-cooled,” observed Forsyth.
More than just flat metal plates
Cold plates can be sized to fit the package. However, that overlooks the fact that heat isn’t uniformly generated throughout the package. If the package houses a single die, that die will have hotter and cooler areas on its surface. And advanced packages with multiple components will include some that generate more heat than others.
As a result, some cooling solutions involve bespoke cold plates that concentrate cooling where the most heat is generated. Alloy Enterprises employs a 3D printing technique to create custom liquid routing paths inside the cold plate. Its most common coolant is propylene glycol with 25% water.
“We’ve developed a manufacturing process we call stack forging, which is a sheet-based process where we can make complex internal and external geometries in a single-piece component,” said Forsyth. “We can put large channels wherever we need, sizing and optimizing them appropriately. We put smaller-scale channels wherever we need them.”
Rather than starting with a powder that must be sintered, its process stacks multiple metal layers to build the cold plate. The surface inside is patterned using lasers to create grooves that guide the liquid. It’s possible to have multiple coolant inlets to raise cooling efficiency.

Fig. 2: Cold plates are patterned to concentrate coolant around hot spots in the package. Source: Alloy Enterprises
“Almost all 3D printing relies on melting or molten metal at some point or sintering of some kind,” explained Forsyth. “It’s really hard to make small things when you have liquid metal. It wicks, via capillary forces, into those small orifices. So, we’re able to make channel sizes that are an order of magnitude smaller than other metal 3D printing processes.”
A high-temp manufacturing step provides diffusion bonding, where the individual sheets combine into a single piece of metal. This process avoids the warping that can occur with typical metal 3D printing. “Everything heats up at the same time, and therefore we’re not introducing residual stresses due to thermal gradients,” said Forsyth.
Another company, HydroGraph, ran some two-phase experiments depositing materials on the cooling surfaces to keep them from being too smooth and causing superheating. It created sintered nickel on copper, porous copper/nickel interfaces, and graphene — its specialty — on the boiling surface. The added roughness provided nucleation sites and achieved a 152% rise in the heat-transfer coefficient (HTC) compared to bare copper, and a CHF 40% higher than bare copper. [1]
Taking a cold shower
An example of direct impingement cooling comes from JetCool, which makes a unit with small nozzles that spray coolant on a surface. The company has three ways of delivering the technology — spraying directly onto silicon, which is the most advanced configuration and is best for the highest power; as a cold plate; and as a self-contained unit for use in racks that lack liquid infrastructure. It claims to be cooling chips with power as high as 5,000 watts.

Fig. 3: JetCool’s direct liquid cooling. Liquid flows into the port on the far right (blue arrows) and is forced through jets (middle and inset). Hot liquid exits through the left port (red arrow). Source: JetCool
The self-contained model can replace the fan in an existing server, reducing power by as much as 15%. “These are typically small-scale liquid cooling loops that sit inside of servers,” explained Malouin. “This allows our customers to deploy liquid cooling in any air-cooled data center. That can reduce the server power by 15%.”
In the same way that Alloy’s grooves are patterned to match hot spots, JetCool’s jets are custom-positioned for each package.
Some DLC systems may employ high pressures. “We’re talking about a 40x difference in pressure drop for jet impingement based on customer results we’ve seen,” said Forsyth.
JetCool says that’s not what they’re doing, however. “We can achieve better performance on a given fluid power budget because we specifically don’t rely on pressure to drive performance,” said Malouin. “Typically, we minimize pressure drop because our technology works best when we maximize flow intensity, not pressure.”
No one right answer
Players in this space contend that each of these solutions has a sweet spot. No single one will wipe out the others. While cold plates are the simplest and have the lowest cost, DLC can remove heat the fastest for high-power chips. Immersion can remove more total heat since it cools everything, not just certain chips, although it may not cool high-power chips as fast as DLC.
“With the diversity in computing today, there’s a place in the market for all those different types of cooling, because different workloads, different applications, and different deployment styles and locations all have different requirements,” said Malouin.
Adding liquid infrastructure is a barrier when replacing existing air infrastructure, but less so for greenfield construction. JetCool’s self-contained unit is one option for avoiding an infrastructure overhaul. Part of the return on that investment is the ability to build servers with chips that would be impossible to cool with air alone, thereby increasing the value of the servers and racks employed.
If a data center is planning to make the move to HVDC, which brings higher DC voltages all the way to the rack before dropping them down to usable levels, then that project also could be a good time to convert the cooling infrastructure.
“Multiple big changes are coming at the same time,” noted Woo. “People are talking about 400-V power distribution, or even 800 V. If you’re thinking about a big change in your power distribution, maybe you couple that with a cooling upgrade.”
Serviceability is also important. The tubes carrying coolant must be placed such that the servers themselves can still be accessed as necessary. “Serviceability can be more challenging as tubing may need to be moved out of the way to replace components,” noted Malouin.
Immersion cooling creates an even greater servicing challenge. It may be possible to remove a single server, but it also may be necessary to drain the dielectric coolant, refilling it afterward. “Immersion tanks must be opened, potentially affecting multiple systems other than the one being serviced,” added Malouin.
In general, immersion monitoring is necessary to detect any overheating and rebalance workloads to keep temperatures in check.
New cooling coming online
Both single-phase cold plates and immersion are in limited use today, but cold plates are likely to become much more popular as companies build out data centers equipped to handle AI training and high-performance computing. Nvidia’s Grace/Blackwell racks already include liquid cooling, and liquid-enabled chassis are available on the market.
“If you go to places like SuperMicro, you can buy racks with liquid cooling in them,” said Woo. “They’re 4U boxes where the top 2U is either Nvidia or AMD engines, and the bottom 2U is a dual-socket Xeon or EPYC processor. If you go on the SuperMicro website, you see the chassis with the pipes for the liquid.”
How cold plates are provided may vary. “Sometimes cold plates are sold on the chips themselves,” said Forsyth. “In other cases, the hyperscaler or server manufacturer would take the chip, the TIM (thermal interface material), and the cold plate and build out the assembly themselves.”
Direct cooling is starting to become available, and two-phase cold plates should be appearing in a few years. Once the transition is complete, liquid cooling should feel like less of a burden since racks will be equipped for it.
The use of immersion presumably will increase, but it’s a heavier lift than cold plates or DLC and is likely to be employed more selectively.
Where possible, air is likely to remain popular for racks handling more modestly powered silicon for more mundane use. Operating costs may be lower for liquid, but the infrastructure investment must have a reasonable break-even period for the conversion to make economic sense.
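The break-even test amounts to dividing the one-time conversion cost by the yearly operating savings. The sketch below shows that arithmetic with entirely hypothetical dollar figures and a hypothetical function name. It ignores discounting, financing, and any added revenue from denser compute, all of which a real analysis would include.

```python
def break_even_years(conversion_cost, annual_air_opex, annual_liquid_opex):
    """Simple payback period; returns None if liquid cooling saves nothing per year."""
    annual_savings = annual_air_opex - annual_liquid_opex
    if annual_savings <= 0:
        return None
    return conversion_cost / annual_savings


# Hypothetical figures for one data hall (not drawn from any source).
years = break_even_years(
    conversion_cost=2_000_000,     # piping, CDUs, cold plates, installation
    annual_air_opex=1_500_000,     # yearly cooling cost with air
    annual_liquid_opex=1_100_000,  # yearly cooling cost after conversion
)

print(f"Break-even after {years:.1f} years")
```

With these made-up inputs the conversion pays for itself in five years. Whether that counts as reasonable depends on how long the operator expects the racks to stay in service.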
The future of data centers is thus likely to include a mix of air, cold plates, DLC, and immersion. That mix will include both single- and two-phase systems. Individual data centers may be provisioned for only one type of cooling, but this mix is expected across the range of data centers.