Process cost and yield issues delay the adoption of hybrid bonding.
The next generation of high-bandwidth memory, HBM4, was widely expected to require hybrid bonding to unlock a 16-high memory stack. A JEDEC revision to the stack-height limit made that unnecessary for this generation, but it’s merely a postponement, not a cancellation.
HBM has been in high demand for AI in data centers, especially for training. Data movement dominates energy consumption, and high-bandwidth memories can feed more data more quickly and more efficiently than standard DDR variants.
“We see a lot of requirements for memory, whether non-volatile or volatile,” said Pax Wang, director for advanced packages at UMC. “But high bandwidth memory is the most important one in the era of AI.”
HBM technology stacks multiple memory dies on top of one another. “HBM stacks are currently available with 12-high stacks and are on their way to a 16-high stack configuration,” said Hamed Gholami Derami, business development engineer, packaging solutions at Brewer Science.
Additional capacity can be achieved by increasing the number of layers in the stack and/or adding more memory cells per die. Until recently, JEDEC had specified a maximum stack height of 720µm, and that wasn’t high enough to allow 16 layers, even as various height factors have shrunk. “The die thickness is constantly decreasing (currently at 30 to 50μm) along with a decrease in bump height, die-to-die distance, and TSV pitch size to accommodate the height limitation,” said Derami.
It may seem odd that TSV pitch, which is a horizontal dimension, would affect the die-to-die distance, which is a vertical dimension. But that pitch affects bump height. “TSV pitch size and bump height have a direct relationship,” Derami added. “Smaller pitch size means smaller bumps, too.”
Performance, meanwhile, specified as total bandwidth, can increase with wider interfaces, as well as faster signaling per pin. HBM4 doubles the number of channels while widening the interface. Pin signaling for HBM4 also will be faster than HBM3, although it lags HBM3E.

Table 1: HBM4 compared with HBM3 and HBM3E. Source: Internet research
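As a rough illustration of how interface width and per-pin rate combine into total bandwidth, the sketch below multiplies the two and converts bits to bytes. The widths and pin rates are commonly cited figures for each generation, not values taken from this article, and shipping parts vary by vendor. The function name is purely illustrative.

```python
def stack_bandwidth_gbytes(interface_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s: pins x Gb/s per pin / 8 bits per byte."""
    return interface_bits * pin_rate_gbps / 8


# Assumed (commonly cited) interface width and per-pin rate per generation.
GENERATIONS = {
    "HBM3": (1024, 6.4),
    "HBM3E": (1024, 9.6),
    "HBM4": (2048, 8.0),
}

for name, (bits, rate) in GENERATIONS.items():
    print(f"{name}: {stack_bandwidth_gbytes(bits, rate):.1f} GB/s")
```

With these inputs, HBM4's per-pin rate sits between HBM3 and HBM3E, yet the doubled width still gives it the highest total.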
The wider interface poses the challenge of requiring many more pins in roughly the same space. Microbump pitch historically has been in the range of 40µm, but with HBM4, that pitch will be moving closer to 10µm. The big question has been how to bond the pads in a die stack.
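A back-of-the-envelope sketch shows why the pitch change matters. It assumes pads laid out on a uniform square grid (a simplification made here; real layouts differ), so pad density scales with the inverse square of the pitch.

```python
def pads_per_mm2(pitch_um: float) -> float:
    """Pads per square millimetre on a square grid with the given pitch in µm."""
    pads_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 µm
    return pads_per_mm ** 2


old, new = pads_per_mm2(40), pads_per_mm2(10)
print(f"40 µm pitch: {old:.0f} pads/mm^2")
print(f"10 µm pitch: {new:.0f} pads/mm^2 ({new / old:.0f}x denser)")
```

Under the square-grid assumption, shrinking the pitch by 4x packs 16x as many connections into the same area.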
Hybrid bonding was the planned solution
“Currently, mass reflow (MR) with molded underfill, and thermocompression bonding (TCB) using non-conductive film are the primary chip-to-chip stacking assembly methods,” said Vikas Gupta, director of engineering and technical promotion at ASE Group.
The expected solution for reducing stack height has been to turn to hybrid bonding. Even if the thickness of each die remains the same, the absence of microbumps means that less space is necessary between each layer, shortening the stack.
“The drive to increase stack height — now exceeding 12 to 16 layers — is being fueled by advancements in hybrid bonding and wafer thinning,” said Aftkhar Aslam, CEO of yieldWerx.
“Hybrid bonding provides a bump-less 3D stacking assembly option as HBM specifications and performance requirements continue to push the limits of interconnects and assembly processes,” said ASE’s Gupta.

Fig. 1: Generic HBM structure. This shows microbumps. With hybrid bonding, the gap between the DRAM dies would disappear. Source: Adapted from an image on Wikipedia
Hybrid bonding is an expensive process that requires new equipment. The generation that employs it first will be more expensive on a per-package basis (although that may not translate to greater expense per bit given higher capacity).
The process also introduces challenges in other areas, such as testing. Given an expensive unit, such as a memory stack with more than a dozen chips, yields must be high. All it takes is one unrecoverable defect in one die to ruin the whole stack. So assembling only known-good dies is an important way of boosting yield.
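The arithmetic behind that concern can be sketched with a simple compound-yield model. It assumes defects are independent, that every DRAM die and every bond step must be good for the stack to work, and that no repair is possible. The yield values are hypothetical placeholders, not figures from any manufacturer.

```python
def stack_yield(die_yield: float, bond_yield: float, layers: int) -> float:
    """Probability that all dies and all bond steps in a stack are good."""
    return (die_yield * bond_yield) ** layers


# Hypothetical inputs: 99% bond yield per layer, two levels of die quality.
for layers in (8, 12, 16):
    untested = stack_yield(0.95, 0.99, layers)     # dies not screened before stacking
    known_good = stack_yield(0.999, 0.99, layers)  # dies screened as known-good
    print(f"{layers}-high: untested {untested:.1%}, known-good {known_good:.1%}")
```

Because the per-layer terms multiply, even small per-die losses compound quickly as stacks grow taller, which is why screening dies before stacking pays off.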
“Yield will be the biggest problem,” said UMC’s Wang. “With microbumps, we can test the memory layers before soldering the microbumps together, but if we change to hybrid bonding, the testing flow will be very difficult.”
Testing prior to bonding might sound obvious, but it becomes more complicated in two ways. First, hybrid bonding requires a pristine pad surface. A test probe can damage the pad or introduce particles. That’s in addition to the challenges of probing at narrow pad pitch.
“Hybrid bonding requires a very clean surface because no particles are allowed on the bonding interface,” said Wang. “Testing is a particle source.”
For this to work, engineers must have a means of refinishing the pad surface after testing. “Our process flow includes intermediate testing before bonding the layers — a specially designed process flow,” explained Wang. “We use surface planarization to repair the overall interface before bonding.”
The second challenge is that foundries are accustomed to shipping complete units to OSATs for testing, but some foundries now assemble the stacks themselves. Rather than shipping chips to an OSAT for test, only to have them shipped back for stacking, foundries may need to acquire test equipment so they can test the dies and build the stack before shipping it as a complete unit to OSATs for packaging.
In-process inspection and monitoring also are challenged by hybrid bonding. “Materials innovation — such as low-warpage substrates, ultra-flat dielectric layers, and enhanced underfill formulations — is crucial, but so is process monitoring,” said Aslam. “The inspection stack now includes optical interferometry, acoustic microscopy, and inline void detection for micro-voids and misalignment. On the yield management side, tools enable vertical genealogy analysis, tracing every memory die in the stack back to its wafer lot, burn-in history, and bond alignment metrics. By correlating test data from each layer of the HBM stack with assembly and process metrology data, engineers can isolate defects related to hybrid-bond alignment, TSV-resistance drift, or material delamination, turning what would traditionally be a 3D blind spot into a transparent, traceable process window.”
Height change provides breathing room
JEDEC revised the stack height limit from 720µm to 775µm, which affords enough room to allow microbump bonding for HBM4. However, HBM5 and its successors are expected to take advantage of hybrid bonding.
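A simple height budget shows how much the extra 55µm helps. The sketch assumes a uniform DRAM die thickness, one bond gap per DRAM layer, and a single base die, and it ignores any other contributions to package height. The 30µm die sits at the low end of the range quoted earlier, while the 60µm base die and the function name are hypothetical, chosen only for illustration.

```python
def max_gap_um(limit_um: float, layers: int, die_um: float, base_um: float) -> float:
    """Largest per-layer bond gap that keeps the stack within the height limit."""
    return (limit_um - base_um - layers * die_um) / layers


for limit in (720, 775):
    gap = max_gap_um(limit, layers=16, die_um=30, base_um=60)
    print(f"{limit} µm limit: up to {gap:.2f} µm per bond gap")
```

Under these assumptions, the revision raises the allowable gap in a 16-high stack from about 11µm to nearly 15µm per layer. Hybrid bonding, with essentially no gap, would fit under either limit.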
“Hybrid bonding reduces the thickness of the solder interconnects, which puts it on HBM roadmaps,” said Wang. “But with the JEDEC standard revision at the beginning of this year, we have seen the delayed adoption of hybrid bonding technology. For 18-high or 20-high stacks — maybe HBM4E — perhaps we will see hybrid bonding start to gain momentum.”
In addition to the capacity and bandwidth improvements, the energy per bit is expected to drop by 30% to 40%, making HBM4 much more efficient than its forebears — and that’s with microbumps. “Hybrid bonding has an order of magnitude lower energy per bit as compared to current microbump solutions,” said Gupta.
Two logic-related changes also are expected with HBM4. The first deals with the stack’s base die (also called the logic die). That die contains all the logic necessary to operate the stack, and it’s what memory controllers on other chips will talk to. But up to this point, that’s been more or less a standard die, with each unit sold having the same one.
With HBM4, companies are expected to specify custom base dies to better align the stack behavior with specific applications. A vanilla version should be available for the broader market, but some large companies, such as AMD and Nvidia, are planning to integrate greater functionality into the base die and potentially offload some work from the processors.
“Custom die features will continue to evolve along with the overall compute architecture,” said Gupta. “This evolution will directly impact power consumption, efficient power delivery requirements, and associated thermal management.”
HBM4 also will include a newer memory feature intended to help protect against row hammer attacks. Called Directed Refresh Management (DRFM), it helps refresh rows that might have been victims of a row hammer event. And HBM4 will have improved reliability, availability, and serviceability (RAS) features.
The future requires new equipment and materials
Moving forward, efforts will continue to build taller, faster stacks. But thinning the memory layers while narrowing signal pitch will necessarily involve more precise equipment and better materials.
“It requires better and more accurate die placement tools, along with improved die-to-die and die-to-wafer bonding equipment, for both MR and TCB processes,” said Derami.
Even with the move to hybrid bonding, there may still be some MR and TCB bonding. “Not every interconnect will become hybrid bonding in HBM’s case,” Derami said. “Companies are exploring a solution where DRAM dies are face-to-face hybrid bonded, and these bonded pairs will be stacked back-to-back using microbumps. This seems to be a workaround due to difficulties stacking all the DRAM dies using hybrid bonding, so those TCB and MR tools will still be used in these integration schemes.”
Thinning wafers brings new challenges as well. “As HBM architectures evolve to include more memory layers and finer interconnects, achieving uniform planarity for ultra-thinned dies and thermal resilience across the stack, especially for hybrid bonding schemes, has become increasingly complex,” said Derami, noting those materials will need to be thermally stable to survive processing. “Advanced materials are needed to thin the dies down to extremely thin thicknesses with great fidelity and for device stability, using materials with higher thermal conductivity, thermal stability, and improved mechanical properties.”
An HBM4E version is expected to hit production in the 2027 timeframe, quickly following the expected 2026 production date for HBM4. Individual companies have declared their goals. Samsung, for example, is targeting more than 13 Gb/s per pin and a total bandwidth of 3.25 TB/s. Power efficiency should also improve.
The other factor gating hybrid bonding adoption is the pad pitch. Current microbump techniques work for pad pitches down to around 10µm, so using the costlier hybrid bonding at that pitch wouldn’t make economic sense. The HBM4 pad pitch is 10µm, which is another reason hybrid bonding can wait.
One more generation of microbumps
While HBM4 originally was expected to require hybrid bonding, that’s no longer the case. Units built with hybrid bonding would have a harder time competing on price with those built using microbumps. As long as microbumps can meet the specification, they set the price bar, and the more expensive assembly process is uneconomical.
HBM5 is now the generation where mass adoption of hybrid bonding is expected to be necessary. It’s a couple of years further out than HBM4, with availability at the end of this decade. Stacks are expected to be at least 16 high, with a doubling of the interface to 4,096 bits and a bandwidth of 4 TB/s.
That provides some breathing room for memory technologists to develop the complex processes necessary for such advanced memories.