Advanced Packaging’s Next Wave

A growing list of technologies is propelling multi-chip packages to the forefront of design, while creating a dizzying number of options and tradeoffs.


Packaging houses are readying the next wave of advanced packages, enabling new system-level chip designs for a range of applications.

These advanced packages involve a range of technologies, such as 2.5D/3D, chiplets, fan-out and system-in-package (SiP). Each of these, in turn, offers an array of options for assembling and integrating complex dies in an advanced package, providing chip customers with many possible ways to differentiate their new IC designs.

But each packaging approach also comes with its own set of tradeoffs. Moreover, there are so many possible configurations that making choices for a particular application can be challenging even for the most sophisticated design teams.

Still, advanced packaging is playing a bigger role across the semiconductor industry, and that trend likely will continue. Networking equipment, servers, smartphones and even watches are among the applications that have adopted advanced packaging. Not all chips require advanced packages. In fact, the vast majority of chips are assembled and housed in mature and commodity packages. But even for these products, IC vendors still want new packages with smaller form factors and better electrical performance.

Advanced packaging promises to solve these and other challenges. For example, in systems, data moves back and forth between a separate processor and the memory devices on a board. But at times this exchange causes latency and increases energy consumption, which is referred to as the memory wall. One way to solve the problem is to bring the memory and processor closer together and integrate them in a package.

That’s not the only application for advanced packaging. Traditionally, to advance a design, IC vendors develop an ASIC. Then, vendors will shrink different functions at each node and pack them onto the ASIC. But this approach is becoming more complex and expensive at each node. Many are looking for alternatives. One way to get the benefits of scaling is to assemble complex chips in an advanced package. In some cases, the advanced package mimics a traditional ASIC at lower costs.

Assembling different and complex dies in a package is sometimes referred to as heterogeneous integration. “What we are seeing is what you might call a renaissance in packaging, a renaissance in design, and a renaissance in many areas for heterogeneous integration,” said Bill Chen, a fellow and senior technical advisor at ASE, in a presentation at IMAPS’ recent 17th Annual International Conference on Device Packaging.

At IMAPS and other recent events, vendors presented more details about their new packages and provided a glimpse of what’s ahead. Among them:

  • Samsung introduced a 3D technology, which stacks logic and memory dies in a package. It also devised a package that combines an AI processing function and memory.
  • Amkor, ASE and TSMC are developing new high-end fan-out packages, which integrate logic and more memory cubes. They are also developing fan-out for 5G phones and other apps.
  • i3 is developing a SiP stacking technology.
  • Many are pursuing chiplets. For this, a chipmaker may have a menu of modular dies, or chiplets, in a library. Customers can mix-and-match the chiplets and connect them using a die-to-die interconnect scheme in a package.

Fig. 1: Key trends in advanced packaging. Source: KLA

More 2.5D/3D
Today’s systems incorporate memory, processors, storage and other components. Memory and storage come in different forms and are arranged in a hierarchy. In the first tier of the hierarchy, SRAM is a fast memory type that’s integrated into the processor to enable fast data access. DRAM, which is used for main memory, is separate and located in a module. Disk drives and solid-state storage drives are used for storage.

In PCs, these individual components are assembled on a board. But this topology is inefficient for servers in a data center. Moving data back and forth from each individual component, namely the processor and memory, creates latency.

Over the years, vendors have developed various packages to address the memory wall, namely 2.5D/3D. Used in the industry for several years, 2.5D/3D packages are typically found in high-end applications like networking equipment and servers.

In 2.5D, dies are stacked or placed side-by-side on top of an interposer, which incorporates through-silicon vias (TSVs). The interposer acts as the bridge between the chips and a board, which provides more I/Os and bandwidth.

In one example, an FPGA and high bandwidth memory (HBM) are placed side-by-side in a 2.5D package. An HBM is a DRAM memory stack, which boosts the memory bandwidth in systems. “That’s a very important factor in AI,” said Mike Kelly, vice president of advanced packaging development and integration at Amkor, in a presentation at IMAPS. “You are getting the HBM DRAM stacks close to the processor. Basically, you are getting more memory bandwidth at a lower power point. You are not pushing all of that data back and forth off package and to other forms of memory.”

But 2.5D is expensive and difficult to make. Take HBM for example. Using various process steps, tiny copper microbumps and pillars are formed on the top of each DRAM die. One die is flipped, and the bumps on the facing surfaces of the two dies are bonded together. Bumps and pillars provide small, fast electrical connections between different devices.

The most advanced microbumps/pillars are tiny structures with a 40μm pitch. The height of each pillar is 15μm to 30μm with 10μm to 20μm in R&D. “With reducing bump dimensions, several critical reliability issues arise,” said Priya Mukundhan, director of thin films product management at Onto Innovation, in a paper. “For microbumps to be useable for stacking, their individual height and die-level coplanarity have to be measured with very high accuracy and precision.”
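The two measurements Mukundhan mentions, individual height and die-level coplanarity, can be illustrated with a short sketch. It is a simplification, not a description of any metrology tool: coplanarity is taken here as the spread between the tallest and shortest bump on a die, the 15μm to 30μm window comes from the range quoted above, and the 2μm tolerance and the sample heights are hypothetical.

```python
from statistics import mean


def check_bumps(heights_um, min_h=15.0, max_h=30.0, max_coplanarity_um=2.0):
    """Summarize height and coplanarity checks for the bumps on one die.

    heights_um: measured bump heights in micrometers (hypothetical data).
    min_h, max_h: allowed height window (15-30 um, the range quoted above).
    max_coplanarity_um: hypothetical tolerance, not taken from the article.
    """
    # Bumps whose individual height falls outside the allowed window.
    out_of_range = [h for h in heights_um if not (min_h <= h <= max_h)]
    # Simplified coplanarity: tallest bump minus shortest bump on the die.
    coplanarity = max(heights_um) - min(heights_um)
    return {
        "mean_height_um": mean(heights_um),
        "coplanarity_um": coplanarity,
        "heights_ok": not out_of_range,
        "coplanarity_ok": coplanarity <= max_coplanarity_um,
    }


result = check_bumps([20.1, 19.8, 20.4, 21.0, 19.5])
print(result)
```

Production tools typically fit a reference plane to the bump tops rather than taking a simple max-minus-min spread, so this should be read only as an illustration of the idea.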

Going forward, meanwhile, the industry continues to develop new forms of 2.5D. On one front, memory vendors are developing new and faster DRAMs at smaller geometries, enabling higher-capacity HBMs.

For example, Samsung’s new HBM2E technology doubles the capacity over previous versions. The latest version stacks eight 10nm-class, 16-gigabit DRAM dies on a buffer chip. Samsung’s HBM2E solution provides 16GB of capacity with data transfer speed of 3.2Gbps and a memory bandwidth of 410GB/s per stack.
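The capacity and bandwidth figures follow from simple arithmetic, sketched below. The die count, die density and per-pin speed come from the paragraph above. The 1,024-bit interface width per stack is an assumption based on the standard HBM2-class interface and is not stated in the article.

```python
DIES_PER_STACK = 8           # from the article
DIE_DENSITY_GBIT = 16        # from the article (16-gigabit dies)
PIN_SPEED_GBPS = 3.2         # from the article (per-pin data rate)
INTERFACE_WIDTH_BITS = 1024  # assumption: HBM2-class interface width per stack


def stack_capacity_gbyte(dies, die_gbit):
    """Capacity of one stack in gigabytes (8 bits per byte)."""
    return dies * die_gbit / 8


def stack_bandwidth_gbyte_s(pin_gbps, width_bits):
    """Peak bandwidth of one stack in gigabytes per second."""
    return pin_gbps * width_bits / 8


capacity = stack_capacity_gbyte(DIES_PER_STACK, DIE_DENSITY_GBIT)
bandwidth = stack_bandwidth_gbyte_s(PIN_SPEED_GBPS, INTERFACE_WIDTH_BITS)
print(f"capacity: {capacity:.0f} GB, bandwidth: {bandwidth:.1f} GB/s")
```

Under that assumption the result is 16GB and roughly 410GB/s per stack, matching the figures Samsung quotes.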

In addition, Samsung recently unveiled a pair of next-generation 2.5D technologies. First, Samsung introduced I-Cube4, a 2.5D solution that accommodates four HBM2E stacks and one logic die in a package. Second, the company introduced HBM-PIM, a device that integrates HBM with an AI processing unit in the same package. HBM-PIM brings processing power directly to where the data is stored by placing a DRAM-optimized engine inside each memory bank, enabling parallel processing while minimizing data movement.

Samsung is bringing machine learning into the package. A subset of AI, machine learning crunches vast amounts of data and identifies patterns in systems. “HBM-PIM is the industry’s first programmable PIM solution tailored for diverse AI-driven workloads such as HPC, training and inference,” said Kwangil Park, senior vice president of memory product planning at Samsung.

Machine learning is pushing 2.5D in other directions. For some time, IC vendors have developed new chip architectures for AI. Many of these chip architectures must accommodate more HBMs and logic dies. In some cases, a large chip architecture with many dies won’t fit on a single interposer in a 2.5D package. It may require two or more interposers to accommodate all dies.

To develop large interposers, chipmakers pattern several interposers on a wafer using a lithography scanner. The scanner can print features in a field size of 26mm x 33mm. That field size denotes what many call the reticle limit.

So an interposer at reticle size is roughly 26mm x 33mm. Some chip architectures require an interposer larger than the reticle size. “A large area interposer can be fabricated by splitting the interposer design into multiple sections where each section is smaller than the maximum field size of the step-and-repeat lithography system,” according to a paper from Ultratech and others. (Ultratech is owned by Veeco.)

Once the wafer is processed, individual interposers are stitched together, forming a larger interposer. For example, a 2.5D package with an interposer that’s 2X the reticle size (<1,600mm²) can accommodate a large logic die and 2 to 4 HBMs. 2.5D packages at 4X and 6X the reticle size or even larger are shipping or in R&D.
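The reticle arithmetic can be sketched as follows. One full field is 26mm x 33mm, or 858mm², so two full fields come to 1,716mm², and a practical 2X interposer such as the sub-1,600mm² example above sits somewhat under that ceiling. The function `sections_needed` and the sample interposer dimensions are hypothetical. It assumes a simple rectangular grid of sections and ignores the overlap that real stitching requires.

```python
import math

FIELD_W_MM = 26.0  # scanner field width, from the article
FIELD_H_MM = 33.0  # scanner field height, from the article


def field_area_mm2():
    """Area of one full lithography field (the reticle limit)."""
    return FIELD_W_MM * FIELD_H_MM


def sections_needed(width_mm, height_mm):
    """Minimum number of field-sized sections covering a rectangular interposer.

    Tries both orientations of the interposer relative to the field and
    ignores the stitching overlap a real process would need.
    """
    upright = math.ceil(width_mm / FIELD_W_MM) * math.ceil(height_mm / FIELD_H_MM)
    rotated = math.ceil(height_mm / FIELD_W_MM) * math.ceil(width_mm / FIELD_H_MM)
    return min(upright, rotated)


for multiple in (1, 2, 4, 6):
    print(f"{multiple}X reticle ceiling: {multiple * field_area_mm2():.0f} mm^2")

# Hypothetical interposer outlines, in millimeters.
print(sections_needed(50, 33))
print(sections_needed(40, 40))
```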

Beyond 2.5D, the next big thing is 3D-ICs, which stack logic on memory, or logic on logic, in an advanced package to create a system-level design. Intel, Samsung, TSMC and others are working on 3D-ICs. For example, Samsung recently rolled out X-Cube. In one application, Samsung stacked an SRAM die on a logic chip.

This solves a major problem. In systems, SRAM is fast, but it takes up too much real estate on the chip. “(Stacking SRAM on logic) frees up the space to pack more memory into a smaller area,” said Seung Wook Yoon, corporate vice president of Samsung.

There are other applications for interposers beyond 2.5D/3D. For example, a system has a board with multiple components, but one die and/or package could be faulty or obsolete. It makes little sense to develop a new board. To solve the problem, QP Technologies has developed a new interposer design solution.

First, you source a new die and/or package. Then, QP Technologies develops an interposer. The top of the interposer matches the footprint of the new device. The bottom matches the footprint of the old device on the motherboard.

This solution can be used for any number of package types. “We create the interposer with matching bump landing pads that have traces extending out to wirebondable pads,” said Rosie Medina, vice president of sales and marketing at QP Technologies. “Next, we take the flip-chip die and bond it to the interposer, and we attach the bonded die on the interposer into the off-the-shelf package. Finally, we wirebond from the interposer to the package. The customer now has a standard package they can test or assemble to their board.”

Fan-out expands
While 2.5D/3D packages provide high I/O counts, the technology is expensive, due in part to the cost of the interposer. This, in turn, fuels the need for advanced packages without the interposer.

That’s where an advanced package type called fan-out fits in. In one example of fan-out, a DRAM die is stacked on top of a logic chip in a package. Fan-out doesn’t incorporate an interposer, making it less expensive than 2.5D.

In the fan-out flow, the chips are processed on a wafer in a fab. The chips are diced and placed in a wafer-like structure, which is filled with an epoxy mold compound (EMC). This is called a reconstituted wafer.

Then, redistribution layers (RDLs) are formed in the package. RDLs are the copper metal connection traces that electrically connect one part of the package to another. RDLs are measured by line and space, which refer to the width of a metal trace and the gap between adjacent traces.

The RDLs replace the expensive interposer in 2.5D, but there are some challenges. “When chips have been overmolded with EMC, the resulting reconstituted wafers typically have significant stress and warp,” said Arthur Southard, a researcher at Brewer Science, in a paper. “(Temporary bonding) materials can be used in this case to help control the wafer warpage.”

In addition, the dies tend to move when they are embedded in the compound, an unwanted effect called die shift that hurts yield.

There are other challenges. “Next-generation RDL applications face new electroplating and integration challenges,” said Manish Ranjan, managing director at Lam Research. “Key development efforts for fine line RDL include grain engineering and managing undercut performance. As companies push new integration schemes for sub-1μm RDL structures, we expect that the plating process will be similar to damascene process.”

Going forward, vendors continue to develop fan-out, which is split into two segments: standard density and high density. Geared for mobile and IoT, standard-density fan-out is defined as a package with fewer than 500 I/Os and greater than 8μm line and space. High-density fan-out has more than 500 I/Os and less than 8μm line and space.
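The two-segment definition can be written as a small rule, sketched below. The thresholds are the ones given above. The function name and the "unclassified" label for packages that meet only one of the two criteria are hypothetical, since the definition does not say how such mixed cases are treated.

```python
def classify_fan_out(io_count, line_space_um):
    """Classify a fan-out package using the thresholds quoted in the article.

    io_count: number of I/Os on the package.
    line_space_um: RDL line and space in micrometers (one value for both).
    """
    if io_count < 500 and line_space_um > 8:
        return "standard-density"
    if io_count > 500 and line_space_um < 8:
        return "high-density"
    # The article's definition does not cover mixed or boundary cases.
    return "unclassified"


print(classify_fan_out(300, 10))   # standard-density
print(classify_fan_out(1200, 2))   # high-density
print(classify_fan_out(1200, 10))  # unclassified
```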

Several vendors are developing high-density fan-out packages for 5G smartphones. A fan-out package combines RF chips and the antenna in the same unit, thereby boosting the signal quality. “The antenna-in-package module is an important part for the development of 5G,” ASE’s Chen said.

Amkor, ASE, TSMC, and others are developing high-density fan-out packages with HBMs, which are used in servers and networking equipment. In some cases, high-density fan-out with HBMs competes against 2.5D. Both 2.5D and fan-out are viable and have their place.

“In general, for large systems with four or more HBMs, most customers are staying with 2.5D,” Amkor’s Kelly said. “For smaller systems and new designs, we are seeing some products being designed into S-SWIFT, mostly with two HBMs or fewer.”

S-SWIFT is the name of Amkor’s high-density fan-out line. “A multi-die module is created with the high-density fan-out, and then that module is attached to a standard flip-chip IC package substrate. The technology features RDLs with 4-6 layers, and a 2μm line and 2μm space with 1.5μm/1.5μm in R&D,” Kelly said.
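To give a feel for what those line/space numbers mean for routing, the sketch below estimates how many parallel traces fit across one millimeter of a single RDL layer. It assumes the usual convention that "line" is the trace width and "space" is the gap between adjacent traces, so the trace pitch is their sum. The function name is hypothetical, and the estimate ignores vias, pads and other keep-out areas.

```python
def traces_per_mm(line_um, space_um):
    """Idealized count of parallel traces per millimeter on one RDL layer."""
    pitch_um = line_um + space_um  # assumption: pitch = trace width + gap
    return 1000.0 / pitch_um


current = traces_per_mm(2.0, 2.0)      # 2um line / 2um space, per Kelly
in_r_and_d = traces_per_mm(1.5, 1.5)   # 1.5um / 1.5um, the R&D figure
print(f"2/2 um: {current:.0f} traces per mm")
print(f"1.5/1.5 um: {in_r_and_d:.0f} traces per mm")
```

By this rough measure, moving from 2μm/2μm to 1.5μm/1.5μm raises the routing density per layer by about a third.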

Meanwhile, ASE also is developing more advanced forms of its fan-out technology, called Fan Out Chip on Substrate (FOCoS). “The multi-die package has 1 ASIC surrounded by 8 chiplets, assembled using a fan-out chip-last version of ASE’s FOCoS. It has three interconnecting RDL layers, plus two UBM layers, one for the C4 bumps and one for the package connections to the outside world, for a total of 6 metal layers. The current design uses 2μm line/space RDLs, with finer line/space in engineering,” said John Hunt, senior director of engineering at ASE. “ASE is also working with customers on other combinations of die, as well as FOCoS using embedded bridge die for the high-density interconnectivity.”

Other fan-out technologies are also in the works. At IMAPS, Nepes described its first M-Series fan-out technology, a package-on-package solution that stacks a memory device on a logic chip.

Nepes’ M-Series fan-out can be manufactured on round wafers or a 600mm x 600mm panel. A panel processes more packages than a round wafer, which reduces the cost. For example, a 300mm wafer can process 2,500 6mm x 6mm packages, but a 600mm x 600mm panel can accommodate 12,000 packages. Fan-out packaging on a large square panel is more difficult, and mass adoption isn’t expected anytime soon.
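A quick area comparison shows where the panel advantage comes from. The sketch below compares the raw area of a 600mm x 600mm panel with that of a 300mm round wafer, and sets that ratio against the ratio of the package counts quoted above. It is a rough sanity check that ignores edge exclusion, saw streets and layout losses.

```python
import math

WAFER_DIAMETER_MM = 300.0  # from the article
PANEL_SIDE_MM = 600.0      # from the article

wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2  # round wafer
panel_area = PANEL_SIDE_MM ** 2                      # square panel

area_ratio = panel_area / wafer_area
quoted_ratio = 12_000 / 2_500  # package counts quoted in the article

print(f"panel/wafer area ratio: {area_ratio:.2f}")
print(f"panel/wafer quoted package ratio: {quoted_ratio:.2f}")
```

The panel offers roughly five times the area of the wafer, which is in line with the nearly fivefold difference in the quoted package counts.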

Fraunhofer Institute for Reliability and Microintegration, meanwhile, described a sensor platform based on fan-out. The platform consists of an SoC. Sensors are stacked on the SoC and integrated into a package.

Chiplets vs. SiP
2.5D/3D and fan-out packages aren’t the only options. In addition, there are various methodologies to create a custom advanced package, namely chiplets and SiPs.

In chiplets, customers can mix-and-match dies and connect them in a package. A chiplet-based design can be incorporated in an existing package type or a new architecture.

The idea behind chiplets is to break up a larger monolithic chip into smaller dies. This supposedly improves yields and lowers the cost. “In many cases, die yields can be optimized at both the chiplet level and the final IC,” said GC Hung, vice president of technology development at UMC. “The chiplet approach to SoC design allows architects the ability to select specific silicon technologies, which best meets the requirements for each key chip function. Performance-driven functions could leverage bleeding-edge finFET technologies. Custom analog could be implemented on legacy technologies, while mainstream technologies could be used for the remainder of the design.”
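The yield argument can be illustrated with a simple Poisson defect model, in which the fraction of good dies falls exponentially with die area. This is a textbook approximation, not a model cited in the article. The defect density and die areas below are hypothetical, and the comparison assumes faulty chiplets are screened out before assembly and ignores packaging cost and assembly yield.

```python
import math


def poisson_yield(area_mm2, d0_per_mm2):
    """Fraction of good dies under a Poisson defect model: exp(-A * D0)."""
    return math.exp(-area_mm2 * d0_per_mm2)


def silicon_per_good_system(die_area_mm2, n_dies, d0_per_mm2):
    """Silicon area fabricated per good system, in mm^2.

    Assumes each die is tested and bad dies are discarded before assembly.
    """
    return n_dies * die_area_mm2 / poisson_yield(die_area_mm2, d0_per_mm2)


D0 = 0.001  # hypothetical defect density: 0.1 defects per cm^2

mono_yield = poisson_yield(600, D0)     # one hypothetical 600 mm^2 die
chiplet_yield = poisson_yield(150, D0)  # each of four 150 mm^2 chiplets

print(f"monolithic die yield: {mono_yield:.3f}")
print(f"chiplet die yield: {chiplet_yield:.3f}")
print(f"silicon per good system, monolithic: {silicon_per_good_system(600, 1, D0):.0f} mm^2")
print(f"silicon per good system, chiplets: {silicon_per_good_system(150, 4, D0):.0f} mm^2")
```

Under these assumptions, the four smaller dies consume noticeably less silicon per working system than the single large die, which is the intuition behind the chiplet approach.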

Indeed, chiplets give customers several options. “Increased momentum for chiplets is driving increased R&D for next-generation heterogeneous integration solutions,” Lam’s Ranjan said. “Several packaging approaches such as hybrid bonding, silicon interposer, or fan-out may be chosen, depending on the price and performance requirements. Moving forward, we expect advanced packaging solutions to play an increasingly important role in enabling future semiconductor innovations.”

Chiplets aren’t required for all chip designs. For many applications, the existing packages are adequate. And not all IC vendors have the pieces in-house to develop a chiplet-like design.

Still, a few companies have developed chiplet-like designs. Newer versions are in R&D. But developing these products is challenging. For example, if one die is faulty in the package, the product may fail.

This, in turn, requires a sound process control strategy. “The move toward chiplet architectures is generating a number of inspection and metrology challenges in advanced packaging,” said Chet Lenox, senior director for industry and customer collaboration at KLA. “First, incoming die quality requirements are getting more stringent as more individual die are being integrated. This is increasing the need for highly sensitive die-level inspection, metrology, and sorting before the package is even assembled. Second, the cleanliness requirements of equipment used for the chiplet packaging process are becoming stricter and beginning to approach what we are used to in front-end semiconductor manufacturing.”
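The reason incoming die quality matters more as die counts rise can be shown with a simple compounding calculation. The sketch below assumes the package works only if every die is good and that die failures are independent. The function name, die counts and quality levels are hypothetical.

```python
def package_yield(die_good_probs, assembly_yield=1.0):
    """Probability that a multi-die package is good.

    die_good_probs: probability that each incoming die is good.
    assembly_yield: probability that assembly itself succeeds (hypothetical).
    Assumes independent failures and that one bad die fails the package.
    """
    result = assembly_yield
    for p in die_good_probs:
        result *= p
    return result


# Hypothetical package with nine dies at two incoming quality levels.
at_99 = package_yield([0.99] * 9)
at_999 = package_yield([0.999] * 9)
print(f"nine dies at 99.0% good: {at_99:.3f}")
print(f"nine dies at 99.9% good: {at_999:.3f}")
```

Even a 1% escape rate per die compounds into a loss of several percent at the package level once nine dies share a package, which is why die-level inspection and sorting are tightened before assembly.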

Besides chiplets, SiP is also a viable solution. A system-in-package integrates several components into a single package, enabling it to function as an electronic system or subsystem.

Any number of components could be incorporated into a SiP, such as antennas, dies, MEMS and passives. Selecting from these options, a customer can develop a custom SiP to match a given requirement.

SiPs can be used in any number of products, such as automotive systems, smartphones and watches. In smartphones, SiPs can be used to house power management ICs, as well as RF front-end and WiFi modules.

Over the years, Apple has integrated a SiP within its smartwatch products. The latest Apple Watch Series 6 incorporates the processor and other functions in a so-called S6 System in Package (SiP).

The S6 SiP incorporates a dual-core processor based on Apple’s A13 Bionic chip. Built on Arm’s 64-bit processor technology, the S6 processor is up to 20% faster than the chip in the previous watch.

Others are also developing new forms of SiPs. For example, i3 Microsystems described more details about its Heterogeneous System-in-Package (HSIP) Module technology.

HSIP embeds a die within a substrate with routing layers. “We often refer to it as an embedded interposer, since an HSIP has two-sided interconnects with feed-throughs that go through the core,” said Justin Borski, director of business development at i3. “One unique capability of our device architecture is that the core thickness is highly customizable. We can produce device designs with embedded cores from 150 microns to 1.2 millimeters in thickness, and still deliver signals through that core with our through via technology.”

At IMAPS, i3 described a technology that stacks two HSIPs on top of each other and connects them with TSVs. “We are currently in early production on both two-high and a variety of single-stack devices for select DoD and Defense Industrial Base (DIB) customers,” Borski said. “The two-high stacked HSIP systems have been in initial production for about a year for a major program.”

Clearly, packaging is a vibrant market with many new and different options — perhaps too many options.

Understanding each option is challenging. Finding the right one is even more difficult.
