
Scaling, Advanced Packaging, Or Both

Number of options is growing, but so is the list of tradeoffs.


Chipmakers are facing a growing number of challenges and tradeoffs at the leading edge, where the cost of process shrinks is already exorbitant and rising. While it’s theoretically possible to scale digital logic to 10 angstroms (1nm) and below, it appears increasingly unlikely that a planar SoC will be developed at that node.

This is hardly shocking in an industry that has heard predictions about the death of Moore’s Law for the past couple of decades. What is surprising, though, is the dizzying and growing number of market-proven alternatives. Included in that list are various types of advanced packages, some of which are already in use, as well as a slew of new materials, novel interconnect schemes, and different ways to increase density at existing process nodes. So even though nearly all design or manufacturing barriers can be overcome with enough time, effort, and investment, in most cases there are multiple ways to achieve the same goals with improved performance, lower power, and in some cases, at a much lower cost.

“The trend that we have seen lately is that fewer and fewer companies are able to monetize the value of the most advanced-scale technologies,” said David Fried, vice president of computational products at Lam Research. “There are fewer customers at 5nm than there were at 7nm, and there were fewer at 7nm than at 10nm, because a smaller number of companies can extract value from the large capital investments needed to develop these new products. You are going to see that trend continue. If you cannot capitalize financially on the value of scaling, be it power, performance, area, or yield, then you shouldn’t scale. This decision has to be made at the product level. Certain products are going to be analyzed by their owners looking at fixed costs and recurring costs, and the owners will decide that the business side works better if you stay at 7nm and don’t jump to 5nm. You will see plenty of companies make that decision.”
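The product-level decision Fried describes can be framed as a simple break-even calculation. The sketch below is illustrative only: the function names and every dollar figure are hypothetical placeholders, not data from Lam Research or any foundry. It compares fixed plus recurring cost at two nodes and finds the unit volume above which the more advanced node becomes the cheaper choice.

```python
import math

def total_cost(fixed_cost, unit_cost, volume):
    """Total product cost: one-time fixed cost plus recurring cost per unit."""
    return fixed_cost + unit_cost * volume

def break_even_volume(fixed_old, unit_old, fixed_new, unit_new):
    """Smallest unit volume at which the new node is no more expensive.

    Returns None if the new node never catches up (its unit cost is not lower).
    """
    if unit_new >= unit_old:
        return None
    extra_fixed = fixed_new - fixed_old
    saving_per_unit = unit_old - unit_new
    return max(0, math.ceil(extra_fixed / saving_per_unit))

# Hypothetical numbers, for illustration only.
FIXED_7NM, UNIT_7NM = 300e6, 40.0  # fixed cost in dollars, cost per good die
FIXED_5NM, UNIT_5NM = 500e6, 30.0

volume = break_even_volume(FIXED_7NM, UNIT_7NM, FIXED_5NM, UNIT_5NM)
print(f"5nm pays off above {volume:,} units")
```

This only models cost. A fuller analysis would also put a price on the power, performance, and area gains Fried mentions, which can justify a move even when the cost comparison alone does not.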

Fig. 1: Moore’s Law and its real-world applications. Source: Max Roser, Hannah Ritchie, CC BY 4.0 via Wikimedia Commons/Wikipedia

While some devices and markets will support the continued economics of scaling, it’s not clear how much of that will be done in a single SoC versus an advanced package.

“Companies are being very selective about what they want to manufacture using the most advanced technology,” said Fried. “They’re manufacturing the most density-centric portion of the product using the most advanced technology, which is entirely a functional integration play. Even if they’re not getting straight-line data flow performance by going to those advanced nodes, they are getting more data flows and data paths in the same footprint. And clearly, they’ve done the calculation to show that it is an advantage that they can monetize at the product level.”

However, every custom configuration comes with its own unique tradeoffs. With planar scaling, these tradeoffs are limited because they are defined by foundry process rules. Going forward, tradeoffs need to be considered in the context of how a chip will be packaged and used. So a device might include different chips or chiplets developed at different process nodes, and those can vary greatly depending upon end applications and use cases, and by the types of data being processed. In the case of AI/ML, it could vary by the required level of accuracy or precision.

To make matters worse, a device also needs to be understood in terms of variability and in the context of other components in a package or system. Noise can impact signal integrity in adjacent chips. Mechanical stress can cause warpage and affect various types of interconnects. And nano-sized particles left over from cleaning, polishing, debonding, and etch can disrupt a system’s functionality. So, too, can the availability of components, gaps in EDA tooling, and a shortage of talent.

Choices become more confusing as the number of options grows, and as chipmakers target demands from customers in different end markets. In automotive, for example, there are multiple possible architectures for processing safety-critical data, and different carmakers often are taking unique approaches to optimize various features. Likewise, cloud data centers have developed and continue to refine chip architectures that are designed for their specific needs and data types. And in other markets, software features increasingly are being matched to hardware developed specifically for those features, whether those features are integrated into a single chip, multiple chips that are stitched together because they exceed the reticle limit, or multiple different chips or chiplets in a package.

“Certain technologies are good for certain solutions or certain problems, but they’re not going to be good for everything,” said Eric Beyne, senior fellow at imec. “So for fan-in and fan-out and laminate system-in-package, there is indeed a whole set of technologies that will be useful. But it depends on what you want to solve. If you think about the RF modules in a phone, these are effectively collections of 50 components in one package. But these are components with relatively few connections to make. You cannot do the same kind of interconnect density for AI memory-logic partitioning.”

Fig. 2: 3D interconnect landscape. Source: imec

In this context, scaling is just one of many factors in a leading-edge design, and even digital logic within the same package may be developed at different nodes, depending upon how critical various types of data are to the end user. For example, AI processing (or machine learning or deep learning), which increasingly is being included in devices, utilizes a very different architecture than traditional processing elements in a CPU or MCU. The accuracy and timeliness of results in an AI chip are contingent on the speed of data movement back and forth between localized memories, the performance of different processing elements, and the volume of data (more data of good quality is better), as well as whether those chips are being used in the data center or in an edge device. And it may require further refinement to enable parallel or asynchronous processing, or both. But while that works well for AI chips, it’s definitely not an energy-efficient approach for other types of data or functions within a device.

Many ways forward
Once considered the benchmark for progress in semiconductors, Moore’s Law itself is splintering. Technology scaling can continue, but the economics of planar scaling are becoming difficult to justify. Getting sufficient yield at 3nm will be a challenge, and just being able to deliver power in an increasingly dense sea of transistors likely will require backside power delivery, which in turn will change how wafers and chips are handled in a fab.

Still, there is no single technology standing in the way of continued scaling. “The low-k dielectric layer, which is brittle, has been an issue at newer nodes,” said Choon Lee, CTO of JCET. “But there are no major process issues even down to 5nm. And while wafer-sawing can be a critical process, nowadays the laser grooving process and parameters are well-defined.”

The real limiter is cost, and this has prompted chipmakers to look for alternatives such as mixing multiple chiplets in an advanced package, and to eke more out of every node. That has opened the door to technologies that were discussed in the past, but which never saw widespread adoption when scaling was considered the best path forward.

The ability to print curvilinear shapes on a mask using multi-beam e-beam lithography is one such technique. Rather than printing a misshapen polygon or a square hole, mask makers can print device shapes much more accurately. That, in turn, allows for greater density at existing nodes.

“With EUV lithography, things got much easier,” said Aki Fujimura, CEO of D2S. “Shapes that you are asked to print come out much easier with EUV than with 193i. All of the leading-edge shops are in R&D stages of ‘2nm node’ development. And ASML’s roadmap has the next generation EUV technology, called ‘High NA,’ using numerical aperture of 0.55 instead of today’s 0.33 to improve resolution. But even with EUV, beyond 2nm is going to be a challenge. There just aren’t enough photons, and there are stochastic effects. At these dimensions, it’s really starting to matter.”
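The resolution gain from High NA follows from the Rayleigh criterion, CD = k1 × λ / NA, a standard lithography relation that the article does not spell out. The sketch below applies it using the 13.5nm EUV wavelength and the two numerical apertures Fujimura cites. The k1 value is an assumed process factor chosen only for illustration, and the function name is hypothetical.

```python
EUV_WAVELENGTH_NM = 13.5  # EUV exposure wavelength in nanometers

def min_feature_nm(k1, wavelength_nm, numerical_aperture):
    """Rayleigh criterion: smallest printable feature size, in nanometers."""
    return k1 * wavelength_nm / numerical_aperture

K1_ASSUMED = 0.4  # assumption: illustrative process factor, not a vendor figure

current = min_feature_nm(K1_ASSUMED, EUV_WAVELENGTH_NM, 0.33)
high_na = min_feature_nm(K1_ASSUMED, EUV_WAVELENGTH_NM, 0.55)

print(f"NA 0.33: {current:.1f} nm, NA 0.55: {high_na:.1f} nm")
print(f"High NA limit is {high_na / current:.0%} of the NA 0.33 limit")
```

For any fixed k1, the ratio between the two limits is simply 0.33 / 0.55, so the choice of k1 does not affect the relative improvement. The photon shortage and stochastic effects Fujimura describes are not captured by this relation.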

In effect, this is a way of shrinking the “white space” between various components, such as transistors and memories, because shapes can be printed more accurately and closer together.

“Even if we have a pure ‘Manhattan’ design — so the layout designer draws these two rectangles tip-to-tip, whatever the minimum design rule is, even with elaborate OPC on the wafer to control the lithographic line, and pull-back and rounding from the wafer lithography process — there’s still going to be corner rounding on the actual mask,” said John Sturtevant, senior director of product development at Siemens EDA. “What’s new is that with these multi-beam mask writers, we can get more aggressive on the OPC corrections. And we can take advantage of the fact that if we know we’re going to have a curved linear mask, we can get really radical and utilize that curvature in a way that mask writers would have penalized because there were not enough cost-benefit tradeoffs.”

On top of that, scaling is starting to go vertical, so rather than measuring chips in square millimeters, they increasingly will be measured in cubic millimeters. That adds a whole new set of complexities throughout the supply chain, from design tools to mechanical stresses and various bonding techniques. It also makes it more challenging to inspect and measure everything from material deposition and etch to new materials, and to account for movement that was never considered an issue in the past.

“We have a very active program in quasi-zero die shift,” said Kim Arnold, chief development officer at Brewer Science. “You want to be able to place the die and have them move less than a micron after mold. For chip first, that’s a fundamental difference in what they’re getting from die-attach film. So if you place some die-attach film, they could move a lot. We have shown results down to less than a micron for movement post-mold. You put our material down, you place the chip, you build your RDL structures, and then you do your molding. Pre-mold you don’t see much difference, but post-mold you do. That’s epoxy mold compound coming in over the top, adding stress and moving things. But is the industry ready for an alternative to epoxy mold compound? The answer we’ve heard so far is, ‘No.’ They’re not happy with what happens with EMC, but it’s not painful enough yet to talk about a replacement.”

As with much of the chip industry’s history, it’s always less problematic to extend what is well understood and proven than to move to something that is untried. This has happened with lithography, transistor structures, materials, various manufacturing processes, as well as EDA tools. That, in turn, affects how quickly new approaches are added and adopted. Industry insiders still refer back to past shifts, such as the changeover from aluminum to copper interconnects at the 130nm node, or from planar transistors to finFETs at 16/14nm. These kinds of moves are especially difficult as reliability concerns increase, and they are even more time-consuming and costly.

“Chip last, RDL first, will only come around when chip first runs out of steam,” Arnold said. “So if things like quasi-zero die shift prove successful in process flows, it will delay chip last because those processes are already known. So if they can hit their targeted dimensions in RDL, then chip first will go as far as it can go. After that, you’ll see chip-last kick in. Chip last is just for those applications where you need tight RDL and high density, and where they cannot tolerate any shift.”

Scaling vertically also creates thermal challenges that need to be addressed. This is true even with finFETs and gate-all-around FETs (nanosheets, nanowires, etc.) on a planar die, where the dynamic power density can become so problematic that only some of the transistors can be used at any one time. But the problems are more challenging as chips are stacked on top of one another.

“There are a lot of hidden effects, so even though you have a ‘proven die,’ you’ve never tested it in this package,” said Warren Wartell, senior director of global test services at Amkor. “You may have localized heating, different pressure gradients on that package, which is causing it to shift in different ways than you expected. So you need to have ‘qualified for heterogeneous integration’ die, and those become your standard building blocks to make some of these system-on-chip or system-in-package types of devices. You need to be testing in context, and testing enough so that you’re really exploring process corners. It’s not about, ‘We’ve got a good lot and everything is great.’ It’s when you run into problems and you’re questioning why it’s failing. Maybe it’s because you never really explored your process corner well enough to know that you have some sensitivities there. Those may be harder to simulate, and it takes more work before going into high-volume production.”

The road to chiplets
There are many types of packaging available. In the past, a package did little more than protect the electronic circuitry from damage. But package technology itself is becoming more customized. In a recent blog, Evelyn Lu, director of marketing and communications at ASE, pointed to a variety of applications for system-in-package that even several years ago would have been done with one or more chips on a PCB. But demand for a smaller footprint in applications such as hearables (hearing aids, Bluetooth earbuds, smart watches, and smart glasses) requires the integration of multiple chips in a very small package that uses very little power. “For example, more than 30 components can be integrated onto a single chip with a size of 4mm x 8mm, or 4.55mm x 9mm, vastly reducing the product size and its overall weight by 1 gram or more,” she wrote.

Fig. 3: Hearing aid SiP and module. Source: ASE

This can be further accelerated with chiplets that can be characterized and connected using industry standards, which currently are under development. The goal is to add flexibility into designs, reduce time to market, and significantly cut the NRE needed to develop electronic systems.

“Throughout the first 20 years of my career, we were largely doing monolithic SoC integration,” said Kevin Zhang, senior vice president of business development at TSMC. “You’d bring all the functionality together in a single die — CPU, GPU, memory controller. But now people realize that has reached the limit. So you break it into pieces, which we call chiplets. Sometimes you can choose different technology options optimized for specific functions. That’s only the beginning. This all started with HPC, because that’s where you get the biggest gain for the buck at the moment. But in the future we will need the volume, and volume usually comes from consumer electronics, whether it’s a phone or a PC. That’s the tip of the iceberg, and in the future we hope more and more products — especially the mainstream consumer products — can benefit from this new chiplet integration scheme, whether that’s cost, power, or form factor, as those product applications move to this kind of scheme. We’ll bring the volume up, but we’re not there yet.”
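One commonly cited reason for breaking a large die into chiplets is yield, although Zhang does not single it out here. The sketch below uses the textbook Poisson yield model, Y = exp(-A × D0), to compare how much silicon must be fabricated per working system for one large die versus four smaller ones. The defect density, the total area, and the function names are hypothetical, and the model assumes dies are tested before assembly while ignoring assembly yield loss and die-to-die interface overhead.

```python
import math

def poisson_yield(area_mm2, defects_per_mm2):
    """Poisson yield model: fraction of dies with zero killer defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

def silicon_per_good_system(total_area_mm2, num_dies, defects_per_mm2):
    """Average silicon area fabricated per working system.

    Assumes only known-good dies are packaged, so each die position costs
    die_area / yield on average. Assembly losses are not modeled.
    """
    die_area = total_area_mm2 / num_dies
    return total_area_mm2 / poisson_yield(die_area, defects_per_mm2)

D0 = 0.001        # hypothetical: 0.1 defects per square centimeter
LOGIC_AREA = 800  # hypothetical total logic area in mm^2

monolithic = silicon_per_good_system(LOGIC_AREA, 1, D0)
chiplets = silicon_per_good_system(LOGIC_AREA, 4, D0)
print(f"monolithic: {monolithic:.0f} mm^2, four chiplets: {chiplets:.0f} mm^2")
```

Under these assumptions the chiplet version wastes far less silicon, but the packaging, test, and interconnect costs left out of the model are exactly the tradeoffs the rest of this article describes.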

One of the key elements for boosting chiplet volume will be a predictable way of interconnecting these hard IP blocks. There are multiple industry efforts underway to make this happen, one from the Open Compute Project’s ODSA and another from the Universal Chiplet Interconnect Express group. Government agencies around the globe are developing their own schemes, as well.

The challenge going forward will not be a shortage of options for custom and semi-custom designs, or Moore’s Law running out of steam. The bigger hurdle will be figuring out which of many possible options will work best, or at least well enough, for a particular application and end market.

If previous history is a guide, ultimately the chip industry will narrow down the number of possibilities in an effort to achieve economies of scale and reduce time to market. This is the essence of Makimoto’s Wave, which has held true for much of the chip industry’s history. But there are many more variables to digest, and many more on the horizon, as well as a bunch of developing markets that either never existed before, or which never relied so heavily on advanced semiconductor technology. As a result, it may take significantly longer this time for chip design and manufacturing to swing back toward commoditization.
