3D-ICs May Be The Least-Cost Option

Advanced packaging has evolved from expensive custom solutions to those ready for more widespread adoption.


When 2.5D and 3D packaging were first conceived, the general consensus was that only the largest semiconductor houses would be able to afford them, but development costs are quickly coming under control. In some cases, these advanced packages actually may turn out to be the lowest-cost options.

With stacked die [1], each die is considered to be a complete functional block or sub-system. In the future, this will include chiplets. The best-known example today is high-bandwidth memory (HBM), but many other examples exist where a systems company has disaggregated a system into multiple dies. [For the purposes of this report, stacked dies may be logic on memory or logic on logic, but designs constructed using vertically stacked transistors, or designs where a single function is folded across multiple dies, are excluded.]

The early adoption of 3D-ICs was largely based on necessity. “Single monolithic dies have reached their reticle limit,” says Vidya Neerkundar, product manager for Tessent at Siemens Digital Industries Software. “If someone can fit all their functionality within a single die, they will still try to get there. But the development of technology, and all the functionality that you need to fit, means this is becoming increasingly difficult. The natural progression is 2.5D and 3D. The industry has been using multi-chip modules (MCM) for the past decade, where they’re going from one chip to another chip using substrate connectivity. That’s good if you have a small number of signals that need to transfer from die to die. But when you need to communicate with a bunch of other dies, an interposer layer is needed instead of just a substrate. They can break up the die into smaller pieces and they can communicate via the interposer. The interposer can be organic, it can be silicon, it just depends on the application that chip is designed for.”

Source: Siemens Digital Industries Software.

This is proving to be beneficial to a number of companies. “AMD is a good example,” says Marc Swinnen, director of product marketing at Ansys. “One of the principal benefits they get from the chipletization of their design is flexibility. You used to design a 16-core chip and then a 64-core chip. They were separate designs from separate teams. Now they just design the CPU core as a chiplet, and you put 16 or 32 of them on an interposer. It’s just the interposer that changes, but the chiplets stay the same. It gives them a lot of flexibility to come up with derivative products that all use the same basic building blocks, and so that flexibility is what they really value.”
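The derivative-product economics Swinnen describes can be sketched with a toy model: one chiplet design is amortized across several interposer-only variants, versus a full ground-up design per SKU. All dollar figures below are hypothetical, chosen only to illustrate the shape of the tradeoff.

```python
# Toy NRE comparison: a separate monolithic design per SKU vs. one reusable
# CPU chiplet plus a cheaper interposer redesign per SKU.
# All dollar figures are hypothetical, for illustration only.

CHIPLET_NRE = 50_000_000      # one-time cost to design the CPU chiplet (assumed)
INTERPOSER_NRE = 5_000_000    # cost to redesign only the interposer per SKU (assumed)
MONOLITHIC_NRE = 40_000_000   # cost of a full monolithic design per SKU (assumed)

def chiplet_total(num_skus: int) -> int:
    """One chiplet design shared by every SKU; only the interposer changes."""
    return CHIPLET_NRE + num_skus * INTERPOSER_NRE

def monolithic_total(num_skus: int) -> int:
    """Each SKU (16-core, 32-core, ...) is a separate ground-up design."""
    return num_skus * MONOLITHIC_NRE

for n in (1, 2, 4):
    print(n, chiplet_total(n), monolithic_total(n))
```

The crossover is the point: with one product the chiplet route costs more, but each additional derivative adds only an interposer's worth of NRE, which is where the flexibility Swinnen values turns into savings.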

That also saves costs over time. “Suppose you’ve got a module that doesn’t need to be retargeted to 5nm, or memory that cannot benefit from it,” says Isadore Katz, senior director for marketing and business development for Siemens EDA. “It was doing everything needed at 7nm, so leave it there. In many cases you can leave the interposer as it was, just reconnect the new stuff. This will reduce the cost of bringing in a better node or process. It should also buy you some level of immunity as you iterate on a family of parts at a particular process node.”

Some of those cost savings are less obvious. “You also have to measure cost in terms of how much faster you’re coming to market,” says Siemens’ Neerkundar. “If you’re late in the game, you’ve lost the advantage of getting your product into the market ahead of anybody else.”

It can be difficult to properly assess total cost. “There’s a cost advantage even though there’s more work to be done,” says Kenneth Larsen, product management director for the EDA Group at Synopsys. “The industry has been working on getting as much stuff into a single die as possible with on-chip communication, making the distances very short. But when disaggregating, we actually go the other way. We are going off-chip for communication. You have to ensure that you’re not losing the benefits you had in the past in terms of performance.”

Additional costs
The first time a company attempts a 3D design, there will be some additional costs and organizational challenges. “Whereas you had a packaging team somewhere in Bangalore, and a design team in Haifa, and a top-level architectural team in Austin, now these guys have to be pulled together for 3D-IC assembly,” says Ansys’ Swinnen. “You need to have the expertise integrated into the same team. Thermal is a good example. You had a thermal guy in the packaging group somewhere. Now you need the thermal person in every design group, or they have to be multiplexed across those. There is an organizational realignment, which will probably increase costs.”

New tools and skills are also required. “You need a mechanical tool to analyze the shear stresses in your 3D stack,” says Swinnen. “You need a thermal tool. You need a 3D electromagnetic tool to analyze the long traces on the interposer. These compound the cost, even for the digital guys. There’s always been EM in chip design, but it was for the RF guys, not regular chip designers. There was always thermal, but it was the guy in the packaging team who checked to make sure things were okay. Now all these tools are part of the mainstream flow.”

The interposer requires both chip and PCB type skills. “Somebody has to design the interposer,” he says. “It looks like a giant chip, but it’s very high-speed, includes long distances for a chip, at least several millimeters. It becomes an EM problem. Even though you’re not an RF designer, the high-speed signals have to be analyzed like a high-speed circuit with full electromagnetic coupling. Some people try to approach it from the PCB side of things, but then they don’t have the capacity to handle the tens of thousands of wires that are on the interposer.”
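A common signal-integrity rule of thumb makes Swinnen's point concrete: a trace must be treated as a transmission line, with full electromagnetic analysis, once its propagation delay exceeds roughly one sixth of the signal's rise time. The sketch below applies that rule with an assumed SiO2-like effective permittivity; the specific numbers are illustrative, not from the article.

```python
# Rule-of-thumb check (hypothetical numbers): a trace needs full
# transmission-line / EM treatment when its one-way propagation delay
# exceeds roughly 1/6 of the signal rise time.
import math

C = 3.0e8        # speed of light in vacuum, m/s
ER_EFF = 3.9     # assumed effective permittivity (SiO2-like dielectric)

def prop_delay_s(length_m: float) -> float:
    """One-way propagation delay of a trace of the given length."""
    return length_m * math.sqrt(ER_EFF) / C

def needs_em_analysis(length_m: float, rise_time_s: float) -> bool:
    return prop_delay_s(length_m) > rise_time_s / 6.0

# A 5 mm interposer trace with a 100 ps edge crosses the threshold,
# while a typical sub-millimeter on-die route does not:
print(needs_em_analysis(5e-3, 100e-12))
print(needs_em_analysis(0.5e-3, 100e-12))
```

That is why "several millimeters" on an interposer pushes ordinary digital designers into territory that used to belong to RF teams.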

In addition, there are new conditions you must be aware of. “If you are thinking longer-term, where you might re-use chiplets, you have to consider different boundary conditions,” says Synopsys’ Larsen. “Maybe you put something on top of the die, or maybe you put something very close to the die itself on a stiff interposer. Maybe it’s silicon and that will cause stress — not only thermally induced stress, but mechanical stress as you go through manufacturing and when you use the product. There’s a bunch of new areas and interesting problems. For thermal, you don’t have to be a rocket scientist to think that if you have something that’s hot, and you stick it next to something else that’s hot, that needs to be managed to make sure it’s performing over the useful life of the product.”
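Larsen's point about stacking two hot things can be seen in a minimal one-dimensional thermal model: the die farthest from the heatsink sees both dies' power crossing the shared path to ambient, plus its own power crossing the bond layer. All resistances and powers below are assumed values for illustration.

```python
# Minimal 1D thermal sketch (hypothetical resistances): stacking a second
# hot die means its power also flows through the shared path to ambient,
# raising junction temperatures well beyond the standalone case.

T_AMBIENT = 25.0        # deg C
R_TO_AMBIENT = 0.5      # deg C/W, shared heatsink path (assumed)
R_BETWEEN_DIES = 0.3    # deg C/W, bond layer between stacked dies (assumed)

def junction_temp_alone(p_die: float) -> float:
    """Junction temperature of a single die on the heatsink path."""
    return T_AMBIENT + p_die * R_TO_AMBIENT

def junction_temp_stacked_top(p_bottom: float, p_top: float) -> float:
    """Top die, farthest from the heatsink: both dies' power crosses the
    heatsink path, and the top die's own power also crosses the bond layer."""
    return T_AMBIENT + (p_bottom + p_top) * R_TO_AMBIENT + p_top * R_BETWEEN_DIES

print(junction_temp_alone(50))            # a 50 W die on its own
print(junction_temp_stacked_top(50, 50))  # the same die stacked on a second 50 W die
```

Even in this crude model the stacked junction runs tens of degrees hotter, which is why thermal analysis moves from the packaging team into every design group.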

But it is not all bad news. “Test cost is going to go lower because of divide and conquer,” says Siemens’ Neerkundar. “Instead of a single big monolithic die, now you’re breaking that into smaller pieces into your chiplets or dies. You can do things in parallel, which you couldn’t do before. The parallelism is going to improve your time to market, improve time to quality of results, and so you can concentrate on how to improve this more efficiently by working things in parallel. Over time this will get even better as standards come into play. Standards like IEEE 1838 describe how to communicate between the stacks, and they also include a flexible parallel port through which you can communicate with them. It is an extension of what you were doing hierarchically within the 2D.”
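The parallelism Neerkundar describes is simple arithmetic: testing dies serially costs the sum of their test times, while testing them concurrently costs only the time of the slowest die. The per-die times below are hypothetical.

```python
# Illustrative only: per-die test times (seconds) for four chiplets that
# together replace one monolithic die. Testing concurrently takes the
# time of the slowest die rather than the sum of all of them.

chiplet_test_times = [12.0, 9.0, 15.0, 10.0]   # hypothetical values

serial = sum(chiplet_test_times)     # chiplets tested one after another
parallel = max(chiplet_test_times)   # chiplets tested on parallel sites

print(serial, parallel)
```

The same divide-and-conquer logic applies to pattern generation and debug: each chiplet's test program is developed and improved independently.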

Thinking differently
Some aspects of adoption require a change in the thought process or methodology. Commercial chiplets will be sold as black boxes, but if they are purchased from different vendors or produced in different foundries, the characterization of those chiplets may vary. In addition, commercial chiplets are expected to work across a variety of applications and use cases, and it will take time before all of the relevant data is collected and analyzed.

Architectures, in particular, will have to account for a base level of uncertainty. “This is especially true if the industry gets to super-NoCs, hierarchically designed across chiplets, which would mean top-down design,” says Frank Schirrmeister, vice president of solutions and business development at Arteris. “This step is akin to co-designing various chiplets that need to work together, primarily adopted for highly complex chiplet-based structures, less in an ecosystem of a third-party chiplet market. From a top-down perspective, participants in a chiplet ecosystem need to plan out far into the future, not unlike IP players today, to understand the requirements dictated by the end applications the chiplets are designed for.”

When you don’t fully know the boundary conditions, you may have to over-design to make it reusable. “For the big companies that are building chiplets and re-using them within their own company, not selling chiplet IPs externally, this may be more manageable,” says Larsen. “But if you imagine in the future that you can buy bare dies, with ultra-fine pitches as chiplets that you can integrate into your system, things will have to be designed differently, just like IP designs.”

This also has a direct bearing on verification. “With smaller chiplets, verification becomes much more cost-effective,” says Arteris’ Schirrmeister. “But you do have more variation of it when combining the chiplets together. Full chip emulation becomes a different animal, and you have to do your hierarchical splits correctly to support sub-system verification and hierarchical development. If all the analysis is correct, the cost should be more efficient in relationship to the value you get for chiplets, because you have these smaller entities where the verification effort and the implementation effort stays more constrained.”

Interfaces become an important discussion, as well. “Do you have some sort of protocol, such as Bunch of Wires (BoW), that is to be used for vertical communication?” asks Larsen. “What you choose is dependent on the design and involves issues such as how to deal with timing. Some companies would rather not have too many PHYs and protocol overheads, because if you’re using hybrid bonding, it is really a buffer. If you have tens of thousands, or hundreds of thousands of connections, I’m not sure you want the overhead of a protocol for those signals. Maybe you will have some protocols for timing.”

Others are looking at much more significant protocols. “Some companies are looking towards the UCIe standard,” says Neerkundar. “There are other standards that are in the working group stages that are trying to decide what to do for test or for repair. The industry is trying to reduce the burden at the system level by having standards, so that each designer or group can adhere to the standards, and then the assembly at the system level becomes much more convenient.”

These connecting components must be verified and likely acquired from an IP company. “The connection needs to be confirmed, and in a truly open ecosystem, the NoC protocol on both sides needs to reflect the same capabilities,” says Schirrmeister. “A user may say, ‘I need Read Data Chunking.’ How does my controller support it? Do both chiplets have that capability in the AXI implementation? Eventually, the industry will probably see UCIe plug fests, as PCIe did. They will just be much more involved, as there are no plugs per se. In a proprietary environment, when a design team owns both sides of the connection, they can negotiate and align the support tailored to their design.”

It probably costs more today. “The benefit is on the manufacturing side, the yield side, but you take a hit on the design side because it is more complicated,” says Swinnen. “If you look at who is using this in the market today, who is doing sophisticated 3D-ICs, it’s only the really big guys — IBM, AMD, and NVIDIA. A lot of the mainstream is still just dipping their toes in. They are doing some basic 2.5D, but it is slow because they have to ramp up their tools, and their expertise in the organization.”
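The yield-side benefit Swinnen refers to can be sketched with a standard Poisson defect model: yield falls exponentially with die area, and a defect kills a whole monolithic die but only the one small chiplet it lands on, which can be discarded individually. The defect density and die area below are assumed for illustration.

```python
# Why disaggregation wins on yield: under a Poisson defect model, yield
# falls exponentially with area. A defect scraps a whole 6 cm^2 monolithic
# die, but only one 1.5 cm^2 chiplet. Defect density and areas are assumed.
import math

D0 = 0.2          # defects per cm^2 (assumed)
MONO_AREA = 6.0   # cm^2, near-reticle-limit monolithic die (assumed)

def poisson_yield(area_cm2: float, d0: float = D0) -> float:
    """Fraction of dies with zero defects under a Poisson model."""
    return math.exp(-area_cm2 * d0)

# Expected silicon area consumed per good monolithic die:
silicon_per_good_mono = MONO_AREA / poisson_yield(MONO_AREA)

# Split into four chiplets; bad chiplets are discarded one at a time,
# so each defect wastes only a quarter of the silicon:
chip_area = MONO_AREA / 4
silicon_per_good_system = 4 * (chip_area / poisson_yield(chip_area))

print(f"{silicon_per_good_mono:.1f} cm^2 vs {silicon_per_good_system:.1f} cm^2")
```

A real comparison must add back interposer cost, assembly yield, and known-good-die test cost, which is why the net answer is design-specific — and why, as Swinnen notes, it can still cost more today.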

Automation will help. “If we want to fully take advantage of 3D systems, we need to provide automation,” says Larsen. “Over the next few years, all the steps that were done manually for packaging are going to be automated. This is quite new because those layers have not enjoyed much automation since the dawn of electronics.”

Disaggregation has become necessary for some companies, either because of design size or for manufacturability issues. But it also can be used for commercial advantage, creating multiple product variants quickly and cheaply. Over time, additional benefits will be garnered from heterogeneous integration of disparate technologies.

HBM has shown there is a viable market for third-party chiplets, and that is allowing many of the issues to be worked out and a suitable methodology developed. The question is how many other functions can successfully follow this example?

While development costs are higher today, there is plenty of evidence to show that those costs are decreasing rapidly, and 3D-ICs utilizing chiplets may well end up being a lower-cost solution. But for many, creating a single die will remain the path they choose to follow.


  1. There is much confusion about terminology when it comes to stacked die, because terminology often is used interchangeably. A true stacked die is built vertically in layers, either using an interposer (2.5D) or some type of substrate (3D-IC). There also are pillars in various fan-out configurations, and planar 2.5D implementations.

Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Preparing For Commercial Chiplets
What’s missing, what changes are underway, and why chiplets are increasingly necessary.

