Advanced assemblies have enabled an unprecedented pace of progress in the data center, especially for neural processing, but can the technology expand beyond that?
The concepts of 3D-IC and chiplets have the whole industry excited. Together, they potentially mark the next stage in the evolution of the IP industry, but so far technical difficulties and cost have limited their use to just a handful of companies. And even those companies do not appear to be seeing benefits from heterogeneous integration or reuse.
Attempts to make this happen are not new. “A decade ago, we were trying to create an architecture for building chiplets,” says Mark Kuemerle, vice president of technology and CTO of custom solutions at Marvell. “We had noble goals that were very democratic about being able to define a structure where people could put multiple chiplets together to build a given function. The magic was that by putting these smaller chips together, we could use much less power-hungry interfaces. We could take what used to be a big, power-hungry, complex, expensive system and build it into a chiplet-based system that could be more efficient overall. More importantly, it would save tons of development costs and save a lot in overall chip cost. It didn’t turn out that way. We ended up with a scenario where just a small number of companies, which you can count on your fingers, had the capability to develop chiplets.”
So why do it? One of the big drivers is an increasing amount of content that is vital but not differentiating. “The mass market is still a few years away from adopting 3D-IC, but there are applications where it’s well-suited, and the U.S. government is very focused on those applications,” says Pratyush Kamal, director of central engineering solutions at Siemens EDA. “It makes a lot of sense when you think about 6G wireless communications, because antenna pitches are shrinking to the scale where you can think of an array patch antenna within the package. You would have the antenna array patch at the top of the package, then an array of power amplifier circuitry. You’ll have beamformer circuitry beyond that, and then you enter the digital domain, where you have data processing and connectivity to your baseband processor. There is a lot of modularity emerging in the market. People are starting to look at new architectures, and as a corollary to that, they are also thinking about how they can enable the mass market.”
In the past, companies pushing for new technology have funded the necessary research and development, which then trickled down to the masses. “Industry leaders are going to be first with this advanced new technology, and the broad assumption is that the industry will follow,” says Marc Swinnen, director of product marketing at Ansys. “But there’s a growing gap between the two. The front runners have outpaced the field and are pulling ahead of the mainstream. That’s worrying in the sense that it is still the general belief that 3D-IC or 2.5D will become the norm as systems get larger, especially if the chiplet market takes off. But as these gaps grow, it opens up the potential for competitive displacement. If you’re two or three years behind a competitor that has mastered this technology, suddenly you get serious differences in market positioning.”
Put simply, there is a bigger separation between those that must adopt it to stay competitive and those that wish to adopt it. “Chiplets enable us to do more cutting-edge stuff, and it’s allowing us to put a lot more silicon on a package, which is helping us scale performance,” says Marvell’s Kuemerle. “We need this because everybody would agree that we have a significant slowing of Moore’s Law.”
This is not the only reason to want 3D-IC, however. “3D-IC technology offers numerous benefits, including increased performance, lower power consumption, and miniaturization,” says Rozalia Beica, field CTO for Rapidus Design Solutions. “Adoption spans various applications, from mobile to high-end uses like AI, supercomputers, and data centers. The technology’s ability to enable compact designs with improved performance continues to attract interest.”
Still, significant challenges remain. “Most of the people using 3D-IC are vertically integrated,” says Ansys’ Swinnen. “They’re bigger companies that have the wherewithal to design the chips, design the interposer, simulate the whole thing, look at the packaging, and go through the many architectural choices that have to be made. It’s complicated and it remains, to a degree, trailblazing.”
Large chip or small PCB?
3D-ICs are not just about shrinking everything on a PCB. “For different benefits, people tend to swap the comparison baseline,” says Swinnen. “That’s not really fair. The PCB going to a smaller system — that was the SoC. If you’re worried about getting better performance than your PCB, then you go to an SoC. That is the natural evolution. It’s what we’ve been doing for 40 years. The disaggregation to multiple chiplets is not because you’re trying to compact a PCB. It’s because you’re trying to take what was a monolithic chip and you’re disaggregating it. The comparison baseline is a monolithic chip, not a PCB.”
But not always. Early success stories, such as HBM, bring more external components into a package. “With the increased need to bring in more functionalities within the package, it is becoming more difficult to achieve that using monolithic SoC type structures,” says Rapidus’ Beica. “Not all functionality requires leading-edge designs. While the leading-edge designs prioritize the highest performance goals and smallest form factors, such approaches may not be the most efficient way when heterogeneity is important, and more functionality is needed within the system.”
Moreover, if chiplets were readily available, 3D-IC could be viewed as a PCB within a package. “The PCB really limits the amount of bandwidth that chips can use to talk to each other,” says Ramin Farjadrad, CEO and co-founder of Eliyan. “It has gone up by less than two orders of magnitude over the past 20 years, compared with five orders of magnitude for a chip. This is a major contributor to the memory and I/O walls. By moving the equivalent of a PCB inside the package (see figure 1), the density of the balls, which we call bumps or microbumps, goes up significantly. The distance between the chips goes down significantly. You get much higher bandwidth at much lower power between these dies.”
Fig. 1: Chiplets are the future of semiconductors. Source: Eliyan
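Farjadrad’s numbers can be made concrete with a back-of-the-envelope calculation. The Python sketch below is purely illustrative (the pitches, per-pin data rates, edge length, and row count are assumed values, not Eliyan figures), but it shows how contact density alone multiplies aggregate die-to-die bandwidth:

```python
# Back-of-the-envelope: why finer in-package pitch means more bandwidth.
# All numbers are illustrative assumptions, not vendor specifications.

def edge_bandwidth_gbps(pitch_um: float, gbps_per_pin: float,
                        edge_mm: float = 10.0, rows: int = 4) -> float:
    """Aggregate die-to-die bandwidth across one die edge.

    pitch_um:     contact pitch (BGA ball or microbump), in microns
    gbps_per_pin: signaling rate per contact, in Gb/s
    edge_mm:      length of the die edge used for the interface
    rows:         rows of contacts dedicated to the link
    """
    pins = int(edge_mm * 1000 / pitch_um) * rows
    return pins * gbps_per_pin

# Package balls on a PCB: ~400um pitch, long traces.
pcb = edge_bandwidth_gbps(pitch_um=400, gbps_per_pin=16)

# Microbumps on an interposer: ~40um pitch, millimeter-scale reach.
pkg = edge_bandwidth_gbps(pitch_um=40, gbps_per_pin=16)

print(f"PCB-level link:    {pcb/1000:.1f} Tb/s per edge")
print(f"In-package link:   {pkg/1000:.1f} Tb/s per edge")
print(f"Density advantage: {pkg/pcb:.0f}x from pitch alone")
```

At the same per-pin rate, the finer pitch alone yields a 10x bandwidth advantage in this scenario, and in practice the in-package link also wins on energy per bit because the shorter reach needs far simpler drivers.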
Continued monolithic advancements are being constrained by yield. “People are hitting reticle limits,” says Mayank Bhatnagar, product marketing director of SSG at Cadence. “If you have a large piece of silicon that you want to build, then not just at the reticle limit but much before that, you will start having yield issues. If you can’t make it profitably, there is really no point making it at all. And that is where monolithic dies, which are very large, are headed. They are just becoming too large. The yield drops, and the drop is enough that it becomes uneconomical.”
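The economics behind Bhatnagar’s point follow from the classic Poisson die-yield model, Y = exp(-A × D0). A minimal sketch, assuming an illustrative defect density of 0.1 defects/cm²:

```python
import math

def poisson_yield(area_mm2: float, d0_per_cm2: float) -> float:
    """Classic Poisson die-yield model: Y = exp(-A * D0)."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

D0 = 0.1  # defects/cm^2 -- an assumed value, not foundry data
for area in (100, 200, 400, 800):  # mm^2; ~858 mm^2 is the reticle limit
    print(f"{area:4d} mm^2 die: yield ~ {poisson_yield(area, D0):.0%}")
```

Under these assumptions, splitting one 800mm² design into two 400mm² dies lifts per-die yield from roughly 45% to 67%, which is exactly the economics pushing designers toward disaggregation.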
Doing everything on the leading nodes is not required. “AI is demanding more SRAM on die, and SRAM hasn’t scaled,” says Siemens’ Kamal. “Theoretically, you have the smallest SRAM bit cell in 5nm, and then it starts to grow after that. But if you look at the dollar per bit cell number, that stopped scaling way before 5nm. Even though we were shrinking through 7nm and 5nm, the bit cell was coming in at a higher cost per bit. Two things are happening. You need more SRAM, and SRAM is more expensive. 3D, because of its closeness and almost zero-latency interface between the two dies, allows you to experiment with different hierarchy and cache structures.”
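Kamal’s dollars-per-bit argument is easy to see numerically. In the sketch below, the bit-cell areas and wafer costs are illustrative assumptions rather than foundry data, but they capture the mechanism: once the cell stops shrinking, rising wafer cost per mm² flows straight through to cost per bit:

```python
# Why $/bit can rise even as the bit cell shrinks: wafer cost per mm^2
# grows faster than the SRAM cell scales. All numbers are illustrative
# assumptions, not foundry data.
nodes = {
    # node: (bit-cell area in um^2, wafer cost per mm^2 in $)
    "7nm": (0.027, 0.08),
    "5nm": (0.021, 0.13),
    "3nm": (0.020, 0.20),  # bit cell has nearly stopped shrinking
}
for node, (cell_um2, dollars_per_mm2) in nodes.items():
    # mm^2 per bit * $/mm^2 * bits per Gbit
    cost_per_gbit = cell_um2 * 1e-6 * dollars_per_mm2 * 1e9
    print(f"{node}: ~${cost_per_gbit:.2f} per Gbit of SRAM")
```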
Ideally, the best technology would be used for each component. “We can build enormous chips in cutting-edge technologies, and the chiplets become a crutch to help us do that,” says Kuemerle. “We are breaking things into more pieces so that we can cheat and get more silicon than we could have had in one integrated tape-out. We could ping-pong I/O technology with core die technology so that we can just push the envelope and use the best available technology, which usually tends to be the most expensive technology. We had these ideas before, and they didn’t pan out that way. Some of it had to do with the realities of building a multi-chip system.”
The combination of technology and development costs is driving more companies in this direction. “With the latest process nodes, the cost per transistor is going up,” says Cadence’s Bhatnagar. “It does not make sense to transfer every part of your design into a new process node, because most of the design may not benefit from it. If you have an RF transceiver or analog block, it will not benefit from a reduction in cost per transistor. On the other hand, you will have to redesign it for the new process node. When you disaggregate, you are able to move only the portion that will benefit from a new process node.”
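Combining the yield model above with assumed wafer prices shows why moving only the parts that benefit can pay off. Again, every input here is an illustrative assumption, not actual foundry pricing:

```python
import math

def poisson_yield(area_mm2, d0_per_cm2):
    """Poisson die-yield model: Y = exp(-A * D0)."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

def cost_per_good_die(area_mm2, wafer_cost, d0, wafer_area_mm2=70_000):
    """Wafer cost amortized over known-good dies (ignores edge loss)."""
    dies = wafer_area_mm2 / area_mm2
    return wafer_cost / (dies * poisson_yield(area_mm2, d0))

# Assumed inputs: wafer prices and defect densities are illustrative only.
mono = cost_per_good_die(600, wafer_cost=20_000, d0=0.1)   # all leading-node
core = cost_per_good_die(300, wafer_cost=20_000, d0=0.1)   # compute chiplet
io   = cost_per_good_die(300, wafer_cost=8_000,  d0=0.05)  # I/O die, mature node

print(f"monolithic: ${mono:,.0f}  vs  chiplets: ${core + io:,.0f} (+ packaging/test)")
```

Even before packaging and test are added back in, the known-good-die math favors the split in this scenario, though a real analysis must also charge for assembly yield and die-to-die interface area, which cut the other way.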
It sounds very enticing, but the reuse of a previously designed chiplet in a new design is fraught with peril.
Cost effectiveness
There appears to be a lot of price insensitivity in the data center. “AI has put such a premium on high-performance, very complex, very large silicon systems that it has been worth it for them to put in the huge investment to chase this market,” says Swinnen. “They have a reason why they need these huge 3D chips. It’s the AI application. And until the technology becomes cheaper, or the other sectors of the market find their own killer app for 3D, it will be a slower go for them.”
The rest of the industry is looking but waiting. “When I talk to our mobile customers, I don’t get the feeling they’re ready for 3D-IC, because the economics are not yet making sense to them,” says Kamal. “But at the same time, they understand they are hitting the limits of scaling. They are getting small incremental benefits when moving from 5nm to 3nm to 2nm, and this incremental benefit comes at a huge cost. The only reason they are going to these nodes is to get the maximum performance out of the transistor, driven especially by the new gate-all-around (GAA) transistor architecture. But GAA is a very complex process. The yields are low.”
The problem is twofold. First, they need to adopt a completely new design and packaging methodology. And second, they go from a single tape-out to multiple tape-outs. “It might make sense for a company to be in a chiplet style of design,” says Elad Alon, CEO of Blue Cheetah. “That means they need multiple mask sets, they possibly need multiple tapeouts, and the initial NRE to do that versus a larger monolithic chip in an advanced node is tough to swallow. The NRE required to get out the door is likely to be lower by sticking to a monolithic solution. It is a complex dance, as is true of many things in engineering. What you would do in steady state, once you’ve already got a large enough market and a large enough business, may be quite different than what you have to do to enter, because the considerations are different.”
Several technical challenges still need to be overcome. “When I think 3D right now, my mind goes to hybrid bonding, because that’s where you get the real bang for your buck,” says Kuemerle. “It helps you resolve some of the thermal challenges, and it gives you very high connectivity and very low power consumption. That involves thinning silicon to extraordinarily small thicknesses and integrating copper-to-copper bonds at very fine pitches. When you look at the physical challenges of doing that with multiple suppliers, it becomes difficult.”
HBM is still trying to get to that point. “3D memory is still only using micro-bumps to connect the memory to the main die,” he says. “The memory providers are working on hybrid bonding approaches that we all hope will go into production in the near future. When we look at doing this real cutting-edge stuff with multiple pieces of silicon, that’s where it’s going to get really interesting.”
Eliminating the PHY potentially adds a lot of performance. “When you go to an almost PHY-less architecture, you’re talking about a very fine-grained, very small 3D interconnect, and that can only be achieved today with a wafer-on-wafer stack,” says Kamal. “If you go to a die-on-wafer stack, the interconnect pitch is coarser, and that’s where I draw the boundary. You then need special buffers, and it is probably a stretch to be thinking about a PHY-less architecture. Here’s the challenge with any 3D stack — you have the backside metal on at least one die. As soon as you fuse two dies, you still need to take the I/Os and power supplies out through the substrate.”
Power density and thermal challenges are becoming well-known problems, but there are others to be considered. “Let’s say you have the front end of line, the transistor layer, and the die has the backside metal to get the I/Os out,” explains Kamal. “Now you have metal on both sides of the transistor stack. What happens as a result is your metrology becomes very challenging. If something needs investigating, you want to use X-ray imaging or some kind of visual scanning. That becomes very challenging. Additionally, focused ion beam (FIB) editing uses the backside of the silicon to go inside the die and make changes. We call it fibbing the IC when we are trying to debug something and establish a hypothesis for the failure in the die. You do some fibs, you redesign your die based on your fib experiments, and then you re-mask and reproduce the design. Now you have lost that ability to fib from the backside. If you design a chip and it doesn’t function in the field, you have a very big gap in terms of debugging the chip.”
Heterogeneous stacking adds complications. “Heterogeneous integration requires the combination of different technologies, both old and new,” says Andy Heinig, head of Efficient Electronics in Fraunhofer IIS’ Engineering of Adaptive Systems Division. “A significant difference is the signal levels for older technology nodes. Newer nodes require much lower levels, while older nodes need higher signal levels. Achieving compatibility in this area is a challenge. Often, only lightweight die-to-die interfaces are needed, as the integration of digital IP is highly limited in terms of space in older technologies.”
Reuse adds more complications. “For 3D, the biggest constraint is that you want the two dies to match in size,” says Kamal. “Otherwise, it is wasted area. You can do die-on-wafer integration, where one die is not quite the same size as the other, but you’re losing out on how many dies you can process at the wafer-level packaging or integration. There’s a throughput challenge. But one advantage, as an architect, is you don’t have to be homogeneous. It can be heterogeneous. One die can be in 5nm, the other in 3nm.”
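The size-matching constraint is easy to quantify. A minimal sketch, with hypothetical die sizes:

```python
# Wafer-on-wafer stacking pairs die sites one-to-one, so any size mismatch
# between the two dies is stranded silicon. Sizes below are hypothetical.
top_mm2, bottom_mm2 = 80.0, 100.0  # e.g., a logic die on top of an SRAM die
waste = 1.0 - min(top_mm2, bottom_mm2) / max(top_mm2, bottom_mm2)
print(f"Stranded area on the larger wafer: {waste:.0%} per stacked pair")
# Die-on-wafer avoids this waste but places one die at a time,
# trading area efficiency for placement throughput.
```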
Conclusion
3D-IC has the potential to transform the IP and semiconductor industry, but it remains a very expensive option, which today is only applicable to the data center — and even then only because of AI. There are many challenges that have yet to be overcome, and it would appear that the notion of 3D-IC for the masses is still something for the future.
Before 3D-IC can move beyond vertically integrated companies, a lot more work is required on interfaces, standards, tools, and methodologies. These will be examined next month.