Everyone’s A System Designer With Heterogeneous Integration

Engineers are encountering more unknowns, working with different people and tools, and focusing on new types of tradeoffs.


The move away from monolithic SoCs to heterogeneous chips and chiplets in a package is accelerating, setting in motion a broad shift in methodologies, collaborations, and design goals that is felt by engineers at every step of the flow, from design through manufacturing.

Nearly every engineer is now working with, or at least touching, some technology, process, or methodology that is new. And they are interacting with skill sets that in the past existed in another silo, sometimes somewhere else in the world. Even the lexicon is changing as engineers try to explain the differences between 3D-ICs, 2.5D, system-in-package, and various types of fan-outs.

Behind these changes are several key drivers. Among them:

  • Costs have been rising at each new node since the introduction of finFETs, and they are rising further with the introduction of gate-all-around FETs and high-NA EUV at 3nm and below. That makes scaling an entire SoC uneconomical. Either the number of units sold must be high enough to recoup NRE costs, or the benefits of scaling need to be considered in the context of a much larger system, such as a processor in a hyperscale data center, where design and manufacturing costs may be offset by the need for fewer servers running at significantly lower power.
  • More and different features are required for specific domains and use cases for competitive reasons, but some chips already are larger than the current reticle allows. That means they either have to be stitched together into a bigger SoC, or they need to be decomposed into one or more functions that are integrated in some type of advanced packaging scheme.
  • Yield is generally higher for smaller chips, which in theory can reduce the overall cost of a multi-chip/multi-chiplet design. But those yield benefits may be negated when one or more chips/chiplets in a package fails, which is why so much attention has focused on standards for integration and interconnects, on new and better tools for designing and simulating these increasingly complex systems, and on better processes for handling, cleaning, and bonding/debonding.
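The yield argument in the last bullet can be made concrete with a simple Poisson defect model. This is a hedged sketch — the defect density, die areas, and assembly yield below are invented illustrative numbers, not any foundry's data — but it shows both why smaller die yield better and why known-good-die (KGD) testing matters so much for packages built from several chiplets.

```python
import math

def die_yield(area_cm2, defect_density=0.1):
    """Poisson yield model: probability that a die of the given
    area has zero killer defects. The defect density (defects/cm^2)
    is an illustrative value, not any foundry's actual number."""
    return math.exp(-area_cm2 * defect_density)

# One large 8 cm^2 monolithic die vs. four 2 cm^2 chiplets.
monolithic = die_yield(8.0)
chiplet = die_yield(2.0)

# If any chiplet or bonding step fails, the whole package fails.
# Without testing, package yield is the product of the untested
# per-chiplet yields and an assumed per-attach assembly yield,
# which is no better than the monolithic die.
assembly = 0.99
package_untested = chiplet**4 * assembly**4

# With known-good-die testing, only chiplets that pass test are
# assembled, so only the assembly steps themselves can fail.
package_kgd = assembly**4

print(f"monolithic die yield:     {monolithic:.1%}")        # 44.9%
print(f"single chiplet yield:     {chiplet:.1%}")           # 81.9%
print(f"package, untested dies:   {package_untested:.1%}")  # 43.2%
print(f"package, known-good dies: {package_kgd:.1%}")       # 96.1%
```

The untested-chiplet case lands almost exactly back at the monolithic yield, which is the scenario the bullet warns about — and why so much of the standards work focuses on die-to-die test access.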

John Park, product management group director in the Custom IC & PCB Group at Cadence, said design costs for the most advanced chips, including both digital and analog/RF content, can run as much as $1 billion. “On the digital side, what you want to fit into your SoC doesn’t fit due to maximum reticle limits,” he said. “And by the way, even if it does fit, there is a yield problem. That drives up costs.”

Fig. 1: The ‘why’ of chiplets. Following Moore’s Law alone is no longer the best technical and economic path forward. Source: Cadence

A good starting point for any discussion about heterogeneous integration and advanced packaging is agreed-upon terminology. Probably the most common use of the term heterogeneous integration is the integration of high-bandwidth memory (HBM) with some sort of GPU/NPU/CPU, or some mix of all of those.

“We used to have packaged die on a PCB connected through DIMM cards,” Park said. “Now we’ve stacked that DRAM. We put it right inside the package next to the processor. With huge improvement in memory bandwidth, people are stacking wafers on wafers. Form factor also comes into play here.”

That form factor may determine what type of packaging is used and where the processing elements, memories, and I/Os are placed.

“It can be stacked, it can be sitting next to each other, the die can be of any material,” said Kenneth Larsen, product management director for Synopsys’ EDA Group. “We typically focus heavily on digital — advanced CMOS — but there are many other die that we are taking into consideration when we are building systems. There are interposers, there are different integration schemes, and then there’s the technology co-optimization.”

To fully realize multi-die design, chip architects and designers need a good understanding of how multiple individual dies/chiplets behave once they are integrated into a more complex system. Each die contains different functions, sometimes developed at different process nodes, and often includes different types of circuits. As a result, dies may have different threshold voltages, generate noise to which other chips/chiplets are susceptible, and behave differently when heated by high utilization of logic, such as with AI/ML workloads.

In addition, they may be connected together using a variety of interconnect schemes, from wirebond to hybrid bonding, and potentially susceptible to stresses that can warp packages and die and shorten their expected lifespans. In some cases, those stresses can break the bonds and cause malfunctions in an advanced package. This becomes particularly problematic where the substrates are thinned out more than in a planar configuration.

Fig. 2: Multi-die system design. Source: Synopsys

“Once that ASIC gets past a certain size, it starts to become interesting to think about breaking it up, purchasing some of those building blocks no longer as IP for the gigantic ASIC, and starting to think about purchasing them as actual chiplets that you can co-package together,” said Stephen Slater, product manager for high-speed digital simulation technology at Keysight. “There are companies that have been very successful at this already. They’re the ones putting out chips for AI and hyperscale computing, such as AMD and others. What it means for the whole semiconductor ecosystem is that a lot of these smaller IP vendors now start to consider what it means to tape out a chip with a certain interface, like UCIe or Bunch of Wires. That shift in the ecosystem will be quite different. All of a sudden, you’ve got these different IP vendors that can offer their IP at the silicon node you care about, but now they also have a product offering that is an actual chip that can be integrated with other chips. That’s where we see things heading, and there are a lot of new technologies at play here. People are starting to introduce things like silicon interposers or glass substrates to get really fine pitches to connect at a high density of connections from one chiplet to the next. That’s where a lot of the EDA simulation tools come into play. How are we going to deal with those new issues?”

The answer increasingly involves co-design and co-optimization of technologies, designs, packaging and systems. What makes this so challenging are all the steps an ASIC designer of monolithic chips may not have dealt with previously.

“In DTCO, design and technology optimization happen at the same time, both at the circuit level and the technology level,” said Roland Jancke, head of design methodology in Fraunhofer IIS’ Engineering of Adaptive Systems Division. “Now it is even extended to systems, so it’s system technology co-optimization (STCO). Especially in 3D integrated and chiplet-based systems, there is great potential in this integrated approach. How do you do that, designing from the transistor itself, through the gate cell, the IP block, the ASIC, and the system-in-package, up to the application where everything is going to be used, and bring all of that together in one optimization cycle? You need different models at different levels of abstraction, and then you need to put all of this together.”

This is the next challenge for the chip industry, namely how to integrate these various chips/chiplets and make the whole system work as well, or nearly as well, as a monolithic SoC. “They can be connected in a number of ways, the most common being 2.5D and 3D, where 2.5D is defined as chiplets being connected via some type of interposer or substrate between dies, and they are usually connected together with some sort of PHY,” said Saif Alam, vice president of engineering at Movellus.

But with so many options in tools, flows, and methodologies, it is hard to take all of these factors into account. “There are no common standards for multi-die solutions, although there is an initiative by Siemens and others to try to get a ‘universal language’ between all these different tools,” Alam said.

And where common threads do exist, they may vary by foundry or standards group. These include TSMC’s 3Dblox, Samsung’s 3D CODE, CDXML from the Open Compute Project, and proprietary solutions from large chipmakers. So while the goal is LEGO-like universal plug-and-play, the industry is a long way from realizing that capability.

For example, a single digital twin model of the entire package assembly is needed in order to drive system-level co-design across all levels of the package substrate hierarchy, said Keith Felton, product manager for the embedded board systems division at Siemens EDA. “This digital twin model must also provide a system-level netlist that consists of each hierarchy’s required interconnects. The most appropriate format is SystemVerilog. This digital twin model needs to be constructed and optimized before any level of physical design, such as P&R, takes place. Otherwise you will end up with a sub-optimal overall implementation.”

Likewise, Movellus’ Alam contends that a system-level netlist is needed, with a representation for the entire design. “Then, for design exploration, we need a tool that has the ability to move logic across chiplets as needed, based on some user-defined cost function. The tools for validation, simulation, sign-off (timing, EMIR, physical verification) need to have a data model or ‘language’ that can be shared.”
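The design-exploration capability Alam describes — moving logic across chiplets to minimize a user-defined cost function — can be sketched in miniature. Everything below is invented for illustration (a toy four-block design with made-up areas, traffic figures, and cost weights); a real tool would operate on a full system-level netlist and use partitioning heuristics rather than brute force.

```python
# Toy cost-driven partitioning of logic blocks across two chiplets.
# All block names, areas, traffic numbers, and weights are invented.
from itertools import product

blocks = {"cpu": 4.0, "npu": 6.0, "sram": 3.0, "io": 1.0}  # areas, mm^2
traffic = {                                # bandwidth demand, GB/s
    ("cpu", "sram"): 50, ("npu", "sram"): 80, ("cpu", "npu"): 20,
    ("cpu", "io"): 10, ("npu", "io"): 5,
}

def cost(assign, beta=0.5, area_cap=10.0):
    """User-defined cost: cross-chiplet traffic is expensive
    (die-to-die links are far narrower than on-die wiring), each
    chiplet must fit the 'reticle' cap, and unbalanced chiplet
    areas are lightly penalized."""
    areas = [0.0, 0.0]
    for blk, area in blocks.items():
        areas[assign[blk]] += area
    if max(areas) > area_cap:
        return float("inf")                # infeasible partition
    cross = sum(bw for (a, b), bw in traffic.items()
                if assign[a] != assign[b])
    return cross + beta * abs(areas[0] - areas[1])

# Brute-force every 2-chiplet assignment (fine for 4 blocks).
names = list(blocks)
best = min((dict(zip(names, bits))
            for bits in product((0, 1), repeat=len(names))),
           key=cost)
print(best)  # the SRAM lands on the same chiplet as the NPU,
             # its heaviest traffic consumer
```

The winning partition keeps the memory next to its heaviest consumer, echoing the HBM-beside-the-processor pattern discussed earlier: the cost function naturally limits traffic across the die-to-die interface.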

The people who used these tools historically were system-level designers. “Now, when we go to multi-die, everyone’s a system designer,” Cadence’s Park said. “You’re no longer just an ASIC designer. Everyone needs to be a system designer, and they need to understand things like chiplet-to-chiplet electrical compliance and signal integrity at this level, because you’ve disaggregated this and you’re connecting it back up with UCIe or BoW or AIB. So you need to validate that electrical connection from die-to-die, and that uses signal integrity technology, which is 50 years old for PCB design, but newer if you’re coming from the world of designing monolithic chips.”

Along with DTCO, STCO is an increasingly vital piece of the heterogeneous integration puzzle, Synopsys’ Larsen explained. “Looking at what a system is, we have architectures, we have 3D integration. We have the functionality and workloads of the systems that we are designing, the physical and logical aspects of a system, how to provide power through this entire system. And we need to make sure it works in all the conditions and markets that our customers are looking for. When we are looking at a system like this, this is one package essentially. But how do we interconnect all these pieces together for this system? There’s this abstraction in between design technology and system technology around the interconnectivity between the pieces in the system, both when you’re building a system like this in manufacturing, but also when you take the product into the field to make sure you have reliability. What we do with STCO for 3D-IC and for multi-die design is take the system view, identify all the constraints that comprise the system, and try to identify bottlenecks that would prevent performance or area reductions. We run software workloads to try to figure out if this will provide the PPAC, which is actually a volumetric metric because it’s all of it. It’s not just PPAC. It’s the cube of that. What this means is becoming more obvious now when we’re looking at power, thermal, and performance, and look at all these topics simultaneously. This is really where things are getting complicated.”

The challenge is in the details and the exchange of data when it comes to chiplets, which is one reason why most of the chiplets developed so far have been built internally by large chipmakers. Industry efforts to commercialize chiplets will require standardized ways to connect them, as a starting point. “The industry effort around chiplets is more focused on standardizing the protocols, which is where UCIe, Bunch of Wires (BoW), and the Advanced Interface Bus (AIB) come in,” said Hee-Soo Lee, high-speed digital segment lead at Keysight. “This is where we thought chiplets are different. It’s not just the packaging standpoint, which is the same as the old SiP. There is an industry effort coming in and making everything more standardized.”

Moving to multi-die design
With so many options for heterogeneous integration in advanced packages, how can the user community be guided toward a cohesive methodology?

Movellus’ Alam said there are multiple factors to this. “The industry needs to work together to define a common interface between dies, whether it is UCIe or some other standard. For different die connected together, they need to have the same data pitch, which needs to be pre-planned and aligned. The major tool vendors need to collaborate and create a common language for ease of tool interoperability. And the manufacturing cost for advanced packaging required for chiplet implementations needs to go down so this is not dominated by large companies with deep pockets only.”

Siemens’ Felton said one way to achieve this is through a cloud-based virtual lab that allows users to explore multi-die co-design using a controlled approach with preset exercises. “They can do this without needing our software or licenses, and it is free of charge,” he said.

But what isn’t clear yet is who exactly will be using these pathfinding-type tools. “It’s different almost everywhere you go, because we’re blurring the line between the ASIC designer’s job and the package designer’s job,” said Cadence’s Park. “Some companies think that now that they’ve gone to chiplet-based 3D-IC, that’s all packaging, and the package designers need to be doing it. But other companies say, ‘No, that’s still my chip. I’ve just disaggregated it, so that’s the IC designers’ job.’ There is no commonality between users. In some cases there is a really strong packaging team, and a lot of this will be passed to the packaging team. If the packaging team is not as strong, they’ll try to do it within the ASIC design team. A front-end tool is really there so that an ASIC design background or system design background doesn’t matter. You still need that common tool to bring it all together.”

The same is true for flows and methodologies. “Some customers are very much, ‘I’m going to do this myself. I need your design guidelines and your register maps,’” said Paul Karazuba, vice president of marketing at Expedera. “‘Tell me what your IP is going to look like. Ship me the RTL and don’t bother me. I’m going to do all this myself.’ Others require much more design assistance, where we may actually go in and help them with their design. They’re curious about how we’re interacting with the basics that you would assume. What are the signals coming in and out of their IP? What do I need to give you? What’s your clock? These are the types of things you would expect, but the reality is that NPUs don’t exist in a vacuum on a chip. They’re not a completely separate function from the rest of the chip. They’re highly integrated with other things on the chip, like the image signal processor block, for instance. Increasingly, those two systems are becoming intertwined, yet they are typically licensed as two different things, often from two different suppliers. Now you get into how the handshake really happens, because these blocks are not living on the metaphorical islands where they may have existed in the past. In short, it’s customer-based. It’s a question of how much they really want us to be involved. As an IP provider, the skill sets we need to have in-house are different than they were 10 years ago. We need people who are knowledgeable about chip design. Expedera is not a chip company, and we are not going to be a chip company, but we have chip designers on staff for this exact reason: to assist people with the design questions they have.”

Avoiding traffic jams
One of the key goals in any heterogeneous integration is smooth movement of data, which frequently comes down to coherency and throughput.

“We have two categories of people we’re dealing with when it comes to chiplets,” said Guillaume Boillet, senior director of product management and strategic marketing at Arteris. “There are those who are doing chiplets because they want to reap the benefits of cost and scalability, or even portfolio management. In those scenarios one vendor is involved. It’s the same company, and it’s always one architect who is overseeing all aspects of the design. The second category is those who really embrace multi-die because they believe in the ecosystem play. But even there, it’s mostly partners. It’s not vendors who don’t know each other.”

Automotive is a new entrant in these relationships. “There are developers that really want to do multi-die because, all of a sudden, they don’t have to do all the pieces of a system where they don’t have all the competence,” Boillet said. “Even there, most of the time, the ownership is centralized. There’s always a company that is leading, whether they’re doing the higher-level chiplet or they own the accelerator for automotive or the accelerator for AI. At RTL or system level I don’t see too many things that are different compared to making choices for an SoC. There are only a few aspects that are to be considered on top of SoC design, which are the tradeoffs that are going to be limiting the amount of traffic across chiplets. Obviously this needs to be taken into account. There’s going to be the coherency aspect, too. So for those people who want coherency across the chiplet, we need to make sure that not too much traffic is going through the interface.”

Everything here is new to somebody. As Cadence’s Park noted, “If you’re an ASIC designer, what’s new is multiple chiplets so you have to have a front-end planning tool. You have to understand what interface to work with. How do you partition your design? Now it’s multiple chips, and to validate you need to understand signal integrity so that you can make clean connections across the chiplets. It’s a whole new world for the ASIC designer. It’s the same thing with the package designer. They need to now understand formal sign-off of DRC and LVS and how that’s important along with working with different materials like silicon. Historically, package designers worked with laminates and a little bit of ceramic and now they’re working with silicon, and that requires understanding the restrictions on metal fill, metal balancing, and formal sign-off. Everybody is learning.”

Related Reading
Mechanical Challenges Rise With Heterogeneous Integration
But gaps in tools make it difficult to address warpage, structural issues, and new materials in multi-die/multi-chiplet designs.
Die-To-Die Security
Cyberattack surface widens with heterogeneous integration, and those attacks become more lucrative.
Why Chiplets Don’t Work For All Designs
Getting this wrong can increase power and cost, while reducing performance.
