Designing For Multiple Die

Why a system-level approach is essential, and why it’s so challenging


Integrating multiple die or chiplets into a package is proving to be very different from putting them on the same die, where everything is developed at the same node using the same foundry process.

As designs become more heterogeneous and disaggregated, they need to be modeled, properly floor-planned, verified, and debugged in the context of a system, rather than as individual components. Typically, this begins with the full specification of the system from a high level of abstraction. The specification is then divided into blocks and assigned to individual designers in order to optimize their designs concurrently. Finally, all subsystems are put back together, verified, and tested as a whole.

In the simplest designs, where there are few chiplets and relatively simple interconnects, the design process is comparable to an SoC with several big blocks. “Separate teams agree on things like shape and area, pin locations and their connections,” said John Ferguson, marketing director for DRC applications and Calibre Design Solutions at Siemens Digital Industries Software. “At least for digital designs, this approach extends existing place-and-route technology. But it gets more complex with each additional chiplet or interconnect added.”

Initially, the move to heterogeneous architectures was driven by systems companies looking to boost performance for their particular data types, while saving as much energy as possible. Now, as chipmakers seek to extend that level of power and performance optimization into more markets, they are looking at ways to standardize and simplify that optimization, and to make it significantly more cost-effective.

“The fundamental macro change that is happening is the silicon disruption,” said Shekhar Kapoor, senior director of marketing at Synopsys. “Until recently, life was good, and you could count on moving to the next node to realize performance and functionality benefits. But now those benefits are diminishing and cost is prohibitive, hence you really have to consider disaggregation, and look at heterogeneous integration from a cost point of view. Disaggregation is essentially a single die getting split into multiple dies, which is the main change driving multi-die design. This is combined with the whole IP reuse concept. The chiplet took a form of what used to be a block inside a major die. Now, if you’re splitting up a chip into multiple dies, one of those blocks or dies is a chiplet, which you basically re-use in your next design. ‘Multiple die’ captures and encompasses everything, where ‘chiplet’ focuses on the reuse aspect of it.”

This requires a very different way of thinking about chip design, though. “As soon as you go from a single monolithic chip, which the industry is doing, and go to multi-chiplet designs, you first need to bring the concept of a system-level aggregation tool into the flow,” said John Park, product management group director in the Custom IC & PCB Group at Cadence. “You’re not designing one thing. You’re designing a combination of multiple things, multiple chiplets, and their packaging arrangement.”

Optimizing chiplet-to-chiplet connections is essential. But it also needs to be viewed in the context of other chips and IP, and potentially other systems.

“You need to validate at the system level — not at the chip level — that Chiplet A is connected correctly to Chiplet B through the packaging,” said Park. “That’s the switch to what I would consider system-level design. That’s step one in moving away from monolithic chips. That’s the first thing you have to do — put in a tool that allows you to assemble the system and optimize it. That becomes your golden netlist to drive system LVS (layout versus schematic), which is critical. The system LVS people oftentimes get halfway through their flow and say, ‘How am I going to validate this?’ And if they haven’t started the design in the right way, they find themselves in a lot of trouble.”
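The kind of system-level connectivity check Park describes can be illustrated with a minimal Python sketch. It is not any vendor's system LVS flow. The net names, chiplet names, pin names, and the `compare_connectivity` helper are all hypothetical, and the "golden" netlist here is just a dictionary standing in for the output of a system-level aggregation tool.

```python
from collections import defaultdict

# Hypothetical golden netlist from the system-level plan:
# net name -> set of (chiplet, pin) endpoints that must share that net.
golden = {
    "d2d_tx0": {("chiplet_a", "TX0"), ("chiplet_b", "RX0")},
    "d2d_tx1": {("chiplet_a", "TX1"), ("chiplet_b", "RX1")},
}

# Hypothetical connectivity extracted from the package layout:
# one (chiplet, pin, net) record per bump.
extracted = [
    ("chiplet_a", "TX0", "d2d_tx0"),
    ("chiplet_b", "RX0", "d2d_tx0"),
    ("chiplet_a", "TX1", "d2d_tx1"),
    ("chiplet_b", "RX0", "d2d_tx1"),  # deliberate error: should be RX1
]


def compare_connectivity(golden, extracted):
    """Return a list of mismatches between the plan and the layout."""
    layout = defaultdict(set)
    for chiplet, pin, net in extracted:
        layout[net].add((chiplet, pin))

    problems = []
    for net in sorted(set(golden) | set(layout)):
        want = golden.get(net, set())
        got = layout.get(net, set())
        for chiplet, pin in sorted(want - got):
            problems.append(f"{net}: missing {chiplet}.{pin}")
        for chiplet, pin in sorted(got - want):
            problems.append(f"{net}: unexpected {chiplet}.{pin}")
    return problems


for line in compare_connectivity(golden, extracted):
    print(line)
```

The point of the sketch is the ordering Park stresses: the golden netlist exists before the layout does, so there is something to compare against when verification starts.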

Some of the biggest challenges and limitations involve power, thermal, stress, and EM-IR elements of the design.

“With only two dimensions, these are more easily addressed,” Ferguson noted. “But the more stacking or sophisticated ways of connecting chiplets, the more difficult these challenges become. I expect we will come to a point where each of these has constraints, with some reasonable guard-band to prevent issues. But with so many different possible combinations of ways to connect these things, there will be that many more of these constraints that need to be determined, so it grows far more complex and involved with each stacked/connected item.”

Another consideration is that multiple dies do not always imply chiplets. “Sometimes it’s dies, and sometimes it’s chiplets,” Park said. “Up until three or so years ago, it was multi-chip modules (MCMs). Now we’re saying multi-die modules. It’s about taking the die out of their package components and mounting bare die on a laminate substrate, and that was system-in-package (SiP)/MCM modules. That doesn’t go away just because we’re going into the world of chiplets.”

Fig. 1: SiP/MCM vs. chiplet-based (heterogeneous integration) architectures. Source: Cadence

Smartphones have included SiPs for years, especially for RF and analog components. “This certainly was heterogeneous integration,” Park said. “But we had no concern for what node they were built at, what technology they were built on. In the past, we just didn’t use the term heterogeneous integration.”

Chiplets are the next growth phase for this approach. “The hope [at Si2] is to try to create some standardization around the chiplet space because it’s really new and seems important going forward — not just in the digital, but to stack memory, as well,” observed Matthew Ozalas, master application development engineer and scientist at Keysight. “And when we look to 6G systems in wireless, chiplets may be the only way to get there. Usually what happens in the evolution is that it starts out with the lower level, or digital, because chips can be built with functional blocks already. The last frontier is always high-frequency RF microwave, and it looks like that’s going to be the case in chiplets.”

The reason is that high-frequency RF design is not a standardized process. “If you look at a digital chip, there are billions of transistors in these chips, and nobody can design at the transistor level there,” Ozalas said. “So people build these functional blocks, and they integrate nicely together. For instance, they’ll build an adder block in their digital chip and will stick the blocks together. They’re already doing this block functional/block level design. If you’re a digital designer, you’re not really working with the transistors. The only time you really encounter the transistors is when you run into reliability issues, or when there’s a problem with one of the transistors where it’s sucking up too much current or getting too hot, or causing a problem with your various latch-up things. Analog follows behind that, and has sets of functional blocks, as well. Then, when we get to RF and microwave, it’s almost all transistors. The people who are designing there are working with transistors. It’s very touch-and-go. As much as we’d like in a system to have a low-noise amp, these components are high-frequency. They really are functional blocks, but they’re not so standard. If the transistor topology technology changes, those things don’t just scale down with it. So everything needs to change.”

This is why the high-frequency blocks end up being the last frontier, he said. “It’s everywhere you look, and that’s the case with chiplets too. If you’re building out a chiplet, you can build a functional block with high-frequency circuits. But it’s harder to put them together and have them work perfectly.”

Cost is a growing consideration with chiplets, too. “People are designing for the end of Moore’s Law,” said Cadence’s Park. “They’re moving off monolithic, huge SoCs and ASICs, to a disaggregated or modularized approach, where the IP on these big chips has been broken out into chiplets. Here too, each chiplet can be designed on whatever technology makes most sense.”
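One way to see why disaggregation can help on cost is a back-of-the-envelope yield calculation. The sketch below uses the textbook Poisson yield model, Y = exp(-A × D0), which is an assumption brought in for illustration and not something the people quoted here describe. The die areas and defect density are hypothetical. It also ignores packaging cost, assembly yield, die-to-die interface area, and test cost, all of which push in the other direction.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Poisson yield model: fraction of dies expected to have zero defects."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_system(die_areas_mm2, defects_per_cm2):
    """Silicon area (mm^2) consumed per good system.

    Assumes every die is tested before assembly and only good dies
    are used, so each die's area is divided by its own yield.
    """
    return sum(a / poisson_yield(a, defects_per_cm2) for a in die_areas_mm2)


D0 = 0.1  # hypothetical defect density, defects per cm^2

monolithic = silicon_per_good_system([800.0], D0)      # one 800 mm^2 die
chiplets = silicon_per_good_system([200.0] * 4, D0)    # four 200 mm^2 dies

print(f"monolithic:    {monolithic:.0f} mm^2 of silicon per good system")
print(f"four chiplets: {chiplets:.0f} mm^2 of silicon per good system")
```

Under these assumptions the smaller dies waste less silicon because a single defect scraps 200 mm² instead of 800 mm². Whether that survives real packaging and test costs is exactly the system-level question the article raises.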

That simplifies things, in some ways, because conventional I/Os such as PCIe or SerDes can still be used in conjunction with leading-edge logic. “That’s probably not going to change the I/O connectivity between what you use on the board versus what you use on the die,” said Wendy Wu, product marketing group director in the IP Group at Cadence. “But for a chiplet approach, design teams would use the more emerging die-to-die I/O, which is very low power and probably did not exist five years ago.”

Chiplets typically are integrated side by side, but they also can be stacked in a 2.5D package, using an interposer, or in a 3D-IC. Park does not expect packaging with silicon interposers to continue, and believes there is a push toward organic interposers and interconnect bridges.

Extra questions with multiple die
An important consideration with multiple die systems is co-design. “When engineering teams start putting these systems together into assemblies, they can’t know what the limits are until they define what that assembly system is going to be,” said Joseph Davis, senior director for Calibre interfaces and mPower product management at Siemens Digital Industries Software. “One of the great things people want to do now is put together chiplets from different manufacturers. That becomes a system problem, where all the models and limits are now crossing from different foundries and going to a third party. This is incredibly challenging from an IP standpoint. If you really want to push the boundaries of what you can do from an integration standpoint, you’ve also drawn a box around what you can do because now you have to do everything from a single manufacturer.”

With today’s level of complexity, every stack is unique. “You can’t just say what works for 2.5D will work for 3D,” Davis noted. “When you start building those things, you’ve got direct technology compatibility issues. Even within a single foundry, every time a customer says, ‘Hey, I want to do this stack,’ they have to define, ‘I want to put this chip with this chip, and this chip over here and this interposer.’ Then that foundry has to work with the EDA vendors who are involved to provide all the collateral that go together. You can’t just take standard PDKs and put some baling wire around them.”

Synopsys’ Kapoor sees the first challenge as defining the spec for the product in mind. “This could be your next mobile design, for example, or next data server design. So now you have to break it. How do you break it? From the system functionality point of view, what part is handled by hardware? What part is handled by software? For some customers, it’s easy. It’s just a memory over logic, or logic over memory. But when you split the logic, it complicates the matter significantly. There are many parts it splits into. What are the key components? GPU, CPU, and I/Os. How do you put them into the ideal package? What interconnect fabric are you going to use, which helps you meet certain constraints and specifications? These decisions used to be simple enough to do in PowerPoint or Excel or Visio.”

Now, more sophisticated tools are a necessity for exploration purposes. “These tools must be more sophisticated to bring some analysis up front,” Kapoor said. “Thermal is the classic example. Design teams never used to think about thermal unless they were doing a PCB package design or a system design. Now those come in early, so they must start thinking about thermal as a constraint when doing early architecture design. Once you’ve decided how to break apart the design, then what is the best and most cost-effective configuration from a packaging and connectivity point of view? Will you still meet your PPA? PPA is always going to be there, and now you’re split across the die. How does that come into the picture?”
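To make the idea of treating thermal as an early architectural constraint concrete, here is a deliberately crude Python sketch of a first-pass screen. Real exploration tools model heat flow in far more detail. The `Die` type, the power and area figures, and the power-density limit are all hypothetical, and power density through the stack footprint is only a rough proxy for thermal risk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    name: str
    area_mm2: float
    power_w: float


def stack_power_density(dies):
    """W/mm^2 through the footprint of a vertical stack.

    The footprint is taken as the largest die in the stack.
    """
    footprint = max(d.area_mm2 for d in dies)
    return sum(d.power_w for d in dies) / footprint


def screen(configs, limit_w_per_mm2):
    """Return the names of configurations in which every stack is within the limit.

    configs maps a configuration name to a list of stacks;
    each stack is a list of Die objects placed on top of each other.
    """
    return [
        name
        for name, stacks in configs.items()
        if all(stack_power_density(s) <= limit_w_per_mm2 for s in stacks)
    ]


cpu = Die("cpu", area_mm2=100.0, power_w=60.0)
mem = Die("memory", area_mm2=100.0, power_w=10.0)

configs = {
    "side_by_side": [[cpu], [mem]],       # two single-die "stacks"
    "memory_over_logic": [[cpu, mem]],    # one two-die stack
}

LIMIT = 0.65  # hypothetical power-density limit, W/mm^2
print(screen(configs, LIMIT))
```

Even a screen this simple shows how a packaging choice that looks attractive for connectivity can be ruled out before detailed design starts.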

Enabling chiplets
Over the last five years or so, derivative designs have become much more challenging. “You take a core technology, then there’s a longer kind of a leapfrog,” said Siemens’ Davis. “Then the next one will be a new technology, and then we double it again. So the industry started saying, ‘Wait a minute, instead of doing that, can I do a chiplet, put it on a substrate so that I can now make two or four or eight without having to do a new tape-out and go through all of that?’ It’s all on the assembly level, rather than silicon level, so that’s very attractive.”

Attractive, yes. Simple, no. The complexity of this approach for a startup, or even a medium-sized company, can be daunting. “With a single die, there are a whole bunch of different integrators out there who can handle that packaging model and integration and work with the foundry. Well, now you’re going to do a custom collection. That’s a much smaller set of people who can do that. Do you bring that in-house now to do that and to verify your functionality, because now there’s much more that needs to be done in system verification? Reliability verification becomes so much harder from a system standpoint, and the make-versus-buy decision is much more complicated. Your ‘make’ decision maybe means hiring five guys instead of one. And by the way, there’s only five guys in the country who have done it before. It’s very much a bootstrap effort. ‘Oh, we need somebody with five years of experience doing SiP and 3D stacking.’ Wait a minute, people have only been doing this for three years.”

Ensuring reliability over a chip’s projected lifetime becomes more challenging. “Because you split the die, you now have many more interconnections, many more interfaces, any of which could fail, any of which could be an entry point for security concerns,” said Kapoor. “There’s been a lot said about known good die, but how do you really bring in the monitoring part of it, and how do you ensure it is all observable, optimizable, and testable throughout the flow? These are the new challenges that have emerged.”

On top of this, how do engineering teams do all of this efficiently? Single-die design was already difficult, and multi-die systems point to the need for new models and standardization.

“The traditional way chiplets work is there’s a die-to-die communication wrapper based on some micro-buffers that drive and receive signals, that handle the ESD and test, etc.,” said Cadence’s Park. “We see the same things on bigger chips. But now they’re smaller because we don’t have the big capacitive load of going all the way out to the board. What this means is you leave the world of things like timing analysis, i.e., the flop-to-flop timing, where you’re going between two different devices through a hybrid bond. In multiple-die designs, you need to validate your compliance, which could be based on AIB, UCIe, BoW, or otherwise. There are many emerging chiplet standards, and you now need to validate the signal integrity of those. You’re essentially validating transmitter-to-receiver through some interconnect channel that has the right signal behavior, and there’s not too much jitter or noise on that interface. This makes signal integrity a system-level problem, which the industry has been doing for more than 40 years.”

The problem is that chip designers don’t necessarily know how to do this. “Chip designers on the digital side are only concerned about flop-to-flop timing, and that’s very different than understanding the importance of signal integrity types of challenges. For all of those reasons, you need to treat multiple-die designs, including 3D heterogeneous integration, like a system rather than a monolithic chip.”



Riko R says:

Excellent summary. Thank you.
The thing that I find a bit distressing is that the conversation is identical to what it was 6 years ago, when I was assessing exactly these kinds of tradeoffs. It is interesting how changing the way we do things is so darn hard even in an industry that supposedly loves change 🙂 .
