Who Benefits From Chiplets, And When

Challenges involving reliability, integration and chiplet availability will take time to iron out.

Experts at the Table: Semiconductor Engineering sat down to discuss new packaging approaches and integration issues with Anirudh Devgan, president and CEO of Cadence; Joseph Sawicki, executive vice president of Siemens EDA; Niels Faché, vice president and general manager at Keysight; Simon Segars, advisor at Arm; and Aki Fujimura, chairman and CEO of D2S. The discussion was held in front of a live audience at the recent Electronic System Design Alliance event. What follows are excerpts of that discussion; parts one and two are listed under Related, below.

SE: AMD, Marvell, and Intel are using chiplets to create customized solutions relatively fast. Does this model work in a commercial marketplace where nobody owns all the pieces?

Segars: On one level, it’s just another mechanism for delivering IP. It’s hard IP, but it’s back to the future. We used to deliver hard IP in GDSII instead of RTL. Now it’s in silicon. But there’s an opportunity to get a lot of reuse and help drive costs down, because if you’ve got some compute subsystem that would be useful in a lot of markets, you can stamp that out, optimize it, and create a building block for one of these more complex system-in-package designs. The whole point is to be able to take that compute technology, combine it with whatever memory, analog, or sensors you’ve got, and mix and match. That is going to open up yet another dimension of design. This is going to happen. It’s not going to work for all markets, because the cost of putting these complex packages together isn’t going to scale down to a 50-cent microcontroller for a long time. But it has huge potential for driving up efficiency, driving down cost, and driving up performance.

Devgan: There is a natural coupling between packaging and PCBs. Typically these groups were separate at some of the customers, or even at the foundries, and now they’re coming together. Packaging is becoming critical to big foundries and big companies. The first thing that has to happen is that those two domains come together, so the PCB and system-level packaging are designed alongside the IC. And then the platforms have to allow these things to play well together, with analysis on top. But all of this can be done. You need a platform that can handle multi-chip design and advanced packaging together, along with analysis. We already are working with several advanced foundries. For the foundries, this is a key part of the strategy going forward. It’s great for EDA, for IP, and for the foundry business.

SE: Still, in the past you were dealing with soft IP that was integrated into a design that was going to be produced in volumes of hundreds of millions of units. Now we’re dealing with much smaller volumes and we have to integrate all these pieces together. How will that work?

Devgan: We talked a lot about design, whether it’s 3D-IC or domain-specific design. The other pieces of this are verification, which is super-critical, and software. We use hardware platforms to drive verification, and that becomes even more critical in 3D-IC. It’s possible that you’re designing five chips, and one of them is in RTL while the other four are already in silicon, or vice versa. So even emulation and verification have to go to a different level. Verification improvements are as critical as design improvements. With 3D-IC there is software-level verification, chip-level verification, and then electromagnetics and thermal. Verification, and how good a design team or company is at verification methodology, can become the differentiator going forward. Whether you get first-time-right silicon in one iteration or take four iterations, what matters is the gap between when silicon comes out and when the product comes out. If you look at the world-class companies, that gap is only a few months. It used to be one or two years.

Fujimura: There are a lot of facets that have to be considered as an integrated whole. Verification has to take into account these very complex effects. That’s part of what is preventing interoperability. The big companies can do it, but it’s difficult for smaller companies trying to do the integration on their own. It will take EDA companies supplying the tools to popularize those kinds of techniques. You need chiplets when you have multiple different technologies that you want to integrate. You could try to put everything on a single die, but that’s not the best approach. The best way is to combine separate die together. When you want high performance, like NVIDIA, you use a full-reticle die, which is the biggest die you can produce. But even that isn’t enough to get the performance they need. They also need the memory alongside, on silicon interposers. They need to use all the tricks, including 3D packaging and the full reticle. So whether you’re going for extreme high performance, like NVIDIA, or for integration (people used to talk about SoCs, but now it’s happening more with chiplets, as at Marvell), both kinds of uses are going to prevail.

3D is happening everywhere. It’s inside the chip, too. If you look inside NAND, it’s 128 stories high. It looks like a skyscraper. So 3D integration is becoming important inside the die, as well. And if you look at the transistors, they’re also becoming 3D. Transistors used to be a p-channel and an n-channel. Now they’re finFETs, and they’re going to gate-all-around, which is getting taller and taller. They’re trying to pack things together as tightly as possible, but with much more sophisticated manufacturing technology. There also is talk about DRAM becoming 3D at some point in the future.

For the high-performance world, the driving factor is the interconnect. A long time ago, Cray computers were arranged in a circle because they wanted to minimize the interconnect length from one node to another, and to make it more uniform. The same thing is happening here, whether it’s at the nanometer scale with transistors, or at the micron, millimeter, or even meter level. Minimizing the interconnect is the key. And if you stack a chip on top of another, that’s much better than putting it alongside. 3D is going to be the way high-performance computing goes.
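
To make the interconnect point concrete, here is a rough back-of-the-envelope sketch in Python. The die size, gap, and stack height are illustrative assumptions for the sketch, not figures from the discussion:

    # Illustrative comparison of interconnect distance for two dies placed
    # side by side on a substrate vs. stacked vertically. All dimensions
    # are assumed for this sketch, not taken from any real product.

    die_edge_mm = 10.0       # assumed die edge length
    die_gap_mm = 1.0         # assumed spacing between side-by-side dies
    stack_height_mm = 0.05   # assumed die-to-die vertical distance (~50 um)

    # Side by side: the same (x, y) point on the two dies is offset by a
    # full die edge plus the gap.
    side_by_side_mm = die_edge_mm + die_gap_mm

    # Stacked: the same two points are separated only by the vertical hop.
    stacked_mm = stack_height_mm

    print(f"side by side: {side_by_side_mm:.2f} mm")             # 11.00 mm
    print(f"stacked:      {stacked_mm:.2f} mm")                  # 0.05 mm
    print(f"reduction:    {side_by_side_mm / stacked_mm:.0f}x")  # 220x

Even with these made-up numbers, the vertical hop is two orders of magnitude shorter than the lateral route, which is the intuition behind stacking for high-performance parts.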

Faché: Chiplets are a really good fit for IDMs because they own the process end-to-end, from the IP to the manufacturing and the assembly. So they can make tradeoffs over that entire domain. That’s very different from a fabless design house, which doesn’t have that level of control. Accurate system-level simulation is really key for them.

Sawicki: It’s interesting talking about the supply chain crisis, availability, and 3D integration. At least for the next two years, having five parts that all must show up at the same time to be put together in a package is an interesting problem.

SE: Everything is shifting left, including reliability. But as we get into more safety-critical and mission-critical applications, how do we improve reliability at the far left of the chip design flow when we don’t know how these devices will be used out in the real world?

Sawicki: It’s really a multi-dimensional problem, and you can talk about it at a lot of detailed levels. It hits test really hard, because there have been random failure issues inside large data centers. The ones showing up now are there because a fault model wasn’t considered as part of the test process. In addition, there are aspects of on-chip monitoring that become more important: watching behavior in the field and reporting back, allowing someone to debug in their data center rather than in a lab after an RMA. There are aspects of safety and security that need to be addressed in terms of analyzing and improving the factors that support the design phase. It’s basically hitting all the tools we have in place, including physical verification. Manufacturing tools need precision in OPC fidelity that goes well beyond what’s necessary for a basic process window, because they’re concerned about all the issues that can show up. So it’s ubiquitous, and it hits almost every tool out there. There are a lot of effects that come into play.

Faché: For us, design for reliability is a big deal. Keysight not only provides design tools, we also use them to design a very broad portfolio of instruments, and reliability is really important there. We have very stringent lifetime requirements, and you really can’t meet them unless reliability is integral to your product development lifecycle, all the way from design to manufacturing and test. Over the years we’ve made a lot of upgrades to our product lifecycle and vetted best practices, and that also has had an impact on our design tools. Our internal users, as well as our customers, deal with electrical and thermal limits. That may sound easy at the component level. But an instrument can have literally thousands of components, and their operating conditions can vary quite a bit in terms of temperature, humidity, and air quality, so building for reliability is a very challenging problem. It starts at the IC level, and then it goes up into the system. You have to deal with the interconnects, with signal integrity, and with power integrity. We put a lot of emphasis on building these tools to help customers design for reliability.

Fujimura: Reliability is particularly important for us because we serve the manufacturing community, and some of our technology goes into the equipment. In mask shops around the world, some of the oldest tools still in use today were shipped before Google was founded in 1998. This equipment has to last a long time, and the computing device attached to it also has to last a long time. These are very top-of-mind considerations for customers. Some of the chips are incredibly reliable. The least reliable part of a computer is the power supply; the second is the fans. The chips themselves are actually pretty good, because of all the tools that have been developed over the years. A device may work fine when you test it, even with high-temperature or high-humidity testing. So it’s not infant mortality. It just fails over time, because the electrons flowing through an interconnect gradually turn it into a fuse. As people experienced these issues, they tried to understand them and asked EDA organizations for help. And EDA companies have always responded very well.

Devgan: If you’re going to send something into a particular environment and you want it to last for a long time, you want to see how it’s going to behave in that environment. This goes back to verification in context. How do you get the vectors or the drivers? The best way is emulation, which runs 1,000 times faster than you can on a CPU. There is RTL sign-off, then silicon tape-out, then software development, and then product release. So we have state-of-the-art companies doing software bring-up on these emulation platforms before RTL is frozen. If you make sure your design can boot software even before you freeze the RTL, that really improves the reliability of your overall solution. Of course, there are the physical parts of it, too. But verification is critical, because the effort of emulating the real environment on the functional side grows exponentially.
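
A quick, purely illustrative calculation shows why that speed difference matters for software bring-up before tape-out. The cycle count and throughput numbers below are assumptions for the sketch, not figures from the discussion:

    # Illustrative arithmetic for the emulation-vs-simulation gap.
    # All numbers are assumptions, not measured data.

    cycles_to_boot = 5e9   # assumed cycles to boot an OS on the design
    sim_hz = 1e2           # assumed RTL software-simulation throughput
    emu_hz = 1e5           # assumed emulation throughput, 1,000x faster

    sim_days = cycles_to_boot / sim_hz / 86_400   # seconds per day
    emu_hours = cycles_to_boot / emu_hz / 3_600   # seconds per hour

    print(f"software simulation: ~{sim_days:,.0f} days")    # ~579 days
    print(f"emulation:           ~{emu_hours:,.0f} hours")  # ~14 hours

Under these assumptions, a boot that would take well over a year in software simulation finishes overnight on an emulator, which is why software bring-up before RTL freeze is only practical on emulation platforms.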

Segars: One thing that’s interesting is just how much more commonplace these issues are than they were. What’s driven a lot of growth in the semiconductor industry, and everything that feeds it, has been consumer electronics, which is largely replaceable. Either the software gets so complex, as in a PC, that you just need another one, or there’s a new phone and new apps. So there’s been a churn of devices, and reliability hasn’t really been a big issue, because you replace the device before it breaks. But now you’ve got automotive consuming far more silicon than it used to, and that trend isn’t slowing down anytime soon. You’ve got things that are going to get deployed remotely. These IoT devices may be cheap, and they’re going to have to be low-cost and low-power, but you really don’t want to have to go out and service them. You want them to run for years and years. So across the spectrum of design, reliability, safety, and security are all issues now that seem to affect everything. Not that long ago, they mattered only in relatively niche markets. Now it’s everywhere. And any engineer working on any design is going to have to think about these things in ways they didn’t before.

Related
CEO Outlook: Chip Industry 2022 (part one of this roundtable)
Experts at the Table: Designing for context, and geopolitical impacts on a global supply chain.
New End Markets, More Demand For Complex Chips (part two of this roundtable)
Growth outlook is strong for ICs, both individually and in packages, but getting costs under control is a huge challenge.
Standardizing Chiplet Interconnects
Why UCIe is so important for heterogeneous integration.


