Technical and business challenges persist, but momentum is building.
Chiplets are all the rage, and for good reason. Among the various ways to design a semiconductor-based system today, IP reuse via chiplets appears to be an effective and feasible solution, and a potentially low-cost alternative to shrinking everything to the latest process node.
To enable faster time to market, common IP or technology that already has been silicon-proven can be utilized. Time and resources needed to redesign IP can be saved by integrating existing tested technologies and reusing common IP across products. The chiplet’s process technology can be matched to tested nodes for mature IP or developed on more cutting-edge advanced nodes for newer IP.
“The primary chiplet is a basic subset function and is the common denominator of the overall design application,” said Sue Hung Fung, product line marketing manager for UCIe at Cadence. “Using the common denominator function, the primary chiplet can be re-used across different product lines targeting that specific design application as part of product segmentation. This is primarily leveraging the economies of scalability and design re-usability of the common functional block.”
Chiplets also have a tremendous impact on the entire semiconductor ecosystem, according to Simon Rance, vice president of marketing at Cliosoft, who said the multi-company collaboration taking place on system designs today will grow significantly with chiplets. “Typically, you’d have an Apple product, which has its own intellectual property that they develop in their chips. But now they are using stuff from Arm, Broadcom, and others. They’re integrating that into their big M2, M3 processors. In the past, every company did their piece, and eventually gave it to Apple. Apple then had to test it, make sure it worked as intended, then integrate it, then do the system integration. That’s getting harder as these chips get more complex. Apple has that tradition of a new iPhone every 12 months, and that release cycle that has to happen no matter what. They’re trying to speed up that code development process where they’re not waiting for each company to do their piece, and then analyze it all together later at the end of the cycle.”
This means multiple parties drive a chiplet project, not just the system architect, or the packaging architect, or the SoC architect individually. It’s now all of the above, and more.
“It used to be the system designers would give a spec to the RTL guys, then the RTL guys would write the code and maybe work with the ASIC guys to pick the IP, then the packaging guys would get involved,” said Tony Mastroianni, advanced packaging solutions director at Siemens Digital Industries Software. “It was very decoupled. Now, everything is coupled. The architects are working with the RTL architects and the package designers. When you’re designing these chiplets, there may be six or seven designs being done in parallel. You can think of them as blocks, but they’re not being connected in the top level of the chip. They’re being connected in the package, so the package guy is in the middle of that. That means they need to look at floor-planning that system-in-package. They’re making sure all the I/Os are aligned on those high-speed internal interfaces. And they have to worry about thermal coupling between the chips. That’s something that you don’t normally do in ASIC design. You just look at the package.”
Physical issues
The entire chiplet team has to worry about thermo-mechanical stress, particularly with large silicon interposers, Mastroianni noted. “They must be concerned about the pitch of the micro-bumps, and look at reliability types of issues. Test is a huge issue, as well. The test folks have to work to make sure that for all the chips you have a test strategy, so they get involved very early to figure out how they’re going to test the whole system-in-package, and then the chips need to talk to each other. Die-to-die interfaces are used for the high-speed functional layout, but for test, all that test I/O needs to get connected in the package. So again, the package guys are in the middle of that.”
Thermal always has been one of the major challenges with packaging, because unlike a planar SoC where the silicon dissipates the heat, inside a package that heat can get trapped. “If you’re moving the HBM away from the processor, it’s going to be stacked, so then you’ve still got thermal problems on each of these pieces,” said Frank Ferro, senior director of product management at Rambus. “If they’re not all on the same SoC, it’s even worse, because you’ve got so many watts on the HBM and so many watts on the processor, and that’s in a very tiny footprint.”
While AMD, Intel, Samsung, and Marvell have successfully deployed their own chiplets, a commercial chiplet market is more in the planning stages than reality. HBM is the closest thing to a standard chiplet today, and it’s expensive due to the cost of the 2.5D interposer, so its use has been limited. “The supply chain problems didn’t help over the last two years,” Ferro said. “That’s eased a little bit, but it did highlight some of the problems when doing these complex 2.5D systems because you’ve got to get a lot of components and substrates. If any one of those pieces is not available, that disrupts the whole thing or results in long lead times. That’s given rise to methods to build an HBM chiplet or an HBM system connected through a traditional PCB or other means. If I’ve reduced the size of my processor — at least in the sense that I’ve taken some of the I/O away, but also because now I don’t have to put that big processor on a 2.5D silicon interposer — I can build that in a traditional package and then talk to the 2.5D. If you just have that I/O device with either HBM or even GDDR, or with something like PCIe, you can have different I/Os on those chiplets. We are seeing a really strong push now for that technology. And while still early, the rubber is now starting to hit the road where real solutions are starting to be proposed. UCIe is taking the flag for the moment.”
Connecting everything in a consistent way is one of the big challenges. “A PHY alone will not make the interconnect fully compatible with other chiplets with the same PHY,” said Javier DeLaCruz, senior director of system integration at Arm. “The control logic and protocol have to align, as well. The choice of protocol is likely more religious than the PHY. Steps can be taken to make die-to-die interconnects more efficient from a PPA standpoint, particularly with latency, but these tradeoffs reduce the ability to extend the use of these chiplets to other products.”
For example, two UCIe interfaces may use the same PHY, but if one uses a PCIe protocol and the other uses a streaming protocol with a direct-to-CHI interface, then there is a mismatch that needs to be addressed.
“There is hope in the chiplet reuse model,” DeLaCruz said. “In most use cases, there will be one chiplet with the product’s secret sauce that would be custom designed. This custom chiplet can match the PHY from the pre-existing chiplet and/or employ protocol translation.”
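DeLaCruz’s observation can be reduced to a simple compatibility check. The sketch below is purely illustrative; the Phy, Protocol, and DieToDiePort names are hypothetical and not part of UCIe or any vendor’s tooling. It only captures the idea that a shared PHY is necessary but not sufficient, and that a protocol mismatch pushes translation onto the custom chiplet.

```python
# Minimal sketch (hypothetical types, not UCIe's or any vendor's actual API):
# two die-to-die ports only interoperate directly if both the PHY and the
# link-layer protocol line up; a shared PHY with mismatched protocols means
# one side has to translate.
from dataclasses import dataclass
from enum import Enum

class Phy(Enum):
    UCIE = "UCIe"
    BOW = "BoW"

class Protocol(Enum):
    PCIE = "PCIe"
    STREAMING_CHI = "streaming/CHI"

@dataclass
class DieToDiePort:
    phy: Phy
    protocol: Protocol

def can_link(a: DieToDiePort, b: DieToDiePort) -> str:
    if a.phy != b.phy:
        return "incompatible: different PHYs"
    if a.protocol != b.protocol:
        # Same electrical interface, different link layer: the custom chiplet
        # typically absorbs the protocol translation.
        return "needs protocol translation"
    return "plug-compatible"

# The mismatch from the example above: both ports are UCIe, but one speaks
# PCIe and the other a streaming protocol mapped onto CHI.
print(can_link(DieToDiePort(Phy.UCIE, Protocol.PCIE),
               DieToDiePort(Phy.UCIE, Protocol.STREAMING_CHI)))
```

In practice that translation would live in the custom chiplet’s link layer, as DeLaCruz describes, rather than in software.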
Others say chiplets will not be quite so straightforward. “From a technology point of view, re-using IP in the form of pre-made chiplets is far more difficult/risky,” noted John Ferguson, director of product management at Siemens Digital Industries Software. “In an SoC, you have a single process and a corresponding PDK. Things like thermal impacts and mechanical stresses are fairly mitigated as everything is on a single silicon substrate. With heterogeneous chiplets combined in a package, you now have a vertical degree of freedom, corresponding to new stresses from stacked dies or other materials, and greater challenges of heat dissipation through multiple layers of different materials. These affect electrical behavior and ultimately can impact yield or reliability. Bottom line, just because a chiplet works in one package configuration does not guarantee it will behave the same in another. 3D-IC verification will require continuous analysis at different levels of the design to ensure it will behave as desired.”
This is much more complex than what is needed at the SoC level, and also translates into a business issue in terms of investing in new tools, training users, the time required to do all these iterations, and the risk of failed designs, Ferguson said.
Decompose, compose
The challenges are significant for chiplet design today. Companies working in that space typically have a lot of experience designing advanced ASICs.
“Since there is no ecosystem of chiplets, they need to be doing multiple chip designs concurrently because all these chips need to be designed to work together to build a whole system,” Mastroianni said. “That means the architects really have to work with the package architects, and a little bit of ASIC, but it’s really deciding how to decompose into chiplets. This is a huge exploration space. There are a lot more options to split it up. You also can build much larger systems. You are limited, even in the advanced nodes, to a certain number of gates and die size, reticle size, which is a physical limitation. But when you get into these advanced packages, that’s no longer the case.”
While 2.5D is the predominant stacking technology today, full 3D-IC is coming, bringing a whole new set of problems. “The technology is there where you can actually stack die and make them work, but it’s primarily done manually,” Mastroianni said. “What I call true 3D is where the tools will actually do all that detailed chip decomposition, making sure power is delivered through all the different layers and closing timing through multiple die. That’s probably two to three years away before it becomes mainstream.”
A large ASIC, in contrast, is broken into hierarchical blocks, and there’s reusable IP primarily for analog and high speed I/O. “With chiplets, now you’re decomposing your ‘super ASIC’ into smaller chiplets, but you’re not necessarily stuck with having to use the same process, and you can use that to your advantage,” he said. “If you have a large processor, you can use a 5nm or 3nm. If you have an analog/mixed signal, you can use a cheaper process that works better for that. And there may be specialized IP that’s only available in a very expensive node. If you only need that for one interface, why not just build that into a chiplet?”
Still, depending on the type of IP and the application, different types of adaptations are necessary, noted Andy Heinig, head of department for efficient electronics at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “The most important part is the connection of the IP to the die-to-die bridge. If you have a single core/accelerator or an FPGA, a direct connection via an AXI interface is possible. But here, too, it must be clear how the accelerator gets the data. If it is a row streaming interface, it is easy to transfer the data. But if a shared memory is used, the access to the memory must be defined. If there is a multi-core system, new system concepts must be established with local memory on the accelerator card to avoid unnecessary data transfers. It’s not only a problem of the IP itself, but maybe also for the whole chiplet. It must be clear who is responsible if something isn’t working in a complex and expensive chiplet system.”
Fig. 1: Different packaging technologies. Source: IEEE HIR/Fraunhofer IIS EAS
Fig. 2: 2D/3D-IC packaging approaches. Source: Samsung/Fraunhofer IIS EAS
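Heinig’s attachment question can be sketched as a small decision tree driven by the data-access pattern. The example below uses invented names and is not any actual design flow; it only captures the first-order branching he describes, not the coherency, bandwidth, and responsibility questions that follow.

```python
# Hypothetical illustration (invented names, not an actual design flow) of the
# attachment decision described above: how an IP block reaches the die-to-die
# bridge depends on how the accelerator gets its data and how many cores
# contend for it.
from enum import Enum

class DataAccess(Enum):
    STREAMING = "row/streaming interface"
    SHARED_MEMORY = "shared memory"

def attachment_plan(cores: int, access: DataAccess) -> str:
    if access is DataAccess.STREAMING:
        # Single accelerator or FPGA fed by a stream: a direct AXI connection
        # to the die-to-die bridge is enough.
        return "direct AXI connection to the die-to-die bridge"
    if cores == 1:
        # Shared memory with a single master: the memory access scheme
        # (address ranges, ordering, coherency) must be defined up front.
        return "AXI connection plus a defined shared-memory access scheme"
    # Multi-core system: add local memory on the accelerator side so traffic
    # does not cross the die-to-die link unnecessarily.
    return "local accelerator memory plus a shared-memory access scheme"

print(attachment_plan(cores=1, access=DataAccess.STREAMING))
print(attachment_plan(cores=4, access=DataAccess.SHARED_MEMORY))
```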
At the same time, chiplets raise a whole host of interesting business questions for IP vendors, much more so than technical ones, noted Steve Roddy, vice president of marketing at Quadric. “On the technical side, for pure digital logic IP (such as all processor IP), chiplets won’t have any impact on the design itself — i.e., processor core logic connects to on-chip interconnect — and whether that interconnect bridges to other on-chip elements or off-chip through I/Os of some flavor makes no difference to what the processor vendor delivers.”
Business challenges
On the other hand, the business side will be quite interesting, Roddy said. “The predominant pricing model in processor IP is license fee plus running royalty, where the royalty is most often calculated as a percentage of the average selling price of the semiconductor product. Back in the day when 95% of the business for IP companies was selling to semiconductor vendors who sold single packaged die to system OEMs, the percentage ASP model was well understood and well accepted. But what if what previously was a $20 packaged, tested final piece price with a single die suddenly becomes a $20 multi-die module on a substrate? Where does that percentage of ASP get calculated? Does the calculation happen on the digital chiplet price or the final assembled multi-chiplet package part? What about situations where the licensee sells only the chiplet and someone else integrates multiple chiplets into the final assembly?”
A further wrinkle is that the IP business worked, historically, because the IP block is one of many such pieces of IP assembled into a larger chip, Roddy said. “Processor Vendor X and Silicon Company Y were able to agree on, for instance, a 3% ASP royalty largely because the IP core occupied only 3% or 5% or 10% of the die area of the final chip. If chiplets suddenly change the equation to where a chiplet consists solely of a single IP, and for instance a chip assembler builds a module from one or more CPU chiplets plus a DSP chiplet plus a GPNPU chiplet plus a wireless baseband chiplet, then the entire value of a specific chiplet is mostly the realization of that single IP block into silicon. IP vendors won’t sit idle and accept 3% of the revenue from a chiplet that contains only that one vendor’s IP. The IP vendor will have tremendous incentive to transition into a chiplet supplier directly, not an IP supplier. The design content and engineering effort are almost the same, but the delivery vehicle is changing. The more that chiplet interconnect and assembly technology standardizes across the industry, the faster that transition could occur.”
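A back-of-the-envelope calculation shows why the question matters. The sketch below reuses the 3% rate and $20 piece price from Roddy’s example; the $5 price for the digital chiplet is an assumption added here for illustration only.

```python
# Back-of-the-envelope royalty comparison. The 3% rate and the $20 piece price
# come from the example above; the $5 chiplet price is an assumption added
# here purely for illustration.
ROYALTY_RATE = 0.03

monolithic_asp = 20.00   # old model: one packaged die sold for $20
module_asp = 20.00       # new model: $20 multi-chiplet module on a substrate
ip_chiplet_asp = 5.00    # assumed price of the digital chiplet carrying the IP

print(f"royalty on the monolithic part:      ${ROYALTY_RATE * monolithic_asp:.2f}")
print(f"royalty computed on the full module: ${ROYALTY_RATE * module_asp:.2f}")
print(f"royalty computed on the chiplet:     ${ROYALTY_RATE * ip_chiplet_asp:.2f}")
# If the chiplet is essentially one vendor's IP rendered in silicon, 3% of its
# ASP is hard to square with the value delivered.
```

On these assumed numbers, taking the royalty on the chiplet rather than the finished module cuts the payment from $0.60 to $0.15, even though that chiplet may be little more than the vendor’s IP rendered in silicon, which is precisely the tension that pushes IP vendors toward supplying chiplets outright.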
This also means companies that used to get away with just delivering code now have to actually deliver silicon, observed Rich Goldman, director at Ansys. “They’ve got to understand all the tape-out issues, and that it’s actually got to work. A second consideration is they no longer have to rely on royalties and extracting royalties from the users and from the foundries. They now can sell actual silicon. Royalties are an age-old problem, as old as IP itself, and one that has never really been solved except by Arm — but only because Arm had such a central position. From previous experience, royalties were something that we tried to stay away from because it sounds like a great thing, and you can negotiate royalties, but as soon as you’re very successful in quantities, your customer will renegotiate those royalties down so the payoff is never really there, with the exception of Arm. With actual silicon, you get to participate in the volume part of the sales.”
Mastroianni maintains that in order for chiplets to become mainstream, there will need to be reusable chiplets. “It’s coming, but it’s a very different business model. The likely candidates who are going to do this are IP providers. It could be the PHY guys, because they can build high-speed SerDes chiplets. Or it could be processor type companies. In any case, it’s a pretty big transition because they’re licensing IP right now for integration within a package. The difference here is, there’s a couple of models that they can take. One is they can do a test chip. Several IP companies have to do test chips anyhow, so they can design a general-purpose chiplet with their IP. They’ll do a test chip, and they could just license that. But then the customer would have to go off and basically do the fab, do the testing, etc. Another model would be to get into the silicon business. In that case, they would assume all that operations type stuff, from package and testing to manufacturing, and distribution of known good die. If they take that route, that would be more of a traditional piece part type, and there could be a license on top of that as well. The business model still remains to be seen.”
Mixel is one of those IP companies that uses test chips. Ashraf Takla, CEO of Mixel, looks at the chiplet opportunity either as an IP-style approach, or as one in which its test chips are converted into chiplets. “Our test chips, like the C/D-PHY Combo, are being acquired by many companies in the hundreds or thousands, because test equipment companies, for example, need to test C-PHY and don’t have an FPGA solution or some way to implement that on the tester side. They buy our chips for that purpose, so we’re already selling test chips in low volume. If it makes sense, we will make it into a chiplet.”
Takla said it may not make sense to have many different chiplets that are similar to each other, so a CSI transmitter could be one chiplet. “These chiplets also can be programmable. For instance, how many lanes are needed? Is it CSI or DSI? A lot of things can be programmed, just like it’s programmed in the IP, and maybe you can do even more on top of that. It’s not very different from the IP business. Everybody’s trying to re-use and do things in a way that you don’t have to customize it. For the chiplet business, it’s going to be more so.”
Big changes ahead
Mick Posner, product line senior group director for IP at Synopsys, said in the short term, chiplets need enough volume to be able to address the complexity they bring, whether that involves tools or interfaces such as UCIe/XSR (extra short reach). “In the short term, everyone wants to do chiplets. There is a new protocol. Can you remember the last inflection point of an introduction of a significant new protocol that’s applicable across all market segments? No, and that’s why you see 100,000 different customers trying to get into this. We’re more than doubling down because it’s an interface that could be on every chip. Long term, there are going to be some challenges because it’s the ultimate of reuse, along with the business model. Typical IP sales are based on a per-project basis. With chiplets, what do you now define as a project? If that die is used across 10 different chips, is that 10 uses? Is that one use? And customers know this very well. They see it as ‘I’ve just produced a piece of silicon. You never put any kind of restrictions on my silicon before, outside of royalty models.’ But when it comes down to the license usage, that potentially could have an impact on the whole industry.”
Posner believes “chiplet” is an overloaded term. “It has a connotation that it’s this tiny little chip. But I can guarantee, from the 100+ opportunities that we’re tracking, a chiplet is a 50 x 50mm die. It’s just that there are now two or four of them in a package. Chiplets are fantastic for mixing and matching your nodes. It’s the whole notion that you can mix and match different die.”
That said, is it possible to combine chiplets from different foundries in the same package?
Siemens’ Mastroianni said this is a challenge. “You have to worry about standards and making sure you get all the right voltages. Even if it’s from the same foundry you have to worry, because the chips will be coming from different lots. By definition, they’re two different chips, so you have different corners. If they’re coming from different processes, that makes it even a little more challenging, but it still needs to be dealt with. A lot of that is handled through the die-to-die interfaces, almost like a SerDes interface, which kind of decouples it. Those interfaces are designed to be like that. It’s more of an issue on those other signals. For instance, the low speed I/O may need some connection, so you have to worry about it. But typically, those interfaces are not as critical. And the high speed I/O are covered pretty much through those standard protocols.”
Flexible partitioned functionality can help make chiplet architectures more modular and scalable. “At the same time, chiplets make it possible to optimize some design processes,” Cadence’s Hung Fung explained. “Choices in process nodes, foundries, a wide range of package types, and IP types, etc. are made possible through chiplet flexibility. For portable and modular chiplet systems, die can be re-used and can also interoperate across the various process nodes in a die-to-die fashion. Digital logic circuits are far better at scaling down in geometry than analog circuits, RF, and memory. A cost-effective way to achieve a reduced die size and higher yield is by dividing the analog/digital domain into separate areas, re-using the analog, and scaling down the digital.”
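The yield argument behind splitting the analog and digital domains can be made concrete with a simple Poisson yield model. The defect densities and die areas below are assumptions chosen for illustration, not figures from Cadence or the article.

```python
# Illustrative-only sketch: a simple Poisson yield model, yield = exp(-D0 * A),
# with assumed defect densities and die areas (not figures from the article).
# Smaller dies yield better, and the reused analog die can stay on a mature,
# lower-defect-density node.
import math

def poisson_yield(area_cm2: float, d0_per_cm2: float) -> float:
    """Fraction of dies expected to be defect-free."""
    return math.exp(-d0_per_cm2 * area_cm2)

D0_ADVANCED = 0.10   # assumed defects/cm^2 on the advanced node
D0_MATURE = 0.05     # assumed defects/cm^2 on the mature node

monolithic = poisson_yield(4.0, D0_ADVANCED)       # everything on one 4 cm^2 die
digital_chiplet = poisson_yield(3.0, D0_ADVANCED)  # scaled-down digital portion
analog_chiplet = poisson_yield(1.0, D0_MATURE)     # reused analog portion

print(f"monolithic die yield:  {monolithic:.1%}")
print(f"digital chiplet yield: {digital_chiplet:.1%}")
print(f"analog chiplet yield:  {analog_chiplet:.1%}")
```

On these assumed numbers the digital piece yields noticeably better than the monolithic die, and the analog piece better still, and because each die can be tested individually as known good die before assembly, the lower-yielding giant die never has to be built at all.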
Then, when there is a need for variation in the common denominator function, IP reusability is relevant.
“The port speeds, for instance, could be 100G, 200G, or 400G. However, scaling of the same chiplet would reuse the 100G IP and scale it to 800G, which wasn’t accessible in the earlier chiplet design, when a future-looking need for 800G comes into play,” she said. “The reuse of IP is an important factor in chiplet design here. For example, there can be a case that it is necessary to redesign the chiplet to handle an 800G bandwidth. The area, power, bump pitch, etc. may all have a possible impact to incorporate the new bandwidth updates needed. However, the basic functionality of the IP and the circuitry may be re-used to achieve a higher bandwidth.”
Conclusion
Monolithic die are being decomposed into functional blocks, and those functional blocks are hardened. The challenges will be putting systems together in packages using those components in a standardized way, and then re-using those components in other systems as needed. This has been talked about for years, but the industry appears to be finally on board with this concept.