Chiplet Interconnects Add Power And Signal Integrity Issues

More choices, interactions, and tiny dimensions create huge headaches.


The flexibility and scalability offered by chiplets make them an increasingly attractive choice over planar SoCs, but the rollout of increasingly heterogeneous assemblies adds a variety of new challenges around the processing and movement of data.

Nearly all of the chiplets in use today were designed in-house by large systems companies and IDMs. Going forward, third-party chiplets will begin showing up in designs, as well, including some made by different foundries, and some pre-integrated with other chiplets into subsystems. This will significantly change the integration challenges, and it will require flexibility in the tools, assembly processes, and testing strategies, as well as the designs themselves.

“The dirty secret of chiplets is the packaging technology,” said Mick Posner, Synopsys vice president of product management for IP solutions. “We have to deploy across multiple foundries, so there are multiple solutions per foundry, and then multiple packaging technologies. In addition, there are different ecosystems. Automotive has its own ecosystems, and they’re telling us, ‘You shall use this technology.’ The bottom line is that metrics are the key.”

There are a number of options available for connecting chiplets to each other, and to various memories, and complex architectures for moving data. While many of these approaches are silicon-proven individually, they also require an enormous amount of engineering to ensure everything works together as expected.

“Connecting chiplets to chiplets is really not that much different than connecting components together on a printed circuit board or die on an organic substrate,” said Brad Griffin, a group director of product management at Cadence. “It’s just that everything’s getting smaller, and everything’s getting faster.”

It’s also becoming more diverse. Packaging options include everything from 2.5D and 3D-ICs, to a combination of both, and they can be assembled from the bottom up monolithically or combined on a larger scale with panel-level packaging. Each has benefits and limitations, and the more heterogeneous the design, the more challenging it is to sort through all the possible options and potential dependencies. This is why some vendors are forging a middle ground, at least until the chiplet world is better defined and more standards are in place for connecting them together.

For example, there is more than one way to move data across a fixed distance. In some cases, this could involve a pre-built, pre-integrated module, which is the path Arm is taking initially. In this case, the interconnects are well defined and so are the processing cores.

“These are compute subsystems,” said Christopher Rumpf, senior director for automotive at Arm. “The workload and the software is where the differentiation is happening. So we will still sell IP products, but now we are assembling them into larger subsystems. They are standardized compute platforms, which lowers the cost of porting and validation. It’s a precursor for the chiplet world.”

Similarly, Eliyan has taken some of the guesswork out of the ideal PHY. “One way to go is a higher data rate, higher baud rate,” said Ramin Farjadrad, CEO of Eliyan. “If you run an NRZ (non-return to zero), it’s still ones and zeros, but you run it faster. You also can achieve that speed with a simultaneous bi-directional approach. If you have 100 lanes, the standard approach is to use 50 of them to transmit and 50 of them to receive. So these 100 lanes will take a certain amount of beachfront on the IC. But if you use all 100 lanes to transmit, and all of them to receive, then you have 100 transmit lanes and 100 receive lanes, which doubles your bandwidth.”
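
The bandwidth arithmetic Farjadrad describes can be sketched directly. The lane count and per-lane rate below are illustrative placeholders, not Eliyan's actual numbers:

```python
def aggregate_bandwidth(lanes, gbps_per_lane, bidirectional):
    """Return (tx_gbps, rx_gbps) for a fixed number of physical lanes."""
    if bidirectional:
        # Simultaneous bidirectional: every lane carries traffic both ways,
        # so the same beachfront transmits and receives at full width.
        return lanes * gbps_per_lane, lanes * gbps_per_lane
    # Conventional: split the lanes between transmit and receive.
    half = lanes // 2
    return half * gbps_per_lane, half * gbps_per_lane

print(aggregate_bandwidth(100, 32, bidirectional=False))  # (1600, 1600)
print(aggregate_bandwidth(100, 32, bidirectional=True))   # (3200, 3200)
```

For the same 100 lanes of beachfront, the simultaneous bi-directional scheme doubles bandwidth in each direction.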

Interconnect issues
Putting this in perspective, chipmakers are wrestling with a host of factors, in different combinations. Performance always is dictated by architecture and workloads, but in advanced packages even tiny differences in the speed at which data is processed and moved can cause disruptions. Those variations can be caused by noise, traffic, or uneven circuit aging. They also can be the result of disparities in tolerances due to chiplets developed using different process technologies. For example, tolerances at 5nm are significantly tighter than those at 90nm.

But regardless of the process node or the packaging type, data needs to remain intact. The thinner the wires, the higher the resistance and capacitance. That impacts the amount of power required to drive signals through those wires, and the amount of heat generated in doing so. There are more transistors to process data, more data to process, and more movement of that data between processors, memories, and I/Os. However, signals still need to arrive at their destination at exactly the right moment to maximize performance.
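
The RC scaling described above can be shown with a back-of-the-envelope estimate. The resistivity, dimensions, and capacitance here are illustrative only; real interconnect stacks add barrier layers and surface scattering that make thin wires even more resistive:

```python
RHO_CU = 1.7e-8  # ohm*m, bulk copper resistivity (illustrative)

def wire_rc_delay(length_m, width_m, thickness_m, cap_per_m):
    """0.5 * R * C Elmore estimate for a distributed RC wire."""
    resistance = RHO_CU * length_m / (width_m * thickness_m)
    capacitance = cap_per_m * length_m
    return 0.5 * resistance * capacitance

# Halving the wire width doubles resistance, and hence RC delay,
# if capacitance per unit length stays roughly constant.
wide = wire_rc_delay(1e-3, 1.0e-6, 1e-6, 2e-10)
narrow = wire_rc_delay(1e-3, 0.5e-6, 1e-6, 2e-10)
print(narrow / wide)  # ~2.0
```

More drive power is then needed to push the signal through the thinner wire in the same time, which is exactly the power/heat tradeoff described above.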

This is well understood by companies working with chiplets today, and fortunately there is no shortage of solutions. The real problem is understanding the myriad tradeoffs and best combinations for a particular workload or application. A key factor to be considered in all of this is the distance signals need to travel. This requires both partitioning and prioritization of functions, and it affects the overall architecture of any system-in-package. Some wires can be fatter to reduce RC delay, and some signals can be prioritized using the z axis, where shorter interconnects can improve performance, reduce power and area, and minimize the amount of heat that needs to be dissipated.

“Your reach typically is limited to about three millimeters, but even then you have significant cross-talk issues because you’re talking 45nm and less spacing,” Griffin said. “Then you have manufacturability issues, such as physical broken links or degraded links. And then you’ve got general thermal issues.”

Still, the push for smaller and faster never ends. While there’s no Moore’s Law for chiplet interconnects, they must handle increasing volumes of data passing between the modular dies with the same integrity as chips developed at mature nodes. The problem is this conflicts with the need for more CPU and GPU chiplets, which collectively require more energy, generate more heat, and need increasingly complex thermal management schemes.

More interactions
The number of possible interactions with data movement continues to increase as more features are added into packages. The buildup of electrostatic charges at the interconnect level is one such example.

“What we’re seeing when we put these chiplets together is signal loss, misalignment, and interconnects that are inadequate for the task they were designed for,” said Matthew Hogan, product management director at Siemens EDA. “This is particularly true when we have a look at electrostatic discharge susceptibility. When you start putting these chiplets together, you need to make sure that the interposer or the interconnect fabric that you’re using has sufficient interconnect robustness. It must, first of all, get signals through without having too much signal loss, which means it’s low resistance. You also need to make sure it has sufficiently low resistance to protect the system from electrostatic discharge events. We have to make sure they have good design methodologies to preserve the signal integrity and power integrity of that whole system, keeping in mind that each of these dies can be a vastly different process.”

While advanced fabrication techniques have allowed ICs to continue following Moore’s Law, there is no such law for interconnects. “These interconnects between chiplets are pretty short compared to PCB standards, but from a monolithic point of view, they’re still quite long,” said Marc Swinnen, director of product marketing at Ansys. “If you want to have a high-speed connection over this long wire, that makes it a transmission line, and so it’s well understood that all these wires, these chip-to-chip connections on a chip, on an interposer, have to be electromagnetically simulated, so you can’t just do a good old-fashioned RC. On-chip, you’re fine with an RC extraction. It’s accurate enough because the wires are so short. But once you get from chiplet to chiplet, the wires are quite long. They can be inches long, that makes them transmission lines at those frequencies. And so you need to do a full resistance, capacitance, and inductance extraction to model those wires properly.”
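
Swinnen's distinction between short on-chip wires and long chiplet-to-chiplet wires can be captured with a common rule of thumb: a wire needs transmission-line treatment once it is longer than roughly a tenth of the signal wavelength in the dielectric. The 10% threshold, frequency, and dielectric constant below are illustrative assumptions, not figures from the source:

```python
C0 = 3.0e8  # speed of light in vacuum, m/s

def needs_tline_model(length_m, freq_hz, eps_r=4.0, fraction=0.1):
    """Rule of thumb: treat a wire as a transmission line once it is
    longer than ~10% of the signal wavelength in the dielectric."""
    wavelength_m = C0 / (freq_hz * eps_r ** 0.5)
    return length_m > fraction * wavelength_m

# A short on-chip wire vs. an inch-long interposer trace at 16 GHz.
print(needs_tline_model(100e-6, 16e9))  # False: RC extraction suffices
print(needs_tline_model(25e-3, 16e9))   # True: needs full RLC / EM modeling
```

This is why RC extraction is "accurate enough" on-chip but chip-to-chip links require full resistance, capacitance, and inductance extraction.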

As the interconnects get smaller, other problems crop up, as well. Assemblies need to align on every level.

“It has to align in three dimensions,” Swinnen said. “You can rotate them versus each other. They can be tilted. There are all sorts of solutions, but you have to get very, very careful alignment at these resolutions to get those things aligned. That’s one of the reasons you see the foundries getting involved more and more and more, because they’re familiar with these very, very fine tolerances.”

Power presents a unique challenge
While maintaining the integrity and speed of data flowing between chiplets is complex, arguably the bigger challenge is dynamic power density.

“Certain companies already are doing some level of 3D, and we know, of course, HBM is a 3D technology where it’s stacked,” said Synopsys’ Posner. “When I talk about 3D prototypes, I’m talking logic-to-logic stacking, because that is where power density becomes a significant issue. The challenge is not the power consumption of just that top die. Your overall power profile is the consumption of the bottom die plus the top die. You’re sending that power up through hybrid bonds or through-silicon vias. Power density at the moment is a huge challenge because designs haven’t taken into account that they may need to spread the power and grounds out.”
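
Posner's point about logic-on-logic stacking reduces to a simple observation: the stacked dies share one thermal footprint, so their powers add. A minimal sketch, with made-up die powers and footprint:

```python
def stack_power_density(die_powers_w, footprint_mm2):
    """Power density when stacked dies dissipate through one shared footprint.

    In a logic-on-logic stack, the bottom die must also pass the top
    die's supply current up through hybrid bonds or TSVs, so the
    package sees the sum of all die powers over a single footprint.
    """
    return sum(die_powers_w) / footprint_mm2

bottom_only = stack_power_density([30.0], 100.0)        # 0.3 W/mm^2
stacked = stack_power_density([30.0, 20.0], 100.0)      # 0.5 W/mm^2
print(bottom_only, stacked)
```

The overall power profile grows while the footprint does not, which is why power and ground distribution may need to be spread out in designs that were not planned for stacking.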

“A power supply somewhere that’s bringing us that power may go through various different voltage converters,” said Cadence’s Griffin. “Let’s say it starts out at a very high voltage, and it goes down to 1.1 volts of power to the chiplet. We have to make sure that’s all done very carefully. That’s now a system-level problem. With a PCB, the power came from the voltage source, and then it’s just delivered to the board. But as soon as you go through a board, a package, an interposer, all the way to a chiplet, you’ve got all these different places for discontinuities that can impact the overall power integrity.”
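
The system-level concern Griffin raises can be approximated as a chain of series resistances between the regulator and the die, each stage of the path contributing IR drop. The resistances, current, and tolerance band below are hypothetical numbers for illustration:

```python
def voltage_at_die(v_source, current_a, series_resistances_mohm):
    """DC IR-drop estimate through a chain of supply-path resistances.

    Each entry models one stage of the delivery path (board plane,
    package, interposer); values in milliohms are hypothetical.
    """
    drop_v = current_a * sum(series_resistances_mohm) * 1e-3
    return v_source - drop_v

# 1.1 V rail, 20 A load; board + package + interposer resistances.
v_die = voltage_at_die(1.1, 20.0, [0.5, 0.8, 1.2])
within_spec = v_die >= 1.1 * 0.95  # inside a -5% tolerance band?
print(v_die, within_spec)
```

Each boundary crossing (board to package, package to interposer, interposer to chiplet) adds another discontinuity where drop and noise can accumulate, which is why power integrity has to be verified across the whole chain rather than per component.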

The challenge is verifying that the power throughout the chiplet interconnect is stable and efficient, Griffin said. But that requires a system-level check of power integrity.

Standards come up short
While individual companies are working on more efficient interconnects, the long-term goal is a standard that can enable the development of a plug-and-play chiplet economy. The most recent UCIe specification, released in August, offers support for 3D packaging that enhances both power efficiency and bandwidth density.

“There are a few other standards about how to interconnect these chips, which will be important if you ever want to make a chiplet economy,” Ansys’ Swinnen said. “You’re going to have to agree on how these chips are going to communicate with each other at the physical level. UCIe has a low-speed and a high-speed version of the spec, and the high-speed version allows wires to be only two or three millimeters long.”

While the UCIe does standardize some electrical and physical links, Posner observed that in some regards, it “doesn’t go far enough.” The UCIe is lacking in “protocols across multi-die, and then just pure function. The whole Jelly Bean approach breaks down when you can’t manufacture anything that would have that level of scalability.”

New solutions
The rise of chiplets has necessitated the creation of new tools that are capable of tackling the challenges of heterogeneous packaging.

“When we start connecting chiplets together on silicon, some of the tools that we have that allow us to do things faster on boards and organic substrate packages, some of those assumptions may not apply,” said Griffin. “We face a situation where we need to use tools that don’t have all those assumptions built into them. What that means is we’re having to stress the simulation resources, because instead of using a 2D solver or a hybrid solver, we’re having to use a full-wave 3D field solver. You’ll get good, accurate results because that type of solver doesn’t make many, if any, assumptions.”

The tradeoff, however, is the process takes much longer. “If you’re doing a chiplet, you’re doing the 3D-IC with dozens of chiplets being connected together, and you’ve got multiple UCIe interfaces connecting together, there’s a lot to simulate,” he said. “And you can’t necessarily afford the time that you would have to use to be able to do that 3D, full-wave simulation.”

Complications such as degassing holes affect a chiplet’s overall electromagnetics, and accounting for them can require even more time. EDA companies currently are working on implementing silicon interposers into implementation tools, which allow 3D solvers to be run on small pieces of data. That shortens time to results, as well as the time needed for compliance testing.

There also are a number of powerful electromagnetic simulators and modelers on the market, but those come with some caveats. “The key to electromagnetic analysis is the meshing,” Swinnen said. “All these tools use a finite element analysis, and so you have to build a mesh, and it’s always a tradeoff. The finer mesh is accurate but slow. A coarse mesh is fast, but it makes compromises on accuracy. You need adaptive meshing and variable meshing. In a lot of the technology, the secret sauce is actually in the meshing and how this is put together.”
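
The adaptive-meshing tradeoff Swinnen describes can be sketched in one dimension: bisect only the intervals where a local error estimate exceeds a tolerance, and keep the mesh coarse everywhere else. This toy refiner illustrates the concept only; it is not any vendor's algorithm:

```python
def refine(mesh, error_fn, tol):
    """One pass of adaptive refinement over a 1-D mesh of (a, b) intervals.

    Intervals whose estimated error exceeds `tol` are bisected; the
    rest stay coarse. Real field solvers do this in 3-D with far more
    sophisticated error estimators, but the tradeoff is the same:
    fine where accuracy demands it, coarse where speed allows.
    """
    refined = []
    for a, b in mesh:
        if error_fn(a, b) > tol:
            mid = 0.5 * (a + b)
            refined += [(a, mid), (mid, b)]
        else:
            refined.append((a, b))
    return refined

# Only the interval covering the "hot spot" at x < 0.5 gets subdivided.
mesh = [(0.0, 1.0), (1.0, 2.0)]
hot = lambda a, b: 1.0 if a < 0.5 else 0.0
print(refine(mesh, hot, tol=0.5))  # [(0.0, 0.5), (0.5, 1.0), (1.0, 2.0)]
```

Iterating this pass until every interval meets the tolerance is the essence of adaptive meshing: accuracy where the field varies rapidly, speed where it does not.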

Moreover, these solutions cannot be implemented in a vacuum. Changes made to the interconnect are “not transparent, are not invisible to the design,” Posner said. “It’s a conscious partition of the design. That’s the unicorn — a multi-die design where the links between the dies are just transparent to the designer. But in today’s technology and what’s manufacturable, there has to be a conscious design going into mapping your design across multi-die.”

Conclusion
Chiplets are quickly becoming an integral part of the high-performance compute world, thanks to their versatility and scalability, but utilizing them comes with some unique challenges involving signal integrity, power integrity, and integration.

While such standards as UCIe are already in place, in some ways they fail to go far enough in dictating functionality and protocols across numerous dies. Alongside that, the complexity of chiplet interconnects makes verification and compliance testing a lengthy process.

New tools already have begun to spring up, which allow designers to quickly isolate and test discrete blocks of data. That can save enormous amounts of time. However, working on interconnects cannot be done in a vacuum, and their design must be done while keeping in mind the requirements of the larger system in which they’ll function.

— Ed Sperling contributed to this report.
