Deciding what type of high-speed interface to use when connecting chiplets.
At CadenceLIVE Americas 2020, one of the most viewed videos was by Samsung Foundry’s Kevin Yee and Cadence’s Tom Wong, titled “Let’s Talk About Chips (Chiplets), Baby…It’s All About D2D!”
They went for this title because it reminded them of the lyrics of an ’80s song…which they proceeded to sing.
Tom led off with a look at the trends in semiconductor economics that are driving the transition to chiplets:
The graph on the right shows the cost per square mm as we ride down the process node roadmap. Of course, the costs have gone up; that is nothing new. The rule of thumb used to be that the costs would go up by 15% from node to node (for the same area) but you would get twice as many transistors in that area, leaving you a cost reduction per transistor of roughly 40% (1.15/2 ≈ 0.575× the cost). In recent nodes, the costs have gone up faster than that and the area scaling has slowed, so that the cost reduction per transistor is, depending on who you talk to, small or even negative (more expensive). This is the reality of Moore’s Law. This phenomenon has led chip designers to adopt a disaggregated approach to advanced SoC designs. By moving to a multi-die architecture and using advanced 2.5D packaging, you get the benefits of a smaller die size and the corresponding benefits of better yield at these advanced-process geometries. Many refer to this as More than Moore.
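The node-to-node arithmetic above can be sketched in a few lines. The 15%/2× figures are the rule-of-thumb numbers from the text; the "recent node" figures below are hypothetical placeholders, purely for illustration.

```python
# Node-to-node cost-per-transistor scaling sketch.
# Rule-of-thumb inputs are from the text; "recent" inputs are hypothetical.

def cost_per_transistor_scaling(area_cost_ratio, density_ratio):
    """Return the node-to-node cost-per-transistor ratio.

    area_cost_ratio: new cost per mm^2 / old cost per mm^2
    density_ratio:   new transistors per mm^2 / old transistors per mm^2
    """
    return area_cost_ratio / density_ratio

# Historical rule of thumb: cost/mm^2 up 15%, transistor density doubles.
historical = cost_per_transistor_scaling(1.15, 2.0)
print(f"historical: {historical:.3f}x cost per transistor")  # ~0.575x, i.e. ~40% cheaper

# Hypothetical recent node: cost/mm^2 up 40%, density only 1.4x better.
recent = cost_per_transistor_scaling(1.40, 1.4)
print(f"recent: {recent:.3f}x cost per transistor")  # ~1.0x: little or no savings
```

When the ratio approaches 1.0, shrinking the node no longer makes transistors cheaper, which is exactly the economic pressure driving disaggregation.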
The graph on the left shows how CPUs, GPUs, Ethernet switches, and so on are all getting larger. In fact, once you go to multi-core, the limitation on how many cores you get is pretty much the maximum number you can fit on a reticle. At least, it would be if yield were linear.
As I noticed myself at HOT CHIPS a year-and-a-half ago, we seem to be witnessing the end of monolithic integration for these very advanced chips. See my post HOT CHIPS: Chipletifying Designs with examples from AMD, Intel, NVIDIA, HP Enterprise, and more.
The combination of reticle limitations and yield challenges makes the chiplet approach attractive. But how do these chiplets communicate with each other? These are all high-speed chips that need high-bandwidth communication. There are basically two approaches: a serial interface and a parallel interface. The state of the art for serial interfaces is 112G USR/XSR (ultra-short reach, extra-short reach); for parallel, it is HBI (an offshoot of HBM) or BoW (bunch of wires). You can compare them in the table.
Kevin then took over to look at how you decide what type of interface to choose. Of course, there are more details than just serial versus parallel, but that is the biggest decision. The big considerations are overall bandwidth, energy, latency, and shoreline bandwidth (bandwidth across the edge of the chiplet). Another big consideration is whether it is a closed system: are you building both die, both sides of the communication link, or is another group, or even another company, building one of the die? In the latter case, you pretty much have to use a standardized interface, not something proprietary. This all has major implications for overall system parameters, such as process node, packaging, type of interposer, and so on.
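A first-pass trade-off between the two styles can be sketched as a back-of-the-envelope link budget. All the numbers below (lane counts, per-lane rates, pJ/bit efficiencies, edge lengths) are hypothetical placeholders, not figures from the talk; the point is the shape of the calculation, not the values.

```python
# Hypothetical D2D link-budget sketch: serial (few fast lanes) versus
# parallel (many slow wires). All numbers are illustrative only.

def link_power_mw(bandwidth_gbps, energy_pj_per_bit):
    """Link power in mW. Gb/s * pJ/bit -> mW (1e9 * 1e-12 = 1e-3 W)."""
    return bandwidth_gbps * energy_pj_per_bit

def shoreline_bandwidth(total_gbps, edge_mm):
    """Aggregate bandwidth per mm of die edge (shoreline), in Gb/s/mm."""
    return total_gbps / edge_mm

# Serial-style link: few lanes at a very high per-lane rate (e.g. 112G SerDes).
serial_bw = 8 * 112                      # 8 lanes x 112 Gb/s = 896 Gb/s

# Parallel-style link: many slower wires (BoW/HBI-like).
parallel_bw = 1024 * 4                   # 1024 wires x 4 Gb/s = 4096 Gb/s

# Assumed energy efficiencies (pJ/bit) and edge lengths (mm) -- placeholders.
print(link_power_mw(serial_bw, 2.0))     # serial link power, mW
print(link_power_mw(parallel_bw, 0.5))   # parallel link power, mW
print(shoreline_bandwidth(serial_bw, 2.5))    # serial Gb/s per mm of edge
print(shoreline_bandwidth(parallel_bw, 5.0))  # parallel Gb/s per mm of edge
```

Plugging in real vendor numbers for each interface turns this into the bandwidth/energy/shoreline comparison Kevin describes; the closed-versus-open-system question then narrows the choice to standardized interfaces where another team owns the far die.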
Samsung and Cadence have worked together on the 40G D2D on Samsung's 5LPE process. On the left, you can see how everything is implemented. On the right are the eye diagram and a photo of the actual test chip on its evaluation board.
Cadence IP enablement on Samsung foundry processes is broader than just 40G UltraLink D2D communications in 5nm. Cadence provides advanced memory IP and high-speed SerDes IP in various nodes.
Kevin wrapped up with a final summary: