Experts at the Table: What are the limitations today that are preventing 3D-ICs from becoming mainstream, and which companies are pushing to make it happen?
Semiconductor Engineering sat down to discuss changes required throughout the ecosystem to support three-dimensional (3D) chip design with Norman Chang, chief technologist for the Semiconductor Business Unit of ANSYS; John Park, product management director for IC packaging and cross-platform solutions at Cadence; John Ferguson, director of marketing for DRC applications at Mentor, a Siemens Business; Kevin Yee, foundry director for Samsung Semiconductor; and Bapi Vinnakota, director for silicon architecture program management at Netronome and Open Domain-Specific Architecture (ODSA) subproject lead for the Open Compute Project (OCP). What follows are excerpts of that conversation.
SE: It seems we are on the cusp of perhaps the biggest disruption in the semiconductor space, and yet a thousand minor issues must be resolved to make it practical. Where do you see the technology today, and what are the most significant issues?
Vinnakota: Within the ODSA, we are assuming that what people want from chiplets is interoperability. You want chiplets from multiple companies to interoperate. The biggest barrier to getting interoperability is the interface—how do they talk to each other? We figured out along the way that you need a network between them. There is the physical technology they use to talk to each other—the PHY layer, which could be a massively parallel bus or high-speed serial buses. Then there are the logical transactions between chiplets—do you want it to be CCIX or TileLink? Or what kind of non-coherent protocol do you want? That is where there is a lot of delta, but in between is a lot of room for agreement. The packaging seems to have boiled down to one of two alternatives—either you are silicon-interposer-based or you are on an organic substrate. Other technologies are catching up, but fundamentally it is getting the details of interoperability right.
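As a hypothetical sketch of the layering Vinnakota describes, the interface splits into a physical layer (how bits move between dies) and a transaction layer (what the bits mean). All class and method names below are invented for illustration; the ODSA does not define this API.

```python
# Hypothetical sketch of a layered chiplet interface. Names are invented
# for illustration only; this is not an ODSA-defined API.
from abc import ABC, abstractmethod

class PhyLayer(ABC):
    """Moves raw bits between chiplets, e.g. a massively parallel bus
    or a high-speed SerDes link."""
    @abstractmethod
    def send(self, bits: bytes) -> None: ...
    @abstractmethod
    def receive(self) -> bytes: ...

class TransactionLayer(ABC):
    """Gives meaning to the bits, e.g. coherent (CCIX- or TileLink-style)
    or non-coherent reads and writes."""
    def __init__(self, phy: PhyLayer):
        self.phy = phy  # any transaction layer can ride on any PHY
    @abstractmethod
    def read(self, addr: int, length: int) -> bytes: ...
    @abstractmethod
    def write(self, addr: int, data: bytes) -> None: ...
```

The point of the split is that agreeing on the contract between the two layers leaves room for vendors to differ on both the PHY and the transaction protocol while their chiplets still interoperate.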
Park: If we are talking about true 3D integration, there are tons of gaps, depending upon how you want to design the chips in the stack. If you want to design the chips independently in two different tools and then glue them together, those tools and flows are there to verify the stacks and the dies. But now people want to co-design the two dies in the stack. They want to concurrently design the bottom chip and the top chip. The traditional packaging tools in the industry cannot do that. Traditional implementation tools work with an abstract representation of the two chips, so you are just looking at sticking two black boxes together. You cannot see the internals of either chip, so you cannot make any real, smart decisions about placement, floorplanning, routing, etc. You just don’t have that ability. Even though we call it packaging, it requires different design tools. These are not the old PCB-style design tools. These are the tools we use for IC design. Two chips, two PDKs, two tech files, billions of instances, and then people want to do route resource sharing. If I run out of routing resources on the bottom chip, I may want to go up into the top chip, use some routing resources there, and then come back down to the bottom chip. You have to have the full representation of both chips together in that environment. For people who want to do concurrent layout of both chips in the stack, everything from place-and-route through timing, the actual physical layout requires you to have both chips at the full transistor level in the same layout tool, and not a lot of people are doing that right now because of the limitations that exist today. So there are big gaps when it comes to concurrently designing the bottom chip and the top chip in the stack. I don’t think there are any challenges if we say I am going to design two chips and figure out how to glue them together—we can do that already. But closing timing between the two chips, and finding route resource sharing when I want to shift a block from the bottom to the top, still needs work.
Ferguson: It’s determining which pieces go onto which chip, and where on each chip, so that they don’t interfere with the pieces on the other chip.
Park: We even have one customer that wants dynamic, hybrid bond pad placement. They have a netlist that ties both chips together, and they want the router to find where to place the copper-to-copper connection between the top and bottom chips. They want the connection point, the bond pad, automatically inserted there. So for true 3D design there are big gaps.
Yee: I agree. The challenge is really evolving. Technology-wise, we are already doing [true 3D design] with high-bandwidth memory (HBM), and for memory we are already doing 3D. The fundamental basics are there. Now that we are able to start doing it, the challenge is optimizing the flow: optimizing how we do it, be it wafer-to-wafer, wafer-to-die, or die-to-die. People are trying to do different things, stack it differently. How high can I go? This changes everything with the tools, from a testing point of view, from partitioning, from place-and-route, handling thermal, modeling it—it all changes now from what we are used to doing. So the fundamentals are there, but now that we have proven we can do it and people are doing it, the big gap is that everyone is coming up with new ideas about how they want to do it, which puts a bigger load on all of the tools. How do you manage that? That is where the million cuts are coming from. Before, it was just memory, and people are doing that. But now I am going to stack memory higher. I am going to do logic on logic. That starts to complicate things, because that is innovation. For 3D, we are going vertical instead of horizontal, and while it saves a lot, not all of the tools are in place to handle the flexibility that designers are looking for.
Chang: I concur that we are lagging a little behind in terms of design capabilities for 3D-IC design, but if you look at the customers, quite a few designs are going on. I know of about 30 designs underway in 3D-IC. So people are trying different configurations of design—logic-on-logic, logic with memory close by, and many technologies such as InFO PoP, etc. We just announced certification with TSMC for SoIC (System-on-Integrated-Chips). However, the customers are facing challenges. They can put together the design with Cadence or Synopsys tools, but they do not have full confidence about the reliability aspects. When you put two high-powered chips on top of memory, will the memory hotspot shift because of the SoC thermal hotspot? So the interactions between the dies in the stack are challenging. There are many challenges in addition to thermal integrity, such as electromagnetic challenges: capacitive coupling and inductive coupling between the dies. The distance between the dies is now so small that EM waves will go beyond the die boundary and penetrate into the other die, through the substrate and the metal, as capacitive and inductive coupling. This is one of the issues. People also place inductors for antenna designs, so you have an antenna together with a digital SoC or mixed-signal design, and they will interact with each other in terms of electromagnetic interference. Mechanical will become an issue, too. Extreme low-k dielectrics are susceptible to mechanical stress, and when you put together many dies—some people are putting more than 20 dies in a configuration—how do you make sure the mechanical stability of the substrate can hold all of these dies together?
Ferguson: There are a couple of other things that I see as challenges. One is testing. We have shown that we can put things together, we can build stacks, and they generally can be shown to work. But for any given die, even when you are done with testing, you still have maybe 3 parts per million that will fail. If you have 20 different dies in the same system, those failure rates multiply, so the odds of failure are pretty high. There are ways to test across those, but it comes back to having the right interfaces. You have to make sure that everyone has the right protocols in place so that the test signals can pass from the package I/O all the way to the top die and collect everything together. That is a challenge. We spent a couple of years working to put together a flow that shows that all of the pieces work, and that they work together. But the SoC world is a world of multiple vendors’ solutions. They want best-in-class tools, and we are not in that space yet. Everyone has their own solution, but if someone wants to use Synopsys for one piece and Cadence for another, theoretically we might be able to find a way to make it work, but nobody has really proven it. And who wants to be the first to try that? It is a big risk.
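Ferguson's compounding argument is easy to verify with a back-of-the-envelope calculation. The sketch below assumes test escapes are independent across dies and uses his 3 ppm per-die figure; real escape rates vary with die type and test coverage.

```python
# Back-of-the-envelope sketch of how per-die test escapes compound in a
# multi-die package. Assumes escapes are independent across dies and uses
# Ferguson's 3 ppm figure; both are illustrative assumptions.

p_die = 3e-6   # per-die post-test failure probability (3 ppm)
n_dies = 20    # dies assembled into one package

# The package fails if any one die is bad:
# P(fail) = 1 - P(all dies good) = 1 - (1 - p)^n, roughly n * p for small p.
p_package = 1 - (1 - p_die) ** n_dies

print(f"Package-level failure probability: {p_package:.2e}")  # ~6.00e-05
print(f"Roughly {p_package * 1e6:.0f} ppm, about 20x the per-die rate")
```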
SE: Presumably somebody is stepping up to be the first for everything. There was a first for HBM, and that did a lot of pipe cleaning. There will be many other firsts. Who is driving this? What kinds of applications are on the edge?
Vinnakota: We started looking at specific architectures—essentially driven by the fact that these workloads are running out of room on general-purpose CPUs. Our game in life was to make it easier to assemble custom, domain-specific architectures by combining chiplets from multiple vendors. There seem to be about four classes of workloads. One is networking in data centers, where there are tunnels and protocols and encryption and compression. Then there is storage. The granddaddy of them all is inferencing, and then learning, which fundamentally require two different architectures. Finally, there is image and vision processing. All of these are compute-intensive workloads, which are driving us toward array styles of architectures with a mix of memory and off-chip traffic. So we are talking about MAC arrays. In networking, you are looking at multiple lightweight threads. These are the workload drivers. You work back from them, and they tell you that you need a chip this big, and then you have to figure out how to build it.
Yee: The hesitation is that none of us can actually talk about which companies are pushing this technology. If you look historically at what started driving this, it was applications and market segments that required a lot of memory storage. That got us to 2.5D, and now we are moving to 3D. The industry talk right now concentrates on machine learning, AI, networking, and the data centers. Those are the markets that are initially driving it. The first limitation is that these are big dies. They are getting bigger and bigger, and they can’t go wider anymore because the dies are getting too big. So they have to figure out how to stack, and that is why we are talking about chiplets, that is why we are talking about accelerators, and that is why they are going vertical instead of horizontal. These are the people driving it right now.
Park: If we look at 3D from a packaging perspective, it is not new. We have been doing 3D and die-stacking-in-package for a decade—a system-in-package type of thing. What we are talking about here is different, though. This is true wafer-to-wafer stacking, and we have people doing the memory dies. They have been doing this for a while, so there are some flows around. CMOS image sensors have been doing 3D stacks for a while, too. What is new is the logic, on either logic or memory. TSMC’s SoIC is multiple chips on a wafer. Tons of customers are kicking the tires and calling all of us to ask how to do this and how to solve that. There is no one golden approach, so everybody has ideas about which tools to use to make it all happen. We are in the early stages.
Chang: Some companies already have made announcements. Why do they need to do that? Because, for example, an FPGA is a gigantic chip, and partitioning it into five to seven smaller chips makes it more efficient in terms of area, power, and performance, and maybe that increases the yield as well, because with a very large chip the yield goes down. They do need to figure out hotspots and the interconnection between the logic dies, especially if they partition a large die, and they need to make sure the connection between logic and logic, or logic and memory, is very short. Another customer has four HBMs together with a GPU, and they will be putting more HBM interfaces surrounding the GPU. For AI processing, a huge requirement is increased bandwidth between the GPU and the memory, so they have to be very close. These two are driving their own verticals.
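Chang's yield point can be made concrete with the classic Poisson defect-density model, in which die yield falls exponentially with area. The numbers below (defect density, die area, chiplet count) are illustrative assumptions, not figures from the discussion.

```python
import math

# Illustrative sketch of the yield argument for partitioning, using the
# simple Poisson model: yield = exp(-D0 * A). D0 and the die area are
# assumed values chosen for illustration, not data from the panel.

D0 = 0.1        # defect density in defects/cm^2 (assumed)
area = 8.0      # monolithic die area in cm^2 (assumed, FPGA-class)
n_parts = 5     # partition into five chiplets, per Chang's example

yield_monolithic = math.exp(-D0 * area)
# Each chiplet has 1/n the area. Bad chiplets are screened out at die
# sort, so the fraction of good silicon is set by the per-chiplet yield.
yield_chiplet = math.exp(-D0 * area / n_parts)

print(f"Monolithic die yield: {yield_monolithic:.1%}")  # ~44.9%
print(f"Per-chiplet yield:    {yield_chiplet:.1%}")     # ~85.2%
```

The gain comes from known-good-die testing: a defect kills only one small chiplet rather than the entire large die, so more of the wafer ends up in shippable parts.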
Park: They are both really 2.5D or CoWoS (TSMC’s chip-on-wafer-on-substrate), so that is not new. If you look at CoWoS, that was 10 years ago. There was a five-year gap where people were scrambling to find tools and flows to do this. Then we had people adopting the technology. Two summers ago, there was a big explosion of CoWoS, or 2.5D. Now, with the new 3D stuff, we are back to where those early adopters of 2.5D were 10 years ago. There will be a gap while people figure out tools and flows. Does it make business sense? All of these questions have to be dealt with.
Related Stories
New Technologies To Support 3D-ICs (part 2)
3D design and packaging is creating new demand on tools and requires a tight ecosystem and sharing across the industry.
Advanced Packaging Options Increase
But putting multiple chips into a package is still difficult and expensive.
3DIC Knowledge Center
Top stories, blogs, white papers, and more – all on 3DIC