Preparing For 3D-ICs

Why disaggregation of 2D chips is so complicated, and what’s missing from the tool chain to make it easier for design teams.


Experts at the Table: Semiconductor Engineering sat down to discuss the changes in design tools and methodologies needed for 3D-ICs, with Sooyong Kim, director and product specialist for 3D-IC at Ansys; Kenneth Larsen, product marketing director at Synopsys; Tony Mastroianni, advanced packaging solutions director at Siemens EDA; and Vinay Patwardhan, product management group director at Cadence. What follows are excerpts of that conversation. To view part one of this discussion, click here. Part two is here.

SE: How does 3D-IC change various processes in the design flow?

Kim: We do need to model properly for different stages. In addition, this needs to be applicable to any designer, not just chip designers. It needs to work for off-chip designers, as well. A key aspect of making that happen is to use an open, standard format. That won't work for everyone, but it will cover the majority of customers. With vertical stacking, there is never just one vendor involved. Even for optimizing simulation, there will need to be partnerships. The designers have their own flows, and they're not going to suddenly change those flows because they're developing a 3D-IC. So we have to solve practical problems first. There needs to be analysis before the optimization stage, such as early power estimation, thermal estimation, and stress estimation. We want to be able to predict that a design will be successful. One of the big problems is resource constraints, because a 3D-IC requires additional work and the design team's budget is not going to increase significantly. They don't have time to learn new tools from scratch. And if they're doing something like thermal simulation, where you have to consider airflow and computational fluid dynamics, they need to be able to run that outside their usual flow. We also need to make sure this is automated as much as possible. This still needs to be a user-friendly environment, because you need to be able to do analysis for many different types of designs at the same time.
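As a rough illustration of the early, pre-layout estimation Kim describes, the sketch below rolls per-die power budgets for a hypothetical two-die stack into a crude junction-temperature estimate using lumped thermal resistances. Every value and name in it is an assumption for illustration, not data from any foundry, tool, or real design.

```python
# Minimal sketch of an early (pre-layout) thermal estimate for a two-die stack.
# All values are hypothetical placeholders; a real flow would pull them from
# technology files and confirm them with a proper thermal solver.

# Hypothetical per-die power budgets in watts
dies = {
    "top_logic":    {"power_w": 12.0},
    "bottom_logic": {"power_w": 18.0},
}

# Crude lumped thermal resistances (degrees C per watt), assumed values
r_die_to_die_c_per_w = 0.15        # through the hybrid bond / dielectric between dies
r_stack_to_ambient_c_per_w = 0.50  # package plus heat-sink path
ambient_c = 45.0                   # assumed worst-case ambient temperature

total_power_w = sum(d["power_w"] for d in dies.values())

# Assumed orientation: the bottom die sits closer to the heat sink, so the top
# die sees the extra die-to-die resistance on top of the common path.
t_bottom_c = ambient_c + total_power_w * r_stack_to_ambient_c_per_w
t_top_c = t_bottom_c + dies["top_logic"]["power_w"] * r_die_to_die_c_per_w

print(f"Total stack power: {total_power_w:.1f} W")
print(f"Estimated bottom-die temperature: {t_bottom_c:.1f} C")
print(f"Estimated top-die temperature:    {t_top_c:.1f} C")
```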

SE: There are a number of physical effects that need to be considered — cross-talk, EMI, thermal, and uneven aging, to name a few. How do we capture that throughout the design process, because these issues can vary significantly from one design to the next, and even by use cases for the same design?

Larsen: You need to do early thermal analysis for planning, which may include TSVs (through-silicon vias) to help with the thermal challenges. You need to do this when you’re just starting your design, all the way down to cross-coupling effects when you stack die, including both thermal and timing. You also want to do extraction early in your analysis, and the analysis needs to begin much earlier than in the past. One of the major challenges for customers is they’re not necessarily used to these kinds of analyses. Lowering the barrier to entry and helping them speed up the learning curve to do this kind of analysis is really important. There are many new effects. Thermal and cross-talk are concerns when you have dies very close to each other. How do we deal with that? And how do we assimilate it? At the end of the day, you have to be able to say a design will work. There are new challenges that need to be addressed, and some of them are still open.
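One way to picture the early thermal planning Larsen mentions, including the use of TSVs to help with heat, is a back-of-the-envelope model of vertical thermal resistance. The sketch below, with assumed geometry and material values, shows how copper thermal TSVs in parallel with the silicon lower that resistance; a real flow would replace this with a proper thermal solver.

```python
# Sketch: effect of thermal TSVs on vertical thermal resistance during early
# floor planning. Geometry and material values are assumed, not from any PDK.
import math

K_SILICON = 120.0   # W/(m*K), approximate bulk silicon conductivity
K_COPPER = 390.0    # W/(m*K), approximate copper conductivity

die_area_m2 = (5e-3) ** 2    # assumed 5 mm x 5 mm thermal tile
die_thickness_m = 100e-6     # assumed 100 micron thinned die
tsv_diameter_m = 10e-6       # assumed 10 micron TSV
tsv_area_m2 = math.pi * (tsv_diameter_m / 2) ** 2

def vertical_resistance(num_tsvs: int) -> float:
    """Lumped vertical thermal resistance (K/W) of the tile, with the copper
    TSVs acting as a parallel path through the surrounding silicon."""
    si_area = die_area_m2 - num_tsvs * tsv_area_m2
    r_si = die_thickness_m / (K_SILICON * si_area)
    if num_tsvs == 0:
        return r_si
    r_tsv = die_thickness_m / (K_COPPER * num_tsvs * tsv_area_m2)
    # Parallel combination of the silicon path and the TSV path
    return 1.0 / (1.0 / r_si + 1.0 / r_tsv)

for n in (0, 100, 1000, 10000):
    print(f"{n:>6} thermal TSVs -> {vertical_resistance(n):.4f} K/W")
```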

SE: These are the same challenges we’ve had in the past, but now they need to be dealt with concurrently and in the context of other chips or chiplets, right?

Larsen: Yes, there is cross-coupling. So now you have to think across multiple dies.

Kim: Extraction becomes more complicated. Say you have one die and an interposer. Sometimes people design it with the package. In that case, can you extract between them? That's one problem. Now, if you have die side-by-side, or back-to-back, and you have transistors in between, how do we take care of the coupling? It's very challenging, and it's not just about thermal. It has to work with other analyses, as well. The tools are not 100% ready, but we're making sure the most challenging parts are addressed first, including whether any workarounds are possible. Complexity has increased a lot. A few years ago, there were about 10,000 interconnects. Now there are millions, and in a few years it will be more than a billion. And with coupling, if you have a TSV that is 200 microns away on a 100mm x 100mm chip, there are a lot of interfaces and a couple of bonds in between. The errors may be between those TSVs. We're trying to make sure at least the most challenging ones are addressed first.
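To make the TSV coupling concern concrete, the following sketch applies the textbook parallel-cylinder capacitance formula to a pair of neighboring TSVs. It ignores the oxide liner and depletion region of a real TSV, and the dimensions are assumed, so it is only a first-order screening estimate of the kind that might precede full extraction.

```python
# Sketch: first-order coupling capacitance between two neighboring TSVs,
# using the parallel-cylinder formula C/L = pi*eps / acosh(s / (2r)).
# Dimensions are assumed; a real TSV also has an oxide liner and a depletion
# region, so this is only an early screening number.
import math

EPS0 = 8.854e-12          # F/m, vacuum permittivity
EPS_SI = 11.7 * EPS0      # approximate permittivity of bulk silicon

tsv_radius_m = 5e-6       # assumed 10 micron diameter TSV
tsv_length_m = 100e-6     # assumed 100 micron tall TSV
pitch_m = 40e-6           # assumed center-to-center spacing

cap_per_m = math.pi * EPS_SI / math.acosh(pitch_m / (2 * tsv_radius_m))
coupling_cap_f = cap_per_m * tsv_length_m

print(f"Coupling capacitance per unit length: {cap_per_m * 1e12:.1f} pF/m")
print(f"Coupling capacitance for one TSV pair: {coupling_cap_f * 1e15:.1f} fF")
```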

Patwardhan: There are two styles of designs. One is bottom-up, the other is top-down. For bottom-up, they have a die that's already in mass production and they're trying to mount another one on top of it. So you're using the same libraries and working silicon, and you're adding another die on top of it. That's bottom-up design. But there are some customers who have a very large die, maybe 700mm², and they want to see if they can split it up and do a stacked die in the same technology node. For that, they will need to look at true 3D placement, including what goes on the top die and what goes on the bottom die, timing the paths going from top to bottom, and looking at the switching activity and where the cells are placed on each die so you don't have switching parts right on top of each other. That way you can avoid concentrating heat dissipation. For that kind of design, if your placement algorithm is looking at all the cells in the library, maybe there are cells that are characterized specifically for use in stacked designs. It's still early for that. When you're doing the building blocks themselves of those two dies, maybe you can have some kind of definition in the libraries that a block will be used in a stacked design. That way it can be modeled and characterized accordingly, and the timer will know to take the margins while doing delay calculations: 'When those cells are there, they're in a 3D stack going through a hybrid bond to the lower tier.' There is a lot we can do in defining the components of the whole 3D system, provided you know for sure this is going to be built into a 3D stack. EDA algorithms can make use of that information to define the full space, and then do full 3D optimization across dies. As you start defining which designs are going to be 3D, it will be easier to define the flow and address the challenges in the algorithms from an EDA point of view.
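A toy version of the placement concern Patwardhan raises, keeping high-switching-activity cells on two stacked dies from sitting directly above each other, might look like the sketch below. The cells, coordinates, activities, and threshold are invented purely for illustration; a real 3D placer would fold a term like this into its full cost function.

```python
# Toy sketch: penalize placing high-switching-activity cells directly above
# one another in a two-die stack. All cells, activities, and weights are
# made up for illustration only.
from dataclasses import dataclass

@dataclass
class Cell:
    name: str
    x: float          # placement coordinates in microns
    y: float
    activity: float   # normalized switching activity, 0..1

TOP = [Cell("mul0", 10, 10, 0.9), Cell("reg_file", 50, 40, 0.3)]
BOT = [Cell("mac0", 12, 11, 0.8), Cell("sram_ctl", 80, 80, 0.2)]

OVERLAP_RADIUS_UM = 5.0   # assumed distance below which cells count as stacked

def thermal_overlap_penalty(top_cells, bottom_cells) -> float:
    """Sum of activity products for vertically aligned cell pairs."""
    penalty = 0.0
    for t in top_cells:
        for b in bottom_cells:
            dist = ((t.x - b.x) ** 2 + (t.y - b.y) ** 2) ** 0.5
            if dist < OVERLAP_RADIUS_UM:
                penalty += t.activity * b.activity
    return penalty

print("Stacked-hotspot penalty:", thermal_overlap_penalty(TOP, BOT))
```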


Fig. 1: Understanding all of the possible connections and interactions, and the optimal floor plan for them, is a massive challenge. Source: Cadence

Mastroianni: When you're starting out with a new design, there are a lot of decisions to make. Do I go 2D or 3D? Do I use standardized chiplets? Do I go fan-out? But once you make those decisions, if you chose the wrong micro-architecture, you can optimize yourself to death. It's important, even with shift left, to have high-level predictive models to steer you in the right direction and figure out how you're going to build your system. Then, once you have that, you want to continue with the refinement of those models throughout the entire design process. Another consideration is that 3D today is treated predominantly as a silicon place-and-route problem. Eventually, to deal with thermal, you need to understand the airflow or whether you're going to use fluids. So you have to understand the context of where this system is going to be. That needs to be part of the overall design process. And in some cases, if you're using fan-out, part of your solution is not silicon. It's going to be integrated into an organic material. You have to be able to deal with that in terms of extraction and timing analysis. The early predictive modeling, and then continuing to refine those models throughout the entire process, will be critical for this to work.
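The early predictive modeling Mastroianni calls for can be as simple as a first-pass yield and cost comparison between a monolithic die and a two-die stack. The sketch below uses a Poisson yield model with assumed defect density, areas, and costs; it also assumes known-good-die testing before assembly and ignores edge loss, packaging, and test cost, so it is only a directional estimate.

```python
# Toy "shift-left" predictive model for a 2D-vs-3D decision: compare the cost
# per good part of a large monolithic die against two stacked smaller dies.
# Poisson yield model with assumed numbers; ignores edge loss, packaging, test.
import math

WAFER_COST = 10000.0    # assumed cost per processed 300mm wafer, USD
WAFER_AREA_CM2 = 706.0  # approximate usable area of a 300mm wafer
D0_PER_CM2 = 0.1        # assumed defect density, defects per cm^2
STACK_YIELD = 0.98      # assumed yield of bonding/assembly with known-good dies

def cost_per_good_die(area_cm2: float) -> float:
    """Cost of one yielding die of the given area."""
    dies_per_wafer = WAFER_AREA_CM2 / area_cm2       # ignores edge loss
    die_yield = math.exp(-area_cm2 * D0_PER_CM2)     # Poisson yield model
    return WAFER_COST / (dies_per_wafer * die_yield)

mono_cost = cost_per_good_die(7.0)                      # ~700mm^2 monolithic die
stack_cost = 2 * cost_per_good_die(3.6) / STACK_YIELD   # two ~360mm^2 dies + assembly

print(f"Monolithic cost per good die:   ${mono_cost:.0f}")
print(f"Stacked-die cost per good part: ${stack_cost:.0f}")
```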

SE: With multiple chips, you have less capability to handle variation, and any margin you build in to deal with variation and other effects can generate heat. What’s the solution?

Patwardhan: Some of these early test chips, and the silicon data that comes out of them, will be extremely important. Right now we are basing it on how we have done 2D chips, and just adding margin or mathematical modeling or optimizations on top of that. But as the first field test chips come out with full stacks, we need to know what kinds of bump pitches were used, and how far off the thermal model was from the actual heat dissipation. There may be extra margins that need to be defined. That is going to be really important for tuning the tools, and we will need to get that data from the foundries.
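The model-versus-silicon correlation Patwardhan describes could start as simply as comparing predicted and measured temperatures from a test chip and deriving a guard band from the worst-case error. The data points and padding in the sketch below are invented; real numbers would come from on-die sensors and foundry measurements.

```python
# Sketch: deriving an extra thermal margin from early test-chip data.
# The (predicted, measured) pairs are invented for illustration; real values
# would come from PVT sensors on a stacked test chip.

samples = [
    (78.0, 83.5),
    (85.0, 91.2),
    (92.0, 99.0),
    (70.0, 74.1),
]

errors = [measured - predicted for predicted, measured in samples]
mean_error = sum(errors) / len(errors)
worst_error = max(errors)

print(f"Mean model underestimate:   {mean_error:.1f} C")
print(f"Worst-case underestimate:   {worst_error:.1f} C")
print(f"Suggested added guard band: {worst_error + 2.0:.1f} C (worst case + 2 C pad)")
```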

Larsen: I agree with that. And with existing designs, when you build something on top of them, you have a tremendous amount of learning already. When you take all the data you know about your design and disaggregate it into multiple dies, that probably will be the way the industry gets into 3D. It will leverage all the knowledge from monolithic dies, and then when you split it up, you use that as a starting point to do the optimization and tuning to make sure the new system in a smaller footprint can still deliver the performance you need. A loop-back of the data and learning from the 2D designs is important when you disaggregate that die. Today that's done manually. There is no automation to bring all that production data back into the design process.

Kim: We work with foundries and customers on those various tools and simulations, and make sure they are up to speed so customers can gather that information, as well. On the manufacturing side for 3D-IC, there's not a functional model, so there are some challenges there. Working with the foundries and manufacturers is extremely important, starting with measured values like thermal. Those need to be measured and confirmed with the foundry, and the customers have to believe those are the correct values. Because margins are so tight, they need confidence in the process, and confidence that the measurements and equations in the solvers being used are correct.

Mastroianni: Another piece of this is diagnostics and test. What if something goes wrong? How do you debug it? When you have multiple chips stacked on top of each other, you need to make sure you have the observability and testability to go in and figure out what's happening. Is it 'this' chip or 'that' chip? And what happens when you have different margins? You need to have those test structures in place, and maybe access to PVT sensors inside each of the die so you can figure out what's going on. You may even want to include repair strategies. During the assembly process you're making lots of connections, and in some cases there is redundancy. HBM has some redundancy built in, so if things go wrong in the manufacturing process you don't have to throw out the chip. That's something else that needs to be part of the overall design process. It's important for bring-up, for manufacturing test, and also for debug, and all of that has to be considered in the overall process.
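As a sketch of the per-die observability and repair bookkeeping Mastroianni describes, the code below reads hypothetical PVT telemetry for each die in a stack and remaps failed die-to-die lanes to spares. The sensor values, limits, and repair scheme are invented for illustration; real designs would rely on standardized 3D test access (e.g., IEEE 1838) and vendor-specific repair mechanisms.

```python
# Sketch: per-die diagnostics for a two-die stack. Telemetry values, limits,
# and the lane-repair scheme are hypothetical, for illustration only.

stack_telemetry = {
    "die0": {"temp_c": 96.0, "vdd_v": 0.74, "failed_lanes": [12]},
    "die1": {"temp_c": 88.0, "vdd_v": 0.75, "failed_lanes": []},
}

TEMP_LIMIT_C = 105.0
VDD_MIN_V = 0.72
SPARE_LANES = [512, 513, 514, 515]   # assumed redundant die-to-die lanes

def diagnose(telemetry: dict) -> None:
    """Flag thermal and supply issues per die and remap failed lanes to spares."""
    spares = list(SPARE_LANES)
    for die, t in telemetry.items():
        if t["temp_c"] > TEMP_LIMIT_C:
            print(f"{die}: over-temperature ({t['temp_c']} C)")
        if t["vdd_v"] < VDD_MIN_V:
            print(f"{die}: supply droop ({t['vdd_v']} V)")
        for lane in t["failed_lanes"]:
            if spares:
                print(f"{die}: remap lane {lane} -> spare lane {spares.pop(0)}")
            else:
                print(f"{die}: lane {lane} failed and no spares remain")

diagnose(stack_telemetry)
```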

1 comment

John R. Thome says:

Regarding internal interlayer cooling in 3D-ICs: I led a large Swiss-sponsored project named CMOSAIC in which we did liquid cooling and two-phase cooling, including a simulator that handles non-uniform heat dissipation, local heat transfer, and pressure drop for individual microchannels incrementally along their length in all the layers, i.e., 3D cooling. We later showed in a follow-on project that backside two-phase cooling could do the job when the layers in the stack are thin (< 100 microns), i.e., an easier cooling solution to implement physically. Bringing the cooling microchannels close to the heat sources greatly reduced thermal gradients and allowed high local heat fluxes up to 100 W/cm². I like to think of 3D-ICs as a rack-in-a-chip solution.
