Thermal Modeling Held Back By Outdated Standards

3D ICs will require new tools, new methods and a different way of thinking about physical effects.

By Ann Steffora Mutschler
As the reality of true 3D IC design nears, engineering teams are keen to manage the heat between the stacked die in order to avoid catastrophic failures. Thermal modeling tops the roster of techniques to leverage in this area.

Herve Jaouen, director of modeling and simulation in STMicroelectronics’ technology R&D organization, explained that in 3D designs thermal modeling is very important for reliability. “In case you have thermal expansion it can generate defectivity. On the other part, on the design itself, you must take into account the gradient temperature in the stack.”

ST’s Jaouen: Thermal expansion can cause defects.

At the same time, Ji Zheng, director of chip-package-system at Apache, observed that 3D design is something that scares people into thinking about the thermal issues, especially when one chip is going to be another chip’s heat dissipation channel. “These ideas are generally confirmed by the fact that you’re going to generate a lot of heat and your heat dissipation channel is tight so what is the impact on reliability and transistors when they’re using passive interposer or in the future active interposer? It means that the heat dissipation channel also will have to act as a device.”

These things definitely require more insight into the temperature distribution–how much variation there will be and how much heat can be sustained without affecting reliability.

“We need to do thermal modeling,” said Zheng. “The reason is that if you just do a package-level analysis, you won’t be able to get into the details of the temperature surrounding, for example, the TSVs and a particular layer inside the silicon interposer. A comprehensive thermal modeling that considers the spatial distribution of the die and the consumption in combination with the thermal environment provided by the package and board is important. But that is not a traditional thermal solution space. It is a new solution space.”

Further, it is important to distinguish between the two ways in which temperature can affect product design from an EDA perspective.

“Classically, and by far the most common adoption of thermal simulation capabilities, is for predicting operating component temperature as an indication of whether there will be any reliability concerns with the product,” said Robin Bornoff, product marketing manager in Mentor Graphics’ Mechanical Analysis Division. “The driver for considering thermal down at the silicon level is the relationship between the power dissipation and various functional blocks on the silicon and their operating temperature. This coupled relationship hasn’t really been a dominant factor until the length scales have gotten down to the levels they are now. The contribution of leakage power to the overall power of the package becomes a dominant contributor, and it’s the leakage power that is temperature dependent in an exponential way.”

The need to predict the two-way coupling between temperature and power dissipation, along with the effect that coupling has on timing, while still achieving the required timing, is the driver for taking thermal simulation, as well as coupled thermal-power or electro-thermal simulation, more seriously, Bornoff believes.

“From a thermal perspective, an indicator of subsequent thermal problems and a leading indicator of how hot your electronics are going to get is a measure of the power density—the watts per square inch that you have in your package. When you stack silicon on top of each other the square inches don’t change, but you suddenly double or triple your power dissipation. That will cause operating temperatures to go right up. And because of this leakage effect, power consumption goes up, as well, and there is an increased risk of reliability concerns. You’ve suddenly drawn a hell of a lot more power than you designed in the first place so it really is a double whammy,” he added.
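
Bornoff's arithmetic can be written out in a few lines of Python. This is only an illustrative sketch: the function names, the 10 W per die and the 0.5 square inch footprint are made-up values chosen for the example, not figures from Mentor Graphics.

```python
def power_density(total_power_w, footprint_in2):
    """Power density in watts per square inch of package footprint."""
    return total_power_w / footprint_in2


def stacked_power_density(die_powers_w, footprint_in2):
    """Stacked die share one footprint, so their powers add over the same area."""
    return power_density(sum(die_powers_w), footprint_in2)


# Hypothetical numbers: each die dissipates 10 W on a 0.5 in^2 footprint.
single_die = stacked_power_density([10.0], 0.5)
three_high = stacked_power_density([10.0, 10.0, 10.0], 0.5)

print(single_die)  # 20.0 W/in^2
print(three_high)  # 60.0 W/in^2
```

The footprint stays the same while the power density triples, and that is before the temperature-dependent leakage Bornoff describes is added on top.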

Thermal vs. electrical modeling
From a technical perspective, thermal modeling depends on the dimensions of the system being modeled. Electrical modeling for the device itself depends on that for simulation. At the design level, the entire environment must be taken into account, ST’s Jaouen explained.

The other element that thermal modeling takes into account is physical changes in the dimensions of whatever you are modeling, whereas with electrical modeling you don’t need to do that. But today there is no direct coupling between thermal and electrical performance. There needs to be a compact model simulation in SPICE. There also need to be two networks, one for thermal and the other electrical, with coupling between those networks, he said.

Apache’s Zheng also pointed out that thermal modeling is related to electrical modeling, but has its own unique characteristics. “Thermal modeling has to be related to temperature, and of course our other models are related to temperature but not a strong dependency. To model the power we have to characterize the chip function names as an explicit function of temperature. The temperature distribution is not only related to the fact that how the chip goes to the external world, it is also related to the fact that how the internal power consumption you actually will actually have a self heating effect–you are conducting current and you can generate the power through those currents. These are different from the electrical model.”

Another difference is that when you incorporate the thermal model you probably cannot do a so-called one-shot solution because of its dependency on temperature, he said. “You have to make iterations to make sure that you can actually calculate the converged power and the temperature because they go hand-in-hand. You cannot say I know the power without knowing the temperature or vice versa–we have to know them simultaneously.”
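
The iteration Zheng describes can be sketched as a simple fixed-point loop. Everything in the example below is an assumption made for illustration: a single lumped thermal resistance from junction to ambient, an exponential leakage model with made-up coefficients, and hypothetical function names. It is not how Apache's tools work.

```python
import math


def leakage_power(temp_c, p_leak_ref_w=1.0, t_ref_c=25.0, k_per_c=0.02):
    """Assumed leakage model: grows exponentially with temperature."""
    return p_leak_ref_w * math.exp(k_per_c * (temp_c - t_ref_c))


def solve_operating_point(p_dynamic_w, r_th_c_per_w, t_ambient_c=25.0,
                          tol_c=1e-6, max_iter=200, t_limit_c=500.0):
    """Iterate power -> temperature -> power until the temperature stops moving."""
    temp_c = t_ambient_c  # first guess: the die sits at ambient
    for iteration in range(1, max_iter + 1):
        power_w = p_dynamic_w + leakage_power(temp_c)
        new_temp_c = t_ambient_c + r_th_c_per_w * power_w
        if new_temp_c > t_limit_c:
            # Leakage and temperature keep feeding each other.
            raise RuntimeError("no stable operating point (thermal runaway)")
        if abs(new_temp_c - temp_c) < tol_c:
            return new_temp_c, power_w, iteration
        temp_c = new_temp_c
    raise RuntimeError("did not converge within max_iter iterations")


# Hypothetical die: 5 W dynamic power, 2 degC/W to ambient.
temp_c, power_w, iterations = solve_operating_point(5.0, 2.0)
print(f"{temp_c:.2f} degC, {power_w:.3f} W after {iterations} iterations")
```

A one-shot calculation that evaluated leakage at ambient would report 6 W for this hypothetical die. The converged answer is somewhat higher because the leakage is evaluated at the temperature it helps create. With a poor enough heat path the loop never settles, which the sketch reports as thermal runaway.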

As great as it all sounds, Zheng pointed out that we do not do explicit thermal modeling for most designs today. “We rely on a package-level thermal simulation or module-level thermal simulation to come up with ballpark temperature distribution at the module or system level. We do not look to, generally speaking, the thermal temperature distribution impact on electromigration reliability of the die. I don’t think it is a common practice because of the lack of such solutions.”

Updated standards needed
To enable thermal modeling to become part of the mainstream 3D IC design flow, standards need to be updated, Mentor’s Bornoff noted.

While JEDEC has put a lot of effort into creating standards and methodologies by which metrics and models can be created—thermal representations of packages—a lot of these standards are still rooted in the assumption that there is a single heat source within the package itself, he said. “That was fine for the majority of packages 10 or 15 years ago–that was absolutely acceptable. But as the industry is powering ahead there is a proliferation of various packaging styles, and almost all of the cutting-edge ones involve the fact that there are numerous heat sources within the package. That can include numerous individual die, side-by-side or stacked, or a system-on-chip where it is one piece of silicon but there are distinct areas of that silicon that are being used for very, very different purposes.”
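
To see why a single-heat-source metric falls short, consider one common way of describing a package with several heat sources: a thermal resistance matrix combined by linear superposition. The sketch below assumes a linear, steady-state package; the matrix values and the function name are hypothetical and are not taken from any JEDEC document.

```python
def junction_temperatures(r_matrix_c_per_w, powers_w, t_ambient_c):
    """Superposition: T_i = T_ambient + sum_j R[i][j] * P[j].

    R[i][i] is the self-heating resistance of source i and R[i][j] is the
    temperature rise at source i per watt dissipated in source j.
    """
    return [
        t_ambient_c + sum(r * p for r, p in zip(row, powers_w))
        for row in r_matrix_c_per_w
    ]


# Hypothetical two-die package, values in degC/W.
r_matrix = [
    [10.0, 4.0],
    [4.0, 12.0],
]

die0_only = junction_temperatures(r_matrix, [3.0, 0.0], 25.0)
both_dies = junction_temperatures(r_matrix, [3.0, 2.0], 25.0)

print(die0_only)  # [55.0, 37.0]
print(both_dies)  # [63.0, 61.0]
```

A single resistance figure for die 0 would predict 55 degrees C whether or not its neighbor is active. With the neighbor dissipating 2 W the same die sits at 63 degrees C, and the off-diagonal terms that capture this are exactly the data a single-source characterization leaves out.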

Bornoff said standardization has to catch up to provide a method by which suppliers can supply thermal data not only for package selection but also for thermal simulations that end users can actually run. Until standardization gains momentum, end users simply have to rely on NDAs to get the data from their suppliers, or accept that they have relatively little data and that the simulations they perform will be indicative at best. Such simulations certainly cannot be quantified down to the level of predicting temperature variations on the silicon that can give input to an electro-thermal simulation for timing considerations.

“There are already basic standards that talk about ways in which you can report the thermal performance of multi-heat source packages. There are other standards going through at the moment that should facilitate the communication of data between the supplier and end-user. The future has got to be the creation of standards to support multi-heat source packages and to break this single heat source paradigm that the thermal community has been bound by the last 20 years or so,” he said.

ST’s Jaouen agreed: “You need more. You need to compute the self-heating of each device and the relationship with the neighborhood devices through a thermal network. Today it is not currently in the CAD flow. You can compute the temperature map on your circuit and after that, introduce a few of the correct models that take into account the device temperature introduced in your netlist. But it is not a flow that is usually used in CAD flows today. But it is something that is mandatory and that is asked for by designers.”
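
A thermal network of the kind Jaouen describes can be sketched as a set of nodes, one per device, each tied to ambient and to its neighbors through thermal conductances. The example below is a minimal steady-state illustration solved by Gauss-Seidel relaxation; the three-device layout, the conductance values and the function name are assumptions for the sake of the example, not part of any ST or CAD-vendor flow.

```python
def solve_thermal_network(powers_w, g_ambient, g_coupling, t_ambient_c,
                          tol_c=1e-9, max_sweeps=10000):
    """Steady-state device temperatures from a conductance network.

    Heat balance at device i:
        P_i = g_ambient[i] * (T_i - T_amb) + sum_j g_coupling[i][j] * (T_i - T_j)
    Conductances are in W/degC; g_coupling must be symmetric.
    """
    n = len(powers_w)
    temps = [t_ambient_c] * n
    for _ in range(max_sweeps):
        max_change = 0.0
        for i in range(n):
            heat_in = powers_w[i] + g_ambient[i] * t_ambient_c
            g_total = g_ambient[i]
            for j in range(n):
                if j != i:
                    heat_in += g_coupling[i][j] * temps[j]
                    g_total += g_coupling[i][j]
            new_temp = heat_in / g_total
            max_change = max(max_change, abs(new_temp - temps[i]))
            temps[i] = new_temp
        if max_change < tol_c:
            return temps
    raise RuntimeError("thermal network did not converge")


# Hypothetical row of three devices; only the middle one dissipates power.
powers = [0.0, 1.0, 0.0]
g_amb = [0.1, 0.1, 0.1]
g_cpl = [
    [0.0, 0.2, 0.0],
    [0.2, 0.0, 0.2],
    [0.0, 0.2, 0.0],
]
temps = solve_thermal_network(powers, g_amb, g_cpl, 25.0)
print([round(t, 2) for t in temps])
```

The two idle neighbors end up several degrees above ambient purely through coupling to the active device. In a full flow, those per-device temperatures would then be fed back into the electrical models in the netlist, which is the step Jaouen says is missing from mainstream CAD flows.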