Too Hot To Handle

Managing thermal effects in today’s leading edge designs requires a thorough understanding of use case, and heat effects from the IC, package, board and system.

By Ann Steffora Mutschler
It used to be that a device could be designed to a thermal design power. The worst case power scenario would be imagined, and the device would be designed with that in mind. But those good old days are gone. Especially for consumer devices, how a device is going to behave with respect to time, or how people are going to use it, must be understood as completely as possible.

“We could design to a worst-case power and assume that that power was going to be on constantly,” noted John Wilson, applications engineering manager for the mechanical analysis division at Mentor Graphics. “Then if we had a thermal design that met those criteria we were going to be okay. But now you need to design to the specific application, a specific use case, and it’s not always going to be on. It’s going to be on and off, so the thermal designer has to be more involved early on and feeding back information while the board layout is taking place.”

This is a lot harder than it appears.

“Thermal looks simple, but in order to handle it you really have to have IC, package and board and sometimes even system because of how the heat is coming out,” said An-Yu Kuo, senior architect in charge of 3D EM and thermal products at Cadence. “With today’s ICs you have so much leakage current and with the higher temperatures, you have even higher leakage current. Therefore the design team has some sort of information about the distribution of the power consumption inside a die or multiple dies. This power consumption can be location-dependent: At this corner you have more power consumption than the other corner. It can also be temperature-dependent: At 100° you have this behavior; at 90° you have another behavior. That’s a starting point from the IC side. Now if you have this information, how do you determine what will be the eventual junction temperature, i.e., the die temperature?”

He said the package design and board design also need to be included because the heat does not stop at the package. Ambient effects such as airflow and air temperature are key pieces of information as well.

“Heat transfer is one of the easiest physics to understand,” Kuo added. “The challenge in this is how you integrate them together.”
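The feedback Kuo describes, where leakage rises with temperature and temperature rises with power, can be sketched as a fixed-point iteration. This is a minimal illustration, not any vendor's method. It assumes a single lumped junction-to-ambient thermal resistance (`theta_ja`) and a leakage model that doubles every 20°C. The function names and all of the numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def leakage_power(temp_c, p_leak_ref=0.5, t_ref_c=25.0, doubling_c=20.0):
    """Leakage power in watts.

    Assumption: leakage is p_leak_ref at t_ref_c and doubles every
    doubling_c degrees. Real silicon needs a calibrated model.
    """
    return p_leak_ref * 2.0 ** ((temp_c - t_ref_c) / doubling_c)


def junction_temperature(p_dynamic, t_ambient_c, theta_ja,
                         tol=1e-6, max_iter=200, t_runaway_c=200.0):
    """Solve T_j = T_amb + theta_ja * (P_dyn + P_leak(T_j)) by iteration.

    Returns (junction temperature in C, total power in W). Raises
    RuntimeError if the estimate passes t_runaway_c or fails to settle.
    """
    t_j = t_ambient_c
    for _ in range(max_iter):
        p_total = p_dynamic + leakage_power(t_j)
        t_next = t_ambient_c + theta_ja * p_total
        if t_next > t_runaway_c:
            raise RuntimeError("thermal runaway: leakage outpaces cooling")
        if abs(t_next - t_j) < tol:
            return t_next, p_total
        t_j = t_next
    raise RuntimeError("junction temperature did not converge")


if __name__ == "__main__":
    # Illustrative numbers: 2 W dynamic power, 25 C ambient, 10 C/W.
    t_j, p_total = junction_temperature(2.0, 25.0, 10.0)
    print(f"T_j = {t_j:.1f} C, total power = {p_total:.2f} W")
```

With these illustrative values, ignoring the temperature dependence of leakage would give 50°C (2.5 W through 10°C/W), while the iteration settles near 65°C. A poorer thermal path, such as 30°C/W, makes the same loop diverge, which is why package, board and ambient conditions have to be part of the analysis.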

Gene Matter, senior applications manager at Docea Power, observed that in many customer designs there is a spectrum of thermal constraints. “In passively-cooled designs like smart phones and tablets where there is no active cooling, there is no fan, the performance and functionality of these designs are no longer really limited by the semiconductor process technology and the features they can cram into it. But they basically become thermally limited. The performance and capabilities of the device exceed the operating range for mostly ergonomic and safety limits for skin temperature, but also the device reliability limits in terms of the maximum junction temperature.”

He noted that managing thermal effects in today’s leading edge designs comes down to having calibrated thermal sensing on the die, particularly in hot-spot regions, along with mechanisms to modulate performance as a function of temperature. Besides the calibrated thermal diodes, there must be a feedback mechanism that allows the operating point of the device, meaning its frequency and voltage, to be changed. Architecture and system design try to provide this with throttling or turbocharging mechanisms that keep the device in a stable operating range.
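A feedback mechanism of this kind can be sketched as a simple step-wise governor: step down one operating point when a sensor reads at or above a limit, and step back up only after the reading falls below the limit by a hysteresis margin. The operating points, the 85°C limit and the 5°C hysteresis below are hypothetical, and real policies are considerably more elaborate.

```python
# Hypothetical operating points as (frequency in GHz, voltage in V),
# ordered from slowest to fastest.
OPERATING_POINTS = [(0.6, 0.80), (1.0, 0.90), (1.4, 1.00), (1.8, 1.10)]


def next_level(level, temp_c, t_limit_c=85.0, hysteresis_c=5.0):
    """Return the operating-point index to use after one sensor reading."""
    if temp_c >= t_limit_c and level > 0:
        return level - 1  # throttle: too hot, drop one step
    if temp_c <= t_limit_c - hysteresis_c and level < len(OPERATING_POINTS) - 1:
        return level + 1  # headroom available, raise one step
    return level          # inside the hysteresis band, or at an end stop


def run_governor(temps_c, start_level=None):
    """Apply next_level over a sequence of readings; return chosen levels."""
    level = len(OPERATING_POINTS) - 1 if start_level is None else start_level
    history = []
    for temp_c in temps_c:
        level = next_level(level, temp_c)
        history.append(level)
    return history


if __name__ == "__main__":
    for lvl in run_governor([70, 86, 88, 83, 79, 90]):
        freq_ghz, volt = OPERATING_POINTS[lvl]
        print(f"level {lvl}: {freq_ghz} GHz at {volt} V")
```

The hysteresis band keeps the governor from oscillating between two operating points when the temperature hovers near the limit.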

“Another tricky thing about thermal is that thermal performance really has a whole different gradient, in other words, how quickly it heats up,” Matter noted. “Metal heats up really, really quickly. But air takes a little more time. So you have to factor in the delay of heating and cooling in terms of these algorithms.”
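The delay Matter mentions is often approximated with a first-order lumped model, in which a thermal resistance and a thermal capacitance give a time constant tau = R × C. The sketch below assumes that single-node model and uses made-up values. It is meant only to show how a small thermal capacitance responds quickly to a power step while a large one lags.

```python
import math


def step_response(t_s, power_w, r_th, c_th, t_ambient_c=25.0):
    """Temperature (C) at time t_s after a power step is applied.

    Assumes a single lumped node: r_th in C/W, c_th in J/C, so the
    time constant is r_th * c_th seconds.
    """
    tau = r_th * c_th
    return t_ambient_c + power_w * r_th * (1.0 - math.exp(-t_s / tau))


def time_to_reach(temp_c, power_w, r_th, c_th, t_ambient_c=25.0):
    """Seconds until the node reaches temp_c, or math.inf if it never does."""
    rise = temp_c - t_ambient_c
    final_rise = power_w * r_th
    if rise <= 0.0:
        return 0.0
    if rise >= final_rise:
        return math.inf
    return -r_th * c_th * math.log(1.0 - rise / final_rise)


if __name__ == "__main__":
    # Same resistance and power; only the thermal capacitance differs.
    fast = time_to_reach(35.0, 2.0, 10.0, 0.05)  # small thermal mass
    slow = time_to_reach(35.0, 2.0, 10.0, 5.0)   # large thermal mass
    print(f"fast node: {fast:.2f} s, slow node: {slow:.2f} s")
```

A throttling policy that ignores these time constants will either react too late to a fast node or overreact to a slow one.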

This is where thermal modeling comes in.

He explained that in the ‘old days,’ engineers would try to estimate temperature as a function of power, approximating the power by measuring the current drawn by a device, its voltage and its frequency. From those power estimates, the thermal performance could be estimated by modeling the device, the heat-transfer properties of its materials, and the surrounding ambient temperature conditions. From these computational fluid dynamics equations, some approximation of temperature could be determined.
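In its simplest lumped form, that kind of estimate reduces to a few textbook relations. The version below is a simplified stand-in for the full fluid-dynamics treatment, and the symbols (activity factor α, switched capacitance C, junction-to-ambient thermal resistance θ_JA) and the numbers in the worked line are assumptions for illustration.

```latex
\begin{align}
  P &\approx V \cdot I
    && \text{(measured supply voltage and current)} \\
  P_{\text{dyn}} &\approx \alpha\, C\, V^{2} f
    && \text{(switching power at clock frequency } f\text{)} \\
  T_{j} &\approx T_{a} + \theta_{JA}\, P
    && \text{(steady-state junction temperature)} \\
  T_{j} &\approx 25 + 10 \times (1.0 \times 2.5) = 50\ ^{\circ}\mathrm{C}
    && \text{(example: } 1.0\ \mathrm{V},\ 2.5\ \mathrm{A},\ 10\ ^{\circ}\mathrm{C/W}\text{)}
\end{align}
```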

Today, to estimate the thermal function, computational fluid dynamics tools are used as the basis to create compact thermal models, which are then used by tools from Apache Design, Cadence, Mentor Graphics and Synopsys, among others, to analyze the heat and to look at the power as a function of the workload, or to look at the thermal performance of the device, subsystem or system.

The fin is in
New devices such as finFETs and FD-SOI devices being released into manufacturing are thermal game changers because the way they are constructed affects the thermal performance, noted Ric Borges, senior product marketing manager at Synopsys.

“We are able to get a lot of interesting and useful information about how the design of the fabrication process will impact the ability of a transistor to dissipate the heat. For example, as we go from planar MOSFETs to finFETs, obviously now the fact that we have fins where the current is flowing (where the heat is generated) the heat is not going to have as easy a path to the substrate where it spreads out or where the heat sink is. It’s not going to have as easy a path as planar FETs so there are effects related to finFETs that are then different from planar. We can do a lot of modeling and a lot of it has already been done to try to understand that,” he explained.

There is also the material effect. “When we talk about having SOI devices, the silicon dioxide is not as good a thermal conductor as silicon,” Borges said, “so if you have the device structure where the heat is generated sitting on top of oxide, then that device is going to have more self-heating than an equivalent device that doesn’t have that oxide there.”
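The size of that material effect can be gauged with a one-dimensional conduction estimate, R = t / (k × A), for a slab of thickness t, thermal conductivity k and area A. The sketch below uses approximate bulk room-temperature conductivities (about 150 W/m·K for silicon and about 1.4 W/m·K for silicon dioxide). The 25 nm layer thickness and 1 µm² footprint are hypothetical, and the estimate ignores heat spreading and thin-film effects.

```python
# Approximate bulk thermal conductivities near room temperature, W/(m*K).
K_SILICON = 150.0
K_SIO2 = 1.4


def slab_resistance(thickness_m, conductivity, area_m2):
    """1D conduction resistance of a uniform slab, in K/W: t / (k * A)."""
    return thickness_m / (conductivity * area_m2)


if __name__ == "__main__":
    thickness = 25e-9  # hypothetical 25 nm layer under the device
    area = 1e-12       # hypothetical 1 square micron footprint
    r_oxide = slab_resistance(thickness, K_SIO2, area)
    r_silicon = slab_resistance(thickness, K_SILICON, area)
    print(f"oxide:   {r_oxide:.0f} K/W")
    print(f"silicon: {r_silicon:.0f} K/W")
    print(f"ratio:   {r_oxide / r_silicon:.0f}x")
```

Under these assumptions the oxide layer presents roughly 100 times the thermal resistance of the same thickness of silicon, which is consistent with the extra self-heating Borges describes.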

A limiting factor
Managing thermal effects in leading edge designs is a very big deal given that thermals are a limiting factor in performance and functionality.

“The trick in all this modern design for thermal is to deliver as much performance and functionality as you can while still satisfying the quality of service,” Matter concluded. “You need to model the complex system behavior in relatively real time and you need to be able to do the architectural exploration of your thermal management and power management policies up front. It’s no longer good enough just to implement very, very good power management. It’s no longer sufficient just to do a really good job to optimize battery life or reduce energy. You also want to be able to ride right up against that thermal limit or near that thermal limit without any noticeable degradation of performance.”