Hot Stuff

Thermal modeling takes center stage as designs move to advanced process geometries; roadblocks exist, but so do possible solutions.

By Ann Steffora Mutschler
When it comes to thermal modeling, which has been practiced for many years, the challenges are daunting. The good news is that new approaches are emerging as the challenges grow with smaller process nodes and greater design complexity.

Thermal behavior can be examined at several levels: transistor, chip, package, board and system. Traditionally, thermal models have been created from more of a system-level perspective, to look at airflow through the chassis of a computer, for instance, or airflow and cooling of a rackmounted blade server. The whole goal was to keep the junction temperature at or below a certain level so the chips didn’t overheat to the point of failure.

From that system-level perspective, there is a series of single-value metrics, which were really the industry’s first stab at characterizing packages for thermal simulation. “Those are numbers like theta-JA, theta-JMA, theta-JC, and theta-JB,” said Byron Blackmore, product manager at Mentor Graphics. “These single-value metrics are useful in terms of providing a means to compare package A to package B, but in general they are not usable metrics from a thermal simulation perspective. They are extremely dependent on the environment in which the measurement was taken.”
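As a rough illustration of how a single-value metric such as theta-JA (junction-to-ambient thermal resistance) is commonly applied, the sketch below uses the usual first-order relation T_J = T_A + P × theta-JA. The function name and all numbers are hypothetical, chosen only to show how the same package yields different answers when the measurement environment changes.

```python
def junction_temp_from_theta_ja(ambient_c, power_w, theta_ja_c_per_w):
    """First-order estimate: T_J = T_A + P * theta_JA (all values in C and W)."""
    return ambient_c + power_w * theta_ja_c_per_w


# Hypothetical package dissipating 2 W in a 25 C ambient.
# Assumed theta-JA values for two test environments (not from any datasheet).
tj_still_air = junction_temp_from_theta_ja(25.0, 2.0, 40.0)   # still air
tj_forced_air = junction_temp_from_theta_ja(25.0, 2.0, 30.0)  # moving air

print(f"still air:  {tj_still_air:.1f} C")
print(f"forced air: {tj_forced_air:.1f} C")
```

The 20-degree spread between the two results comes entirely from the assumed environment, which is the dependence Blackmore describes.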

One step up from the single-value metric is the two-resistor compact model, which is meant to stay simple while capturing more of the heat flow. The two-resistor model takes two of those single-value metrics, theta-JC (junction-to-case) and theta-JB (junction-to-board), and combines them into a network of thermal resistances. The thermal engineer specifies the power at the junction node, and the resistances to the case and to the board are included in the thermal analysis. That’s a big improvement over using one value, but it still has some shortcomings.

If two thermal resistances are being used, the model constrains the heat from the junction to move in one of two directions. In a real 3D package, the heat certainly can move laterally within the package and ultimately into the environment, so the model misses some of the physics. Still, a well-formed two-resistor compact model will be able to predict the junction temperature to within 20% of what you would get for a detailed representation of that package, said Blackmore. This approach was formalized in JEDEC standard JESD15-3.
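The two-resistor network described above can be sketched as a single energy balance at the junction node. This is a minimal illustration, not the JEDEC procedure itself: it assumes the case and board temperatures are known boundary values (in a real simulation the solver computes them), and the function name and numbers are hypothetical.

```python
def two_resistor_junction_temp(power_w, theta_jc, theta_jb, case_c, board_c):
    """Solve P = (Tj - Tc)/theta_JC + (Tj - Tb)/theta_JB for Tj.

    Returns (Tj, heat flowing to the case, heat flowing to the board).
    """
    g_jc = 1.0 / theta_jc  # thermal conductance junction -> case, W/C
    g_jb = 1.0 / theta_jb  # thermal conductance junction -> board, W/C
    tj = (power_w + g_jc * case_c + g_jb * board_c) / (g_jc + g_jb)
    q_case = (tj - case_c) * g_jc
    q_board = (tj - board_c) * g_jb
    return tj, q_case, q_board


# Hypothetical values: 3 W at the junction, theta-JC = 5 C/W, theta-JB = 20 C/W,
# case held at 50 C and board at 60 C.
tj, q_case, q_board = two_resistor_junction_temp(3.0, 5.0, 20.0, 50.0, 60.0)
print(f"Tj = {tj:.1f} C, to case = {q_case:.2f} W, to board = {q_board:.2f} W")
```

The heat can only leave through those two branches, which is the constraint the paragraph above points out: lateral spreading inside the package has no path in this network.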

As with many types of simulation, giving the tool too much data to crunch prolongs the run or, in some cases, prevents the tool from completing its task. Determining how many objects to give the tool is key.

“It really comes down to the experience and judgment of the engineer who is running the simulation. There’s a very strong correlation between how many objects and how many grid cells you have in a simulation and how long you need to wait before you can inspect the results. In general, the fewer objects you have, the faster the simulation turnaround time will be. At the conceptual design stage where you need to evaluate tens or maybe hundreds of different design alternatives, the turnaround time is paramount. It needs to be fast to make a decision on each one of these potential designs. As it evolves, you can afford to spend more time on your simulation because you have fewer of them to run,” he said.

To deal with this, Mentor has over the years developed ways to appropriately simplify common pieces of geometry, such as enclosures, venting and PCBs, and has established accessible ways to model them within its simulation tools, even to the point of automating that for the user.

Zeroing in
At the transistor level, compact thermal models have been around for some time, just as at the system level. “They were not used for MOSFETs,” said Hany Elhak, senior product marketing manager at Synopsys. “They were used for some applications where thermal effects are important, such as high-power transistors and also in SOI (silicon on insulator). Because of the insulator in this process, thermal conductivity is bad so the transistor cannot really get rid of the heat, and that had to be taken into account. The standard compact transistor models like BSIM4, which is used to model MOSFETs, did not include thermal effects. Today, of course, it’s a different story because we have 16nm coming with the three-dimensional transistors.”

He pointed out that the finFET structure doesn’t allow heat to dissipate easily, so starting from 16nm thermal effects are becoming very important. A new compact model called BSIM-CMG (common multi-gate) has been created for finFETs. It takes into account thermal effects and self-heating, which is how the power dissipated in the transistor affects its own temperature.

New headaches
However, the problems are increasing. “What used to be a problem for the processor alone is now becoming a problem for mobile devices,” explained Aveek Sarkar, vice president of product engineering and support at Apache Design. “The problem is that mobile devices are becoming our computers. They are running at 3GHz and they have multiple cores in them. There are quad-core chips for smart phones or tablets. There are higher performance devices that are running at a higher speed and they have multiple cores. On top of that they have graphics, which is typically more compute-intensive. Then these are getting fabricated in 28nm and 20nm, which are much more thermally sensitive process technology nodes. Temperature affects the resistance of the wire and, more importantly, it ends up affecting the electromigration of the wire, as well.”
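To give a feel for the two temperature dependencies Sarkar mentions, the sketch below uses first-order textbook models: a linear temperature coefficient for wire resistance, and the Arrhenius term of Black's equation for electromigration lifetime. These are illustrative assumptions, not the models any particular sign-off tool uses. The coefficient (bulk copper, about 0.0039 per degree C) and the 0.9 eV activation energy are assumed values, and the function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def wire_resistance(r_ref_ohm, temp_c, ref_temp_c=20.0, tcr_per_c=0.0039):
    """Linear model: R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref_ohm * (1.0 + tcr_per_c * (temp_c - ref_temp_c))


def em_lifetime_ratio(temp_c, ref_temp_c, activation_energy_ev=0.9):
    """MTTF(T) / MTTF(T_ref) at equal current density, from the
    exp(Ea / (k * T)) term of Black's equation."""
    t_k = temp_c + 273.15
    t_ref_k = ref_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_k - 1.0 / t_ref_k)
    return math.exp(exponent)


# A hypothetical 100-ohm route (measured at 20 C) running at 105 C.
r_hot = wire_resistance(100.0, 105.0)
# Relative electromigration lifetime at 105 C versus 85 C.
ratio = em_lifetime_ratio(105.0, 85.0)

print(f"resistance at 105 C: {r_hot:.2f} ohm")
print(f"EM lifetime at 105 C relative to 85 C: {ratio:.2f}x")
```

Under these assumptions the resistance grows linearly with temperature while the lifetime falls exponentially, which is consistent with the remark that the electromigration effect matters more.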

The challenge is how to manage some of these different effects.

“Once we start to look at these effects, then the thermal modeling becomes a little different. Do we look at the system level or do we look at the chip level? When we talk about the system level then obviously we talk about some of these models that let you take certain compact models of the system and plug those into the next higher level. But these don’t help you comprehend some of the challenges that you have with the IC, the thermal analysis or the impact on the IC, because they are focusing on the chassis or the rackmounted server and they are not really focusing on the chip itself,” he said.

There are relatively few players in this niche market, and they serve a very small, highly expert set of people. Those customers use expert-oriented tools but want the ability to co-simulate with functional verification, power verification and power modeling. Either the model size and complexity prohibit them from doing this, or the license cost for a seat is too high, observed Gene Matter, senior applications manager at Docea Power.

“For most design houses, they can’t afford the seat cost,” Matter said. “Not only are the seats expensive but the token consumption and token lockout of using that tool is really prohibitive. Compact thermal models for us basically maintain the fidelity and accuracy of the design, but produce the ability to solve the thermal interaction or thermal behavior as a function of power in a shorter duration of time. The compact thermal model is also a model that can be derived very quickly from an existing source. It can import data from existing, higher-fidelity, more complex models, and it has an export function so it can be exported to other functional and power simulators such that that interaction, that feedback can occur,” he added.
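The idea of solving thermal behavior as a function of power over time can be illustrated with the simplest possible lumped model: one thermal resistance and one thermal capacitance driven by a power trace. This is only a sketch under stated assumptions and is not Docea's model or file format. The function name, the forward-Euler integration and all parameter values are hypothetical.

```python
def simulate_rc_thermal(power_trace_w, dt_s, r_th_c_per_w, c_th_j_per_c, ambient_c):
    """Integrate C * dT/dt = P(t) - (T - T_amb) / R with forward Euler.

    power_trace_w holds one power sample per time step; returns the
    temperature after each step.
    """
    temps = []
    temp = ambient_c
    for power in power_trace_w:
        d_temp_dt = (power - (temp - ambient_c) / r_th_c_per_w) / c_th_j_per_c
        temp += d_temp_dt * dt_s
        temps.append(temp)
    return temps


# Hypothetical values: R = 10 C/W, C = 2 J/C (time constant 20 s),
# a constant 1.5 W load for 200 s, sampled every 10 ms, 25 C ambient.
dt = 0.01
trace = [1.5] * 20000
temps = simulate_rc_thermal(trace, dt, 10.0, 2.0, 25.0)
final_temp = temps[-1]

print(f"temperature after 200 s: {final_temp:.2f} C")
```

After ten time constants the result settles close to the steady-state value T_amb + P × R. Because the power trace is just a list of samples, it could in principle come from a power simulator, which is the kind of feedback loop Matter describes.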

How the market develops will be interesting to watch, as 3D-IC by itself has generated a lot of interest in thermal analysis, noted Mentor Graphics’ Blackmore. “While simulation tools have the capability to do thermal analysis, where I see the largest barrier at this point is simply how to build these models in the simulation tools … because an explicit representation is just not going to be practical. There is a lot of interest in the industry now about how we can effectively reduce the amount of geometry that we include in the analysis, and there’s a lot of interest about how to link to EDA to automate and facilitate that model development process.”

He said it is unlikely that a standard will evolve to handle thermal modeling in a compact way at this level of analysis.

Along those lines, Apache’s Sarkar noted that system-level thermal analysis tools treat the whole chip as being at a single temperature. That is where he believes evolution will happen as engineering teams start looking at things in more detail.

“As the industry moves more towards 20nm and the process drives more thermal sensitivity in designs, as the chip sizes, even for mobile processors or mobile tablets, become bigger and bigger, the temperature gradient from one part of the chip to another part of the chip will be more important to model. These are some of the things that will drive standards,” he concluded.