Power Becomes Bigger Issue In Stacked Die

Power models are not fixed, and while power is related to heat, the two need to be considered both separately and together.


By Ed Sperling
Concern over getting the heat out of stacked die is well defined, even if the current raft of existing and proposed solutions ranges from ineffective to exotic and expensive. What is less well understood is how to plan for and manage power inside of stacked die.

While power and heat frequently go hand in hand (where there is heat there is almost always power dissipation), they can be very different from a design standpoint. Each can be affected by the other, and each needs to be modeled as part of a holistic design, but a system's power consumption may be too high for its budget and still low enough not to cause thermal issues. Nevertheless, the number of power issues that can result from stacking die can be far greater simply because there are more possible permutations, and so far there is little information about how to solve them.

The reasons for the dearth of knowledge in this area stem partly from the fact that some of these devices are just now being built—the best knowledge about design always comes from experience and history—and partly because power can vary greatly from one system to the next, from one user to another, and sometimes from one chip to another even within the same design. In a stacked die, all of these come into play in the same package, often with unexpected results. That makes it difficult to model power accurately enough up front, and equally difficult to deal with as the design progresses.

“We’re seeing some complex power management schemes emerging,” said Mike Gianfagna, vice president of marketing at Atrenta. “The problem is that if you have an error, you automatically generate incorrect power management circuitry. The opportunity is to enhance much more complex verification schemes to deal with this.”

He noted that many large chipmakers have their own homegrown version of power modeling, but it will take time—and standards—before there is a systematic way of dealing with it.

What needs to be addressed where
There are several distinct points where power needs to be addressed in a design. The first is at the architectural level, where modeling will be inaccurate. But it can be accurate enough to guide the choice of IP (including processor cores), memory, interconnect schemes, and I/O. Each of those has a different effect on power, and together they have a cumulative effect.

“As you go down in the design flow you refine the power models and the software,” said Ghislain Kaiser, CEO of Docea Power. “But your accuracy depends heavily on the IP. For some IP, if you have an error of more than 20%, it will impact decisions later on. For IP that is small and not power-hungry, an error of that size may not cause any problems. But you do have to think about the global impact of the power, especially in a stacked die.”
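Kaiser's point about model error can be illustrated with a rough roll-up. The sketch below is not any vendor's method. The block names, milliwatt figures, budget, and the uniform 20% error are all made-up assumptions, and the worst case simply assumes every block overshoots at once.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One IP block in an early, architecture-level power estimate (hypothetical)."""
    name: str
    est_mw: float     # estimated average power in milliwatts (made-up value)
    rel_error: float  # assumed relative model error, e.g. 0.20 for +/-20%


def budget_impact(blocks, budget_mw):
    """Sum nominal power, add each block's possible overshoot, compare to the budget."""
    nominal = sum(b.est_mw for b in blocks)
    # Absolute uncertainty per block: the same 20% means very different milliwatts.
    uncertainty = {b.name: b.est_mw * b.rel_error for b in blocks}
    # Pessimistic assumption: every block overshoots by its full error together.
    worst = nominal + sum(uncertainty.values())
    return {
        "nominal_mw": nominal,
        "worst_mw": worst,
        "fits_nominal": nominal <= budget_mw,
        "fits_worst": worst <= budget_mw,
        "uncertainty_mw": uncertainty,
    }


# Illustrative numbers only; none of these come from a real design.
blocks = [
    Block("cpu_cluster", 800.0, 0.20),
    Block("dram_stack", 400.0, 0.20),
    Block("uart", 5.0, 0.20),
]
report = budget_impact(blocks, budget_mw=1300.0)

for name, mw in report["uncertainty_mw"].items():
    print(f"{name}: +/-{mw:.1f} mW")
print(f"nominal {report['nominal_mw']:.0f} mW, worst case {report['worst_mw']:.0f} mW")
```

With these numbers the design fits the budget on nominal estimates but not in the worst case, and almost all of the uncertainty comes from the two large blocks. The same 20% error on the small block barely registers, which is why accuracy matters most for the power-hungry IP.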

While this is complicated enough in a planar SoC, it becomes even more complex in a stacked die because not all of the pieces are necessarily built at the same time. In addition, IP blocks and even entire subsystems can interact in unforeseen ways, sometimes decreasing power consumption, as with Wide I/O, and at other times consuming more power than anticipated because of unexpected proximity and other physical effects such as increased temperature.

“There are lots of things going on,” said Andrew Yang, president of Apache Design. “The voltage is fluctuating, so you’ve got on-chip voltage regulators to stabilize the power supply and back biasing to further reduce leakage power. At 20nm, reliability is becoming a key driver. Electromigration and electrostatic discharge are now mandatory for robust volume manufacturing. And we’re not just dealing with IR drop. In a platform solution, IR drop is one small item. You have to consider a full-chip power model.”

No simple answers
In addition to understanding power throughout the flow, Apache has been a strong advocate of understanding power over time, a necessary perspective that further complicates the design process with a fourth dimension. Power can be affected by a number of factors over time—even small increments of time from one die to the next.

“Die-to-die interactions are a form of variability,” said Riko Radojcic, director of design for silicon initiatives at Qualcomm. “You need a timer that understands thermal gradients and the impact of thermal gradients on time. There is a gap there right now.”

How to solve this problem is a big unknown, particularly when it comes to power. Power models and power numbers are dynamic rather than fixed, needing adjustments and tweaking throughout the life of the design and even beyond.

The general consensus is that none of this will ever be automated beyond a certain point, and no single tool will handle all of the power issues—even in 2D designs. In 2.5D and 3D, the number of options and possible interactions increases non-linearly. As the industry progresses into the next dimension, one of the biggest challenges will just be grasping all of the possibilities—and all of the subsequent effects that go along with those options.