Where Does It Hurt?

New pain points highlight issues in advanced design and preliminary approaches to stacked die.


By Ed Sperling
The IC design industry is feeling a new kind of pain—this one driven by uncertainty over architectural shifts, new ecosystem interactions and new ways to account for costs.

As mainstream ICs move from 50/45/40nm to around 32/28/22nm, there are only two choices for design teams: continue shrinking features or stack dies. In many cases, the ultimate solution may be a combination of both, but the lack of experience in stacking die and the growing body of physical effects from shrinking features, not to mention the skyrocketing cost of verifying a much more complex chip, are making all of these choices difficult.

For stacked die, the big challenge is dealing with a blank slate. Best practices don’t exist yet, which means the opportunities for experimentation, and for getting something wrong, are enormous.

“There is a change in dynamics and business models,” said Hans Bouwmeester, senior director of IP at Open-Silicon. “Whenever you have new technology there are no standards, so the world we’re working with is still very, very big. What we’re doing right now is focusing on the die-to-die interface and how to communicate with different dies using interposers and TSVs.”

Bouwmeester said that with chip-to-chip communication, the focus was on big I/Os and serialization using SerDes. With die-to-die communication using interposers and TSVs, suddenly it’s possible to use thousands of very small traces between die and it may not even be necessary to use serial interfaces.
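The tradeoff Bouwmeester describes can be sketched with simple arithmetic: aggregate bandwidth is the number of connections multiplied by the rate of each one. The snippet below is a minimal illustration only. The function name, lane counts and per-lane rates are hypothetical and do not come from Open-Silicon or any product.

```python
def aggregate_bandwidth_mbps(lanes: int, mbps_per_lane: int) -> int:
    """Total bandwidth of an interface: connections times per-connection rate."""
    return lanes * mbps_per_lane


# Hypothetical chip-to-chip link: a few fast serialized lanes (SerDes).
serdes_link = aggregate_bandwidth_mbps(lanes=8, mbps_per_lane=10_000)

# Hypothetical die-to-die link: thousands of slow parallel traces
# routed through an interposer or TSVs, with no serialization.
wide_link = aggregate_bandwidth_mbps(lanes=2_000, mbps_per_lane=200)

print(f"SerDes link:        {serdes_link} Mbps")
print(f"Wide parallel link: {wide_link} Mbps")
```

With enough traces, each one can run slowly and still match or exceed a serialized link, which is why serial interfaces may become optional between stacked die.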

A consistent theme in IC design these days, even with planar designs, is that there are too many choices. There also frequently is too little information about each IP block—which essentially is a black box—to deal with all the possible interactions. That gets worse in stacked configurations. What happens, for example, when a logic block created at 28nm is put next to a memory with a logic layer at 20nm in a stacked die or 2.5D package? At this point, no one knows.

“You can’t deal with all the choices anymore,” said Mike Gianfagna, vice president of marketing at Atrenta. “In a planar SoC, early planning and careful examination of the architecture is a ‘nice to have.’ When you get to 3D, where you have multiple technology and interconnect choices, the number of choices explodes. At that point, early planning becomes a ‘need to have.’ The good news is that this brings up a new opportunity for renewed growth in EDA. But it’s certainly going to make things more difficult.”

Design chain issues
All of this reverberates down into the design chain, which is why there is a flurry of activity these days in so many areas. In addition to just IP, sales of verification IP (VIP) are picking up significantly. Mentor Graphics, Synopsys, and Cadence, as well as processor developer ARM, all offer VIP with their IP.

But beyond that, there also needs to be much more understanding about what works where and why—and what the potential implications could be if something doesn’t work right.

“We’re going through the process now of identifying pain points,” said Jack Browne, senior vice president of sales and marketing at Sonics. “The top ones are IP integration, high frequency (faster performance), memory throughput, physical design and power management. It’s all more complicated. As we add more channels and more bandwidth, we’re also adding channels to the memory system. So the software guy now has to decide which part of the code goes on channel one and which parts go on channel two—and when.”

He said the decisions increase significantly with multichannel configurations and TSVs. That means more complicated simulations. But at the same time chipmakers also are looking for more granularity to be able to change something in a design if necessary.

“The reality is that people are trying to understand the corner cases better,” he said. “But now you have up to 10 or 12 cores for CPUs and graphics, you may have up to 32 memory threads, and you’ve got hundreds of thousands of instructions to turn things on and off—and throughout this everyone is trying to figure out how to get rid of leakage.”

Ecosystem issues
At least part of what’s causing this overload is a lack of experience in working with stacked die. Just having high-volume chips that work in stacked configurations would go a long way toward resolving many of these issues. So would understanding who’s doing what.

“The order is changing in the supply chain,” said Tom Quan, director at TSMC. “It used to be where you had chips on a wafer and you could tell what are the known good die and keep the whole wafer. Now it’s all heterogeneous. You may dice them and put them in a package from someone else, or you may put the package on the substrate.”

Quan noted that if there are known good die, yield will be sufficient for those die. But no one knows yet what the yield of the stack will be—or the cost. “The early adopters will not be focusing on low cost. It’s a combination of high performance and low power that will drive this technology. You also can take out all of the I/O paths to memory and get rid of that circuitry. You decrease the congestion, the RC, and the timing is faster.”
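Why stack yield is harder to predict than die yield can be shown with a simple compounding model. The sketch below assumes a stack is good only if every die in it is good and every bonding step succeeds, and that failures are independent. The function name and all yield figures are hypothetical illustrations, not data from TSMC or any other company.

```python
from math import prod


def stack_yield(die_yields, bond_yield_per_step):
    """Hypothetical compounding model for the yield of a die stack.

    Assumes independent failures: the stack works only if every die
    works and each of the (n - 1) bonding steps succeeds.
    """
    bond_steps = max(len(die_yields) - 1, 0)
    return prod(die_yields) * bond_yield_per_step ** bond_steps


# Illustrative numbers only: three pre-tested (known good) die
# versus three untested die, with the same assumed bonding yield.
known_good = stack_yield([0.99, 0.99, 0.99], bond_yield_per_step=0.98)
untested = stack_yield([0.90, 0.90, 0.90], bond_yield_per_step=0.98)

print(f"Stack of known good die: {known_good:.3f}")
print(f"Stack of untested die:   {untested:.3f}")
```

Under these assumptions, every added die or bonding step multiplies in another factor below one, so losses compound across the stack even when each individual die yields well.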

How much lower power, and how much higher performance, no one knows for sure. And of particular concern is that no one knows who’s responsible if something goes wrong. While the chain of responsibility is improving and is better defined than it was even six months ago, there are still gaps.

“The interposer was via middle, which puts EDA tool responsibility on the OSATs for abstraction support,” said Steve Eplett, design technology and automation manager at Open-Silicon. “The foundry is saying they stop processing here.”

With via middle, the foundry drills the via into the wafer and adds the necessary metal layers. The OSAT, meanwhile, does the backside grind and determines how tall the via is. From there the process normally would account for the parasitics, but the foundries aren’t necessarily forthcoming with that information. Until the kinks are worked out and the business model solidifies, companies are getting proprietary about who owns what because the technology is so new. So the handoff, as well as the flow of information that accompanies that handoff, isn’t well defined.

“This is the first time that pieces of silicon are not the responsibility of one company,” said Eplett. “And as we work through issues, there are some surprises. For example, with a test chip we’ve been developing there are a few things where companies said they will not support us. They’re not deal breakers, but they do require us to do more work.”

The upside of all of this is that experimentation with a revolutionary approach to chip architecture will produce some revolutionary ideas about how to build these devices, how to manage the supply chain more efficiently, and how to maximize performance and minimize power. The rule book for stacked die hasn’t been written yet, which is both good and bad.

As those rules become more hardened, the number of choices will become more manageable, but experimentation will go down. Most companies believe that will result from a mass-produced design with hundreds of millions of units, even though one of the real advantages of stacked die is the ability to reach very specific markets quickly using stacked-die platforms.

STMicroelectronics, for one, has been working with partners to deliver 2.5D and 3D chips, most recently including Wide I/O that connects a fast access memory using microbumps and TSVs to a multi-core application processor.

“There are several different challenges to anticipate, and one of the objectives of the 3D device was to help solve them,” said Laurent Le-Pailleur, the 32nm/28nm Technology Line Management director for front-end manufacturing and process R&D at ST. “There are purely technical aspects: architecture, CAD flow, test engineering, process bricks and assembly, but some are also financial, such as the associated business model between end customers, system-on-chip and memory providers. In terms of a roadmap, heterogeneous multi-die stacking has been produced at ST for more than 10 years, with multiple wire-bonding techniques at that time. It is now getting quite important across multiple application domains, from MEMS to imagers.”

Still, the supply chain needs to be reconfigured to deal with some of these changes, and it remains to be seen how that reconfiguration will play out from the standpoint of who’s calling the shots and where future consolidation and startups will be. This is just the beginning of a major shift, and the pieces are only starting to align.