Stacked die approach is steadily gaining share, but there are lots of surprises as to where, how and why.
A decade ago top chipmakers predicted that the next frontier for SoC architectures would be the z axis, adding a third dimension to improve throughput and performance, reduce congestion around memories, and reduce the amount of energy needed to drive signals.
The obvious market for this was applications processors for mobile devices, and the first companies to jump on the stacked die bandwagon were big companies developing test chips for high-volume devices. And that was about the last serious statement of direction that anyone heard involving stacked die for a few years.
What’s becoming apparent to chipmakers these days is that stacked die isn’t fading away, but it is changing. TSMC and Samsung reportedly are moving forward on both 2.5D and 3D IC, according to multiple industry sources, and GlobalFoundries continues its work in this area, a direction that will get a big boost with the acquisition of IBM’s semiconductor unit.
“It’s still very definitely an extreme sport for some of our customers,” said Jem Davies, ARM fellow and vice president of technology. “It looks really exciting and it’s possibly the case that somebody who gets this right is going to make some significant leaps. There are some physical rules in this, though. If you can reduce the amount of power dissipated between a device and memory, or if you’ve got multiple chips, the amount they dissipate talking between them is a lot. If you’ve got a chip and memory, the closer you can get these things to each other, the faster they’ll go and the less power they will use. The Holy Grail here is in sight. There are a number of technologies that we’re seeing people looking at using.”
Full 3D stacking with through-silicon vias has gained some ground in the memory space, notably with backing from Micron and Samsung for the Hybrid Memory Cube. But the real growth these days is coming in 2.5D configurations using a “die on silicon interposer” approach, which leverages the interposer as the substrate on which other components are added, and to a lesser extent interposers connecting heterogeneous and homogeneous dies. Typically these packages are being developed in lots of fewer than 10,000 units, but enough designs are being turned into production chips that questions about whether this approach will survive are becoming moot.
“People are becoming a lot more comfortable with this technology,” said Robert Patti, CTO at Tezzaron. “Silicon interposers are much more readily available. You can get them from foundries. We build them. And there are some factories now being built in China to manufacture them.”
One of the key factors in making this packaging approach attractive is the price reduction of the interposer technology. Initial quotes several years ago from leading foundries were in the $1 range for small interposers. The price has dropped to 1 to 2 cents per square millimeter for interposer die.
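As a rough illustration of what per-square-millimeter pricing means for a single interposer die, the sketch below multiplies an area by the quoted 1-to-2-cent range. The function names and the example die areas are hypothetical and are not taken from any foundry quote; only the 1-to-2-cent figure comes from the text above.

```python
def interposer_cost_usd(area_mm2, cents_per_mm2):
    """Return the cost in US dollars of an interposer die of the given area."""
    if area_mm2 < 0 or cents_per_mm2 < 0:
        raise ValueError("area and price must be non-negative")
    return area_mm2 * cents_per_mm2 / 100.0


def cost_range_usd(area_mm2, low_cents=1.0, high_cents=2.0):
    """Return (low, high) cost in dollars for the 1-to-2-cent price range."""
    return (interposer_cost_usd(area_mm2, low_cents),
            interposer_cost_usd(area_mm2, high_cents))


if __name__ == "__main__":
    # Hypothetical interposer areas in square millimeters (assumed values).
    for area in (100, 400, 800):
        low, high = cost_range_usd(area)
        print(f"{area} mm^2: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, a 400-square-millimeter interposer would land between $4 and $8, before assembly, test and yield loss are counted.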
“This is now PCB-equivalent pricing,” said Patti. “This used to be only for mil/aero, but we’re seeing these in more moderate numbers. They’re being manufactured in batches of hundreds to thousands to tens of thousands. We’ve had people look at this for high-end disk drives. The low-hanging fruit in this area is high-bandwidth memories with logic. The focus is on high bandwidth, and power comes along for the ride.”
What’s less clear is whether other interposer approaches, such as Intel’s Embedded Multi-die Interconnect Bridge, which is available to the company’s 14nm foundry customers, and organic interposers, which are more flexible but at this point more expensive, will be price competitive with silicon interposers.
But cost is a relative term here, and certainly not confined just to the cost of the interposer. The semiconductor industry tends to focus on price changes in very narrow segments, such as photomasks, while ignoring total cost of design through manufacturing. That’s true even for finFETs, where the focus has been on reduced leakage current rather than big shifts in thermal behavior, particularly at 10nm and beyond.
HiSilicon Technologies, which designs production 2.5D chips for Huawei, submitted a technical paper to the IEEE International Reliability Physics Symposium in April that focuses on localized thermal effects (LTE), which can affect everything from electromigration to chip aging. The paper identifies thermal trapping behavior as one of the big problems with finFETs at advanced nodes, saying that the average temperature of finFET circuits is lower due to less leakage, but “temperature variation is much larger and some local hot spots may experience very high self-heating.”
Planning for LTE isn’t always straightforward. It can be affected by which functions are on—essentially who’s using a device and what they’re doing with it—and how functions are laid out on the silicon. And it can be made worse by full 3D packaging, because thermal hot spots may shift depending upon what’s turned on, what’s dark, and how conductive certain parts of the chip are.
“The problem there is thermal hot spot migration,” said Norman Chang, vice president and senior product strategist at Ansys. “At a 3D conference one company showed off a DRAM stack on an SoC. The hotspot was in the center of the chip when it was planar, but once the DRAM was added on top, the hot spot moved to the upper right corner. So the big issue there is how you control thermal migration.”
Chang noted that thermal gradients in 2.5D are comparable to those in planar architectures, where keeping 75% of the silicon dark at any given time to meet power constraints generally keeps the chip cool enough to avoid problems.
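One way to picture the 75% dark-silicon figure is as an area budget: at any moment, the blocks that are switched on should cover no more than a quarter of the die. The sketch below checks that condition. It assumes the budget is measured by area, which the article does not specify, and the block names and areas are hypothetical.

```python
def active_fraction(block_areas_mm2, active_blocks):
    """Return the fraction of total die area that is currently powered on."""
    total = sum(block_areas_mm2.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    # Use a set so a block listed twice is not counted twice.
    active = sum(block_areas_mm2[name] for name in set(active_blocks))
    return active / total


def within_dark_budget(block_areas_mm2, active_blocks, dark_fraction=0.75):
    """Return True if at least dark_fraction of the die area is switched off."""
    return active_fraction(block_areas_mm2, active_blocks) <= 1.0 - dark_fraction


if __name__ == "__main__":
    # Hypothetical floorplan, areas in square millimeters (assumed values).
    blocks = {"cpu": 20, "gpu": 30, "modem": 10, "dsp": 10, "io": 10}
    print(within_dark_budget(blocks, ["cpu"]))           # 20 of 80 mm^2 active
    print(within_dark_budget(blocks, ["cpu", "modem"]))  # 30 of 80 mm^2 active
```

A check like this says nothing about where the heat ends up, which is the hot spot migration problem Chang describes for full 3D stacks.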
Time to market
Cost is a consideration in the time it takes to design and manufacture stacked die, as well. One of the initial promises of stacked die—particularly 2.5D—was that time to market would be quicker than a planar SoC because not everything has to be developed at the same process node. That hasn’t proven to be the case.
“The turnaround time is longer,” said Brandon Wang, engineering group director at Cadence. “If you open a cell phone today and look at what’s inside, there are chips all over the board. That requires glue logic, so what happens is that with the next generation companies look at how much they can get rid of. With a silicon interposer you can fit chips into a socket easier. It takes longer, but it helps people win sockets.”
One thing that has helped is that the heterogeneous 2.5D market is mature enough for design teams to have some history about what works and what doesn’t. Over time, engineers get more comfortable with the design approach and it tends to speed up. That same trend is observable with double patterning and finFETs, where the initial implementations were much more time consuming than the current batch of designs. Whether it will ever be faster than pre-characterized IP on planar chips is a matter of debate. Still, at least the gap is shrinking.
But there also are some distinct advantages on the layout side, particularly for networking and datacom applications. While designs may not be quicker at this point, they are cleaner.
“Complex timing sequences and cross-point control are where the real benefits of 2.5D show up,” said Wang. “Cross point is a signal that crosses I/O points, and the tough thing about cross point is data congestion. By going vertical you provide another dimension for a crossover bridge.”
Testing of 2.5D packages has proved to be straightforward, as well. Full 3D logic-on-logic testing has required a highly convoluted testing strategy, which was outlined in the past by Imec. And more recently, the push for memory stacks on logic has resulted in other approaches. But with 2.5D it has been a matter of tweaking existing tools to deal with the interposer layer.
“You can still do a quick I/O scan chain and run tests in parallel, so there is not a large test time,” said Steven Pateras, product marketing director for test at Mentor Graphics. “You can access the die more easily. The only complication is with the interconnect, and that’s pretty much the same as an MCM (multi-chip module). That’s well understood.”
While stacked die pushes slowly into the mainstream, there are a number of other technologies around the edges that could either improve its adoption or slow it down. Fully depleted SOI is one such technology, particularly at 22nm and below, where performance is significantly faster than at 28nm and where operating voltage can be dropped below the voltage in 16/14nm finFETs.
CEA-Leti, for one, has bet heavily on three technology areas: FD-SOI, 2.5D, and monolithic 3D, according to Leti CEO Marie-Noëlle Semeria. “We see the market going in two ways. One will be data storage, servers and consumer, which will need high performance. That will be a requirement (and the opportunity for 2.5D and 3D). Another market is the broad market for IoT, which still has to be better defined. That will include the automotive and self-driving market, medical devices and wearables. For that you need technology with very good performance, low power and low cost. FD-SOI can answer this market.”
Others are convinced that stacked die, and in particular 2.5D, can move further downstream as costs drop and more companies become comfortable working with the technology.
“Right now, 2.5D is at the server and data center level, and it will certainly be in more servers as time goes on,” said Ely Tsern, vice president of the memory products group at Rambus. “But we also see it going forward as manufacturing costs drop and yield increases.”
That’s certainly evident at the EDA tool level, where companies are doing far more architectural exploration than in the past. But whether that means more 2.5D or 3D designs, and how quickly that shift happens, is anyone’s guess.
“Right now there is interest in exploring multiple architectures that could change overall designs,” said Anand Iyer, director of marketing for the low power platform at Calypto. “The big question people are asking is how you save power and keep the same performance level. 2.5D is one way to reduce power, and it’s one that many people are comfortable with. MCMs existed before this and people are quite familiar with them. The new requirement we’re seeing is how to simulate peak power more accurately. There are more problems introduced if power integrity is not good.”
Iyer noted that in previous generations, I/O tended to isolate the power. At advanced nodes and with more communication to more devices, power integrity has become a challenge. 2.5D is one way of helping to minimize that impact, but it’s not the only way.