Stacked die and fan-outs are rapidly gaining steam, but not all the pieces are in place to make this a seamless process. Here’s what’s changing and what’s still missing.
A handful of big semiconductor companies began taking the wraps off 2.5D and fan-out packaging plans in the past couple of weeks, setting the stage for the first major shift away from Moore’s Law in 50 years.
Those moves coincide with reports from chip assemblers and foundries that commercial 2.5D chips are now under development. There have been indications for some time that this trend is building. Equipment makers have been talking with analysts about how advanced packaging will affect their growth plans. After almost a year of delays, high-bandwidth memory was introduced into the market earlier this year. And foundries and OSATs have announced that 2.5D chips are now in commercial production, with many more on the way.
Still, the process is far from smooth. It’s not that chips can’t be built using interposers or microbumps or even bond wires to include more of what used to be on a PCB in a single package. But next to the supply chain for planar CMOS, stacking die is a relative newcomer. The tens of billions of dollars spent on shrinking planar features dwarf the amount that has been spent on packaging multiple chips together, despite the fact that multi-chip modules have been around since the 1990s. Foundry rules are still under development. Some EDA tools and IP are available, but more still need to be optimized for stacked die configurations. And experience in working with these packaging approaches remains limited, even if they are gaining traction.
Nevertheless, chipmakers, IP vendors, packaging houses and foundries are pitching a different story than they were at the beginning of the year. Most now have some sort of advanced packaging strategy in place—a recognition of just how expensive it has become to develop chips at 16/14nm, 10nm and 7nm, and how much business they’ll leave on the table if they don’t recognize many chipmakers won’t go there.
Marvell, for example, has just begun rolling out a 2.5D architecture called MoChi, which it describes as a “virtual SoC,” with the first LEGO-like modules to be added throughout the remainder of 2015 using internally developed interconnect technology.
“The problem is not just cost anymore,” said Michael Zimmerman, vice president and general manager of Marvell’s connectivity, storage and infrastructure business. “It’s the total development effort measured in dollars and years. There are not many suppliers that can justify spending billions of dollars, the time it takes to get these chips to market, and the resources required to make that happen. The goal is to restore reasonable time-to-market by going in a reverse direction. Instead of massive integration, you can break the chip into parts and separate the problems into modules. That allows the pace of innovation in each die to be separate from other die.”
He noted that initially there was a lot of skepticism about the approach, but in the past few months that skepticism has evaporated. “When you consider that the interconnect is 8 gigabits per second for one serial connection, and you can put 25 wires in a 1mm space, that means you can have up to 50 gigabits per second die to die with latency of 8 nanoseconds.”
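The figures in that quote can be related with simple arithmetic. The quote does not say how many wires make up one serial connection, so the sketch below treats that as a parameter. The value of four wires per link (a differential pair in each direction) is an assumption made for this illustration, not a published MoChi detail, and the function and constant names are hypothetical.

```python
def bandwidth_per_mm_gbps(wires_per_mm: float,
                          gbps_per_link: float,
                          wires_per_link: int) -> float:
    """Aggregate die-to-die bandwidth per millimeter of die edge, in Gbps."""
    if wires_per_link <= 0:
        raise ValueError("wires_per_link must be positive")
    # Number of serial links that fit in one millimeter of die edge.
    links_per_mm = wires_per_mm / wires_per_link
    return links_per_mm * gbps_per_link


WIRES_PER_MM = 25    # from the quote
GBPS_PER_LINK = 8    # from the quote
WIRES_PER_LINK = 4   # ASSUMPTION: differential pair per direction, not stated in the article

print(bandwidth_per_mm_gbps(WIRES_PER_MM, GBPS_PER_LINK, WIRES_PER_LINK), "Gbps per mm")
```

Under that assumption the result lines up with the 50 gigabits per second cited in the quote. A different wiring scheme would scale the figure accordingly.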
Similar stories are being repeated more frequently across the industry. ASE Group has been working with AMD since 2007 to bring 2.5D packaging to market.
“We had a cost issue with the interposer,” said Michael Su, the AMD Fellow in charge of die stacking design and technology. “But we have managed to decrease that to a better price point. Two years ago, the technology was still in the development stage. Since then we’ve decreased the number of features, added yield learning and now there are multiple players making interposers.”
The result is a graphics card for the gaming market that is 40% shorter (small enough to fit on a six-inch PCB), runs 20°C cooler at 75°C instead of 90°C, and is 16 decibels quieter. It also offers a 2X performance increase over previous versions based on GDDR5 and twice the density, which allows system makers to turn up performance in other parts of the system without exceeding the power budget. And with interposers now available commercially from most of the major foundries, Su said the prices will continue to decline.
Put in perspective, though, this was not a trivial project. It took eight years of ironing out the kinks and thousands of iterations of chips to get to that point—and a huge investment by both AMD and ASE.
“There are 240,000 bumps that we needed to connect together,” said Calvin Cheung, vice president of business development and engineering at ASE. “You have to make sure every one of those is connected. We also had to select the right materials and equipment, and to figure out how to pick the right piece of equipment.”
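A rough calculation shows why that bump count is so demanding. If every bump must connect and failures are treated as independent (a simplifying assumption), the chance that an assembly has no open connections is (1 - p) raised to the number of bumps, where p is the per-bump failure probability. The sketch below uses the 240,000 figure from the quote. The failure rates are hypothetical values chosen only for illustration, not ASE or AMD data.

```python
import math


def all_bumps_connected_probability(bump_count: int,
                                    per_bump_failure_rate: float) -> float:
    """Probability that every bump connects, assuming independent failures."""
    if not 0.0 <= per_bump_failure_rate < 1.0:
        raise ValueError("per_bump_failure_rate must be in [0, 1)")
    # (1 - p) ** n, computed via logarithms for numerical stability at tiny p.
    return math.exp(bump_count * math.log1p(-per_bump_failure_rate))


BUMP_COUNT = 240_000  # figure quoted by ASE

# HYPOTHETICAL per-bump failure rates, for illustration only.
for rate in (1e-5, 1e-6, 1e-7):
    probability = all_bumps_connected_probability(BUMP_COUNT, rate)
    print(f"failure rate {rate:.0e}: {probability:.1%} of assemblies fully connected")
```

Even a one-in-a-million failure rate per bump would leave a noticeable share of assemblies with at least one open connection, which is consistent with the emphasis on materials and equipment selection.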
Cheung noted that many of the costs involved depend on volume, meaning prices will drop once volume increases and yield, materials and architectural design are mature enough. But he added that the value of integrating different components on multiple die cannot be overstated, because it provides the flexibility to target multiple market segments with minimal effort and time.
IBM Microelectronics has been working on this technology for at least a decade, as well. It is now part of GlobalFoundries, and the combined company is shipping 2.5D and full 3D IC parts using through-silicon vias. Gary Patton, the company’s CTO, is watching similar trends unfold. “We’re definitely seeing an uptick in requests for quotes for 2.5D solutions,” he said. “As the volumes increase, it helps drive the cost down. And then people see it’s shipping and start to realize this is real and they can use it.”
The story is much the same at TSMC. “The vector continues for the high-performance camp,” said Tom Quan, one of the foundry’s directors. “It offers better bandwidth, and for the consumer market it can be done in high volume with low cost. Some of this will use a silicon interposer. But even if you do away with the PCB and put it all in a package, you get better results.”
TSMC’s offerings in this area come in two flavors, a fan-out technology it calls InFO (integrated fan-out) and a full 2.5D approach it calls CoWoS (chip-on-wafer-on-substrate). Quan said the advantage of CoWoS is that it can integrate the highest-performance die using the latest technology with analog sensors built on older technologies. “This is a big market. It includes IoT, automotive, and high-performance computing. CoWoS will address the high-performance needs, InFO will address the other two.”
The first version of TSMC’s plan is expected to roll out in 2016. There are a couple of other iterations planned for InFO, including through-mold vias and through-InFO vias.
View from the trenches
All of this makes it sound as if the road to stacked die is fully paved, but companies developing these chips are finding that not everything is in place yet.
“If you look at planar silicon, from GDSII to the mask shop, there are well-defined specs,” said Mike Gianfagna, vice president of marketing at eSilicon. “If you pass the requirements, which are standard, then the downstream supplier can make the chip. That’s still missing in 2.5D. If you have warping problems, contact problems or yield issues, you don’t know that up front. And if the chip fails, you have to sign waivers that it’s your risk, not someone else’s.”
Gianfagna said what’s particularly troublesome is testing of the interposer. “We don’t have rules for that. It’s good enough to create a design, and you can build it into the cost of designs, but we’re still one to two years away from getting the benefit of yield learning and analysis so you can get chips out that are cheaper, more efficient and more reliable. This is still a big step forward, though. In the past we weren’t sure whether we could build it or that it would yield. We’re now beyond that, and a growing number of companies want to be out in front with this.”
The first companies to fully tackle these issues were DRAM manufacturers, which have been combining memory modules vertically to save space and reduce the distance signals need to travel. The Hybrid Memory Cube (HMC) and high-bandwidth memory (HBM) are both now fully tested and in commercial use.
“By increasing the density you get more performance, compared with more DIMM slots, which makes your system performance go down,” said Lou Ternullo, product marketing director at Cadence. “Customers are all asking for 3D support because they want to be ready.”
The big difference between the HMC and HBM is the interface. HBM uses microbumps, which means that today the only way to connect that to logic is through an interposer. So far there has not been much adoption outside of the graphics market, but Ternullo said that by the end of the year there should be about a half-dozen chips using HBM.
What changes in 2.5D and 3D is that manufacturers take on more of the ecosystem role to overcome the known-good-die problem. Several sources say this is particularly important for HBM because, unlike conventional DRAM, it cannot be put through temperature cycles for testing. It has to be tested through the interface once the 2.5D package is completed, and the only way to do that is with built-in self-test (BIST).
Planning for 2.5D
Some of the big changes involve mindset, as well. Power and security now need to be part of the up-front architecture of any chip, whereas in earlier generations they were an afterthought. The same is true for 2.5D. Engineering teams need to decide up front how they will test the components in a 2.5D configuration, understand how certain IP will yield compared with other IP, and understand the interactions of analog and digital chips even if they aren’t on the same die.
“Design for test is one of the critical areas that needs to be considered,” said Asim Salim, vice president of manufacturing operations at Open-Silicon. “Proving microbumps has been a challenge for us. We now have some solutions. But two to three years ago we had to educate people that this is even needed.”
Integrating analog is another issue, and it varies greatly from one package to the next. Salim said that if an A-to-D converter is used to connect to other modules, for example, it will require a different kind of testing than if it’s connected to the ball grid array on the package. The former requires a power-on self-test, while the latter can use external test. Testability is one of the key areas, and getting it wrong can both increase the cost of the design and decrease its reliability.
Another area that needs to be considered up front is I/O coherency and what can be done with new architectural approaches. What’s possible with multiple die is more than what’s possible on a single die. “You can make two die behave like one die,” said Marvell’s Zimmerman. “You also can connect multiple cores on different die and turn it into many cores on different die.”
Full 3D-IC architectures are still expected to take at least a couple more years before they are in commercial use, according to a number of companies, although industry sources say work has begun there as well. That problem is even tougher to solve, but compared with moving to 5nm and 3nm, it may be a toss-up as to which approach is more difficult.
As with many developments in the semiconductor industry, when the entire supply chain begins changing direction, the pieces line up quickly.
“We are really in the middle of the shift to 2.5D,” said Wally Rhines, chairman and CEO of Mentor Graphics. “That will drive tools to do the integration of the chip and the package.”
It also will drive new opportunities for companies that have bet on this technology as it matures and becomes much more flexible, with innovations occurring at the module level rather than across an entire chip.