Better standards, 3D DFT, and next-generation probes are a great start toward fully testing these complex systems.
Improved testability, coupled with more tests at more insertion points, is emerging as a key strategy for creating reliable, heterogeneous 2.5D and 3D designs with sufficient yield.
Many changes need to fall into place to make side-by-side 2.5D and 3D stacking approaches cost-effective, particularly for companies looking to integrate chiplets from different vendors. Today, nearly all of the chiplets being used in multi-die systems or packages are developed by a single company. But as chipmakers move to leverage third-party chiplets developed by a multitude of vendors, they will need a combination of new design tools, equipment, and methodologies. [1]
“With chiplets, you have a reconstituted wafer, or they can use a die handler and test the parts. But they need to do some high-power testing, some at-speed testing, along with functional testing,” said Dave Armstrong, principal test strategist at Advantest. “It’s really beneficial to test the parts after thinning and after dicing, because you can get chips and cracks and other things. And being able to test at-temperature at the die or wafer level is advantageous.”
Test engineers are turning to a range of enabling approaches in the transition from monolithic SoCs to chiplet-based systems, including new test standards, additional test insertion points, and cost modeling.
From monolithic to multi-chip test
High-density interfaces, such as the Universal Chiplet Interconnect Express (UCIe), pave the way for connecting various chiplets together. But leveraging those for testing purposes is a work in progress.
“UCIe is a start,” said Ken Lanier, business development manager at Teradyne. “It allows you to get data from one die to another efficiently. I am not sure there is consensus on what you would do with all the data sent on the bus, though. There is some idea that there is a need to do a silicon check — like BiST or Scan — using the bus. The challenge is getting two different silicon providers to agree on what happens when their devices are connected together to validate the complete assembly.”
The Heterogeneous Integration Roadmap also is undergoing revision. “The update will focus heavily on system-level test and data analytics as key drivers for chiplet-based testing,” Lanier said. “When testing individual die, it may be difficult to physically contact TSVs or other small structures. The major challenge going forward is the testability of completed assemblies, and finding faults in a cost-effective way.”
That roadmap increasingly is influenced by the density of interconnects and power. “It used to simply be Moore’s Law and 5nm, 3nm technology and a 1,000-pin BGA,” said Advantest’s Armstrong. “Now, interconnect density per millimeter and power connect density per millimeter have become ways to address the range of what the industry is doing, and that’s a big change. It allows us to make real progress.”
Fig. 1: Fundamental testing differences for heterogeneous integration requirements, including DFT compatibility, chip observability, and data security. Source: Teradyne
The risk associated with chip failures is compounded in multichip packages. “Each individual chiplet must function as designed after integration into the final package,” said Jay Rathert, senior director of strategic collaborations at KLA. “One failed chiplet — either due to escape of a low reliability die, or because of packaging assembly defects — can bring down the entire device, magnifying the yield loss and financial impact.”
Those kinds of issues are raising concerns across the supply chain. “The fundamental problem is how to get the most optimal implementation that is manufacturable and testable with the least risk and in the most productive way,” said Shekhar Kapoor, senior product director for multi-die systems at Synopsys. “There are existing tools and approaches that can be used to implement multi-die systems. However, to address the increased complexity of heterogeneous integration systems, and to do it efficiently while reducing the risks, requires a more holistic approach. It really boils down to having a comprehensive solution including very diligent architecture design, hardware-software partitioning, chiplet design planning, and integrating manufacturing testability. Other critical considerations include thermal and power delivery earlier in the design cycle.”
That approach also rests on the assumption that all of the chiplets being used are known good dies (KGDs).
New test flows
Fig. 2: Nine potential test moments for a three-chip stack. Source: imec
With monolithic SoCs or chips, engineers typically perform two main tests — one at wafer probe, and the second after assembly and packaging, or package test. However, with multi-chiplet ICs, the number of potential test points can increase significantly (see figure 2).
“In a three-die stacked IC, prior to bonding, wafer test can be performed before thinning or after thinning. Wafer bonding is typically done by thermo-compression, so you want to make sure that your dies have survived the bonding,” said Erik Jan Marinissen, scientific director at imec. “If you don’t do any prebond test, you might have a bad die in your stack. And once they’re stacked, there’s no way to replace them, rendering the entire stack worthless. The key challenges are testing on small microbumps and testing on very thin wafers.”
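The economics behind prebond testing can be sketched with a few lines of arithmetic. In this minimal model, every die and every bonding step must be good for the stack to work, so untested die yield multiplies into the stack yield. The yield numbers are illustrative, not figures from the article.

```python
def stack_yield(die_yields, bond_yield=1.0):
    """Expected yield of a stacked IC: every die and every bonding
    step must be good for the stack to work."""
    y = 1.0
    for dy in die_yields:
        y *= dy
    # each of the (n - 1) bonding steps can also introduce failures
    y *= bond_yield ** (len(die_yields) - 1)
    return y

# Illustrative numbers: three dies at 95% yield, 99% yield per bond.
no_prebond = stack_yield([0.95, 0.95, 0.95], bond_yield=0.99)
# With a perfect prebond test, only known good dies enter the stack,
# so die yield no longer multiplies into the stack yield.
with_prebond = stack_yield([1.0, 1.0, 1.0], bond_yield=0.99)
print(f"no prebond test:   {no_prebond:.3f}")    # 0.840
print(f"with prebond test: {with_prebond:.3f}")  # 0.980
```

Even at these modest die counts, skipping prebond test costs roughly 14 points of stack yield, and the gap widens rapidly as more dies are stacked.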
In imec’s case, wafers as thin as 50µm are used. “For us, 50 microns is the sweet spot,” Marinissen said. “We have thinned down to 10 microns, but then you start to affect the wafer, lose speed on your transistors, for example, and it becomes too thin to handle. If you go too thick, the TSVs will take a lot of time to form and they become too expensive.”
Fig. 3: Large-array, fine-pitch microbumps are tested directly, avoiding the use of dedicated prebond pad tests. Source: imec
Imec also regularly works with both 50µm TSVs in active wafers and 100µm TSVs in interposers. Newer wafer probe stations can handle 300mm wafers on SEMI standard 400mm tapeframes.
Chip stacking
3D-ICs use hybrid bonding, which involves die-to-wafer or wafer-to-wafer connections that deliver up to 1,000X more connections per unit area than copper microbumps. Direct copper interconnects between chips reduce signal delays to negligible levels, while increasing interconnect density by three orders of magnitude compared with 2.5D integration schemes.
Hybrid bonding joins dies to wafers using pick-and-place systems. “When it comes to die-to-die stacks, they’re more difficult to test because you have to place them one by one. Alternatively, we pick-and-place them as a matrix onto a carrier, for example a tapeframe, and include automatic index stepping, but you have to take care of how they are placed. The x-y translation, as well as a small rotation (<2°), can be compensated by adjusting the wafer chuck,” said Marinissen.
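The compensation Marinissen describes is a small rigid-body transform. A minimal sketch of the math (not imec's actual prober software): given a measured x-y translation and rotation of the placed die, the chuck move that undoes it is the inverse transform, a rotation by the opposite angle combined with a rotated, negated translation.

```python
import math

def chuck_correction(dx, dy, theta_deg):
    """Return the wafer-chuck move (cx, cy, rotation in degrees) that
    undoes a measured die placement error: translation (dx, dy) in µm
    and a small rotation theta_deg in degrees.

    If the die landed at p' = R(theta) @ p + d, the correction is
    p = R(-theta) @ (p' - d), i.e. rotate by -theta and translate by
    -R(-theta) @ d. Illustrative sketch only.
    """
    th = math.radians(theta_deg)
    cx = -(dx * math.cos(th) + dy * math.sin(th))
    cy = -(-dx * math.sin(th) + dy * math.cos(th))
    return cx, cy, -theta_deg

# With zero rotation the correction is simply the negated translation.
print(chuck_correction(5.0, -3.0, 0.0))  # (-5.0, 3.0, -0.0)
```

For the sub-2° rotations cited above, the translation part of the correction is within a fraction of a percent of simply negating the measured offset, which is why small rotations are tolerable.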
Probing of microbumps has progressed rapidly (see figure 4). Each layer must be functionally verified before going into the advanced package die stack, and may include up to 4,000 test points called microbumps that lead to data paths, power supplies, and ground. Microbump test points are positioned extremely close to each other, often within 40μm. Each presents a very small probe target, on the order of 15 to 25μm.
Fig. 4: Both probe size and target areas are shrinking to accommodate increasing interconnect density. Source: imec
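The probing margin implied by those numbers is tight. A quick illustrative calculation, assuming a hypothetical 5µm probe-tip contact area (tip size is not stated above):

```python
def alignment_budget(target_um, tip_um):
    """Maximum allowable probe-to-target misalignment in µm, assuming
    the tip must land fully on the target. Illustrative geometry only."""
    assert tip_um <= target_um, "probe tip larger than target"
    return (target_um - tip_um) / 2

# For the 15-25µm targets cited above and an assumed 5µm tip:
print(alignment_budget(15, 5))  # 5.0 -> ±5µm of misalignment tolerated
print(alignment_budget(25, 5))  # 10.0 -> ±10µm tolerated
```

At a 40µm pitch, a misalignment beyond roughly half that budget also risks contacting the neighboring microbump, which is why probe card and prober accuracy are advancing together.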
Enabling standards
In 2011, imec’s Marinissen initiated the standardization effort for 3D-IC test and its DFT. The result, IEEE Std 1838, was introduced in 2020. It enables stacked dies to communicate with testers, using DFT to move test data between and within dies the tester cannot contact directly. Cadence, Siemens EDA, and Synopsys all support 1838-compliant designs, and their tools can automatically insert compliant DFT hardware into chips.
Fig. 5: IEEE Std 1838 provides a consistent test access architecture for stacked dies. The serial control module (SCM) sends configuration instructions into the bottom die. The flexible parallel port (FPP) sends test stimuli into the bottom die and receives test responses at the external interface at the stack bottom. The FPP may be used for scan test data.
The standard (see figure 5) allows a consistent stack-level test access architecture. It has three main elements: the serial control module (SCM), the flexible parallel port (FPP), and a die wrapper register.
“The 1838 standard actually supports not just stacking, but also 2.5D interposer types of designs,” said Adam Cron, distinguished architect at Synopsys and chair of the 1838 working group after Marinissen. “Basically, we needed a way to kind of link the bottom die of the package, if you will, in the context to the tester or the board to the rest of the stack. And so IEEE Std 1838 leveraged 1149.1, 1687, 1500, and other standards that people know about to make the link as painless as possible. Synopsys supports IEEE Std 1838, adding the hardware to chip designs via RTL, and it’s our belief — and probably the community’s belief — that DFT is not really the place where designer creativity needs to shine. It’s really a place for interoperability and integration automation.”
Design tools have been created or modified to support the standard. “Some other tools, for example the 3DIC Compiler, can use the fact that there is 1838 in different die to actually check that those dies are connecting the right ports for 1838 connectivity, which is a powerful feature,” Cron said. “It also can leverage the physical information to generate fault lists for ATPG to do the die-to-die interconnect, even if there’s combinational logic in the pathway.”
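The die-to-die fault-list generation Cron describes can be illustrated with a toy model. This sketch enumerates stuck-at faults per interconnect net plus bridge faults between physically adjacent nets; the net names and output format are invented for illustration and are not any vendor's ATPG format.

```python
def interconnect_fault_list(nets):
    """Enumerate stuck-at faults for each die-to-die net, plus bridge
    faults between nets listed as physical neighbors.

    nets: {net_name: [names of physically adjacent nets]}
    Returns a list of (net_or_pair, fault_type) tuples.
    """
    faults = []
    for net in nets:
        faults.append((net, "stuck-at-0"))
        faults.append((net, "stuck-at-1"))
    seen = set()
    for net, neighbors in nets.items():
        for nb in neighbors:
            pair = tuple(sorted((net, nb)))
            if pair not in seen:      # count each bridge pair once
                seen.add(pair)
                faults.append((pair, "bridge"))
    return faults

# Three adjacent lanes of a die-to-die interface (hypothetical names).
lanes = {"d0": ["d1"], "d1": ["d0", "d2"], "d2": ["d1"]}
for fault in interconnect_fault_list(lanes):
    print(fault)
```

A real flow would derive the adjacency from the physical design data, which is exactly the connection to layout information that Cron highlights.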
Others point to similar benefits. “IEEE 1838 provides a great mechanism for die-level test access by using the SSN bus (FPP) and IJTAG (PTAP/STAP). The other main benefit of IEEE 1838 is the mechanism for improved coverage with die-to-die scan capture,” said Joe Reynick, technology enablement engineer in the Tessent Division at Siemens EDA.
“IEEE 1838 is flexible enough to allow usage of technologies such as IJTAG and Streaming Scan Network (SSN) that greatly simplify access and reduce test cost,” added Jay Jahangiri, director of product management for DFT in Siemens’ Tessent Division. “There can be very significant test volume and cost if the DFT engineer runs tests on the full 3D stack for package test. However, some vendors have the ability to ‘graybox’ everything in the chip, to quickly generate die-to-die scan vectors with low test volume for the whole stack. With this, engineers can quickly retarget pre-existing chiplet ATPG vectors through the SSN bus.”
Reynick pointed to another concern involving chiplets from different companies. “How do we support traditional compression/scan in the middle die, and SSN in the other dies? In this case, we can use the SSN bus as just a simple DFT FPP bus to connect with the traditional compression I/Os in that middle die. All the timing IPs, like pipelines, are in the SSN bus. We can bypass SSN in entire chiplets to get to that middle die with the FPP. We can also control test modes and test setup through the PTAP and STAP. There is room for more additions to 1838, like a common FPP bus interface width for all chiplet vendors. A common chiplet interface is also a functional issue that needs to be further resolved.”
Cron emphasized that when it comes to standardizing interfaces, as with UCIe and others, DFT should be considered, as well. “The folks who generate the standards for interfaces that are die-to-die might focus a little bit on the DFT, as well, in devising the standard, because that’s typically not where they spend their quality time,” he said. “They’re really focusing more on how to get the high-speed functionality. But nothing’s going to go across that interface before it’s tested, and it actually proves performance.”
Another goal is to make DFT transparent to the user. “Synopsys uses physical information, for example, for a lot of its DFT generation like scan chain ordering to reduce congestion. It’s important to make sure test points don’t clog up the works, such as SLM IP location and connectivity,” added Cron. “You just want to make sure that when you put that silicon lifecycle management and measurement IP down, that you don’t clog things up functionally. And make sure that it’s useful out in the field, not just in manufacturing.”
Some of the most critical challenges with DFT guidance have to do with the 3D effects of integrating chiplets. “Ultimately, we have to add significantly more work to verify an assembled circuit of chiplets than with a single die,” said John Ferguson, director of Calibre nmDRC at Siemens Digital Industries Software. “We have to ensure each chiplet is aligned in three dimensions, and that designed circuitry matches the intended electrical behavior. Also, we historically know how to account for the delays in wires within a chip by extracting the ‘parasitic’ elements. But how does one account for potential parasitic coupling between stacked dies?”
Establishing known good die
Test and yield engineers are stepping up efforts to ensure KGD at wafer probe. For instance, Advantest and PDF Solutions offer dynamic parametric test, designed for early detection of out-of-spec parameters, which automatically triggers additional testing while the wafer is still at the probe station.
Adaptive test is implemented using a rules-based engine on Advantest’s V93000/SMU8 tester integrated with PDF Solutions’ Exensio software. “I’m going to collect information on additional sites near the anomaly, so I may test all the surrounding structures around the location of interest. Or I may collect additional measurements above and beyond what I would do normally,” said Ken Butler, senior director of business development at Advantest. “Maybe I test at one voltage normally. Now I’m going to sweep 10 voltages and collect a bunch of additional data, because the idea is that if you have an anomalous response, the material could be maverick material. You want to be able to quickly get to the root cause.”
Fig. 6: The V93000 Dynamic Parametric Tester (DPT) uses PDF Exensio DPT to trigger a revised test recipe when an out-of-spec parameter is detected with the goal of quickly performing root-cause analysis. Source: Advantest
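The trigger-and-expand behavior described above can be sketched in a few lines. The rule and the device model here are hypothetical, not PDF Solutions' Exensio logic: measure once at nominal, and if the result sits too close to a spec limit, sweep additional voltages and keep the extra data for root-cause analysis.

```python
def needs_retest(value, lo, hi, guardband=0.1):
    """True if a measurement is out of spec, or in spec but inside the
    guardband near either limit (hypothetical rule for illustration)."""
    span = hi - lo
    return not (lo + guardband * span <= value <= hi - guardband * span)

def adaptive_test(measure, voltages, lo, hi):
    """Measure once at the nominal (middle) voltage; if the result is
    anomalous, sweep the full voltage list and collect extra data."""
    nominal = measure(voltages[len(voltages) // 2])
    result = {"nominal": nominal}
    if needs_retest(nominal, lo, hi):
        result["sweep"] = {v: measure(v) for v in voltages}
    return result

# Toy device model (assumed): leakage grows linearly with supply voltage.
leakage = lambda v: 0.95 * v
result = adaptive_test(leakage, [0.6, 0.8, 1.0, 1.2, 1.4], lo=0.0, hi=1.0)
```

Here the nominal reading of 0.95 is within spec but inside the upper guardband, so the engine automatically collects the full five-point sweep while the wafer is still on the probe station.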
Advantest is creating an open ecosystem for vendors to build analytics or other tools. “A lot of the previous solutions in the space were closed systems, where if you need any new capability or new analytics, you always have to go back to the single supplier to do that,” Butler said. “That’s difficult and cumbersome, for a lot of reasons. We’re trying to make it easy for people to be able to bring to bear their own solutions.”
Data security must be built into such an open platform. For example, Advantest offers ACS Edge, a secure analytics compute engine that handles demanding workloads. “Most of our customers are very concerned about maintaining security of their data,” said Butler. “They’re running high volumes of devices in a third-party contract test-house-type operation, and they want to protect their IP and the analytics they’ve developed. So the edge server connects to the host controller using a 10 Gb/s high-performance secure link that only those two can talk to. It is deployed with a zero trust implementation, assuring customers have a secure platform on which to confidently deploy the analytics they developed.”
Others are working on similar issues. “Test uncertainty on whether a chiplet is a known good die can result from incomplete functionality, economic tradeoffs on test coverage, or latent defects,” said KLA’s Rathert. “Adding inline defect data from each chiplet’s manufacturing history can help reduce this uncertainty. KLA’s I-PAT (Inline defect Part Average Testing) simplifies the complex fab defect data stream, using machine vision and AI on screening inspection systems to numerically score each chiplet’s risk from defectivity. This enables packaging lines to immediately reject outlier die and enhance test insertion decisions on the remaining chiplets to increase the confidence of designation as a KGD and suitability for inclusion in a package.”
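A simple stand-in for this kind of defectivity screen is a robust outlier test on per-die inline defect counts. This is not KLA's I-PAT algorithm (which scores defects with machine vision and AI); it is a median-based sketch of the general idea, with invented counts.

```python
from statistics import median

def defectivity_outliers(defect_counts, k=6.0):
    """Flag dies whose inline defect count is an outlier versus the
    population, using the median and the median absolute deviation
    (MAD) so that the outliers themselves don't inflate the threshold.
    A sketch of part-average-style screening, not KLA's I-PAT."""
    counts = sorted(defect_counts.values())
    med = median(counts)
    mad = median(abs(n - med) for n in counts)
    limit = med + k * max(mad, 1)   # floor the spread for tiny counts
    return {die for die, n in defect_counts.items() if n > limit}

# Hypothetical per-die counts from inspection: d5 is the maverick die.
counts = {"d1": 2, "d2": 3, "d3": 1, "d4": 2, "d5": 41, "d6": 3}
print(defectivity_outliers(counts))  # {'d5'}
```

The flagged dies would be rejected before packaging, and the borderline ones could be routed to extra test insertions, mirroring the flow Rathert describes.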
X-ray inspection is used for in-line detection of voids and missing connections in 2.5D and 3D devices. “Some of our X-ray products can examine the structure of the connections — are they damaged? Are they bridging? Are they not connected because they were not bonded properly?” said Frank Chen, director of applications and product management at Bruker Nano Surfaces and Metrology. “So X-ray can reveal those hidden structural defects.”
Machine learning (ML) can play a role in ensuring KGD, as well, performing analytics on devices that are not necessarily out of spec but may be demonstrating aberrant behavior that can be examined using algorithms.
“In test, the traditional use has been to optimize test flows and provide process feedback using measurements made at device I/Os during production tests,” said Teradyne’s Lanier. “What makes ML more exciting today is that there is an explosion of available data, mostly driven by new die sensor technology, which allows visibility to, and measurement of, what is happening at different points on the die itself in terms of voltage, current, temperature, and many other parameters. This could aid in tuning device performance and not using devices that might otherwise fail during production test.”
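A minimal version of such a screen flags dies whose on-die sensor readings are statistical outliers relative to the population, even when every reading is within spec. This univariate sketch (with invented sensor names and readings) stands in for the richer ML models used in production.

```python
from statistics import mean, stdev

def sensor_outliers(readings, k=3.0):
    """readings: {die_id: {sensor_name: value}}. Flag dies whose value
    on any on-die sensor sits more than k standard deviations from the
    population mean for that sensor -- in spec, but aberrant. A minimal
    univariate screen, not a production ML model."""
    sensors = next(iter(readings.values())).keys()
    flagged = set()
    for s in sensors:
        vals = [r[s] for r in readings.values()]
        mu, sigma = mean(vals), stdev(vals)
        if sigma == 0:               # no spread on this sensor; skip it
            continue
        for die, r in readings.items():
            if abs(r[s] - mu) > k * sigma:
                flagged.add(die)
    return flagged

# Hypothetical on-die sensor data: d9's supply reading is aberrant.
readings = {f"d{i}": {"temp": 55.0, "vdd": 0.75} for i in range(9)}
readings["d9"] = {"temp": 55.0, "vdd": 1.0}
print(sensor_outliers(readings, k=2.5))  # {'d9'}
```

In Lanier's framing, a die like d9 might pass every production limit yet be excluded (or binned differently) because its behavior diverges from the population.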
Probe technology has advanced from common cantilever probes, which can reach only a limited number of sites, to vertical probe needles, which can probe tight arrays, to MEMS needles. Imec is preparing to evaluate 25µm-pitch probe cards from FormFactor.
System-level test
Even before heterogeneous integration, chipmakers were increasingly adopting system-level test for complex SoCs. “System-level testing is a very real trend in the industry and is grabbing some people by surprise,” said Advantest’s Armstrong. “Industry leaders like NVIDIA, AMD, Intel, and the like have been doing it for years, but now it’s becoming mainstream. And people want to do more testing sooner, which is the real trend here.”
Teradyne’s Lanier pointed to similar benefits. “Our testers offer particular advantages such as better instrument accuracy, which allows our users to use lower measurement guardbands to improve quality and yield,” he said. “Our high instrument resource count availability means we can support higher site counts in order to reduce test costs.”
Cost modeling
Imec co-developed cost-modeling software with Delft University of Technology, called 3D-COSTAR. The software aims to optimize the test flow of 3D stacked ICs, taking into account the yields and costs of design, manufacturing, packaging, test, and even logistics. [2] “You can look at yield and determine how much you can improve your yields by testing, or find out if it’s economical to do a certain test. For example, you can determine whether it makes sense to use more expensive probe cards,” said Marinissen.
3D-COSTAR produces three key analysis outputs — product quality in terms of test escape rate (expressed in defective parts per million), overall stack cost, and a breakdown by cost type. “3D-COSTAR has proven to be a crucial tool to analyze the many complex tradeoffs in 3D test flows, in terms of both cost and DPPM,” said Marinissen.
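The kind of trade-off such a tool evaluates can be sketched with a toy model (this is not 3D-COSTAR itself, and every number below is illustrative): compare the cost per good stack and the defective-stack rate with and without a prebond test of a given coverage.

```python
def flow_cost(n_dies, die_yield, die_cost, stack_cost, test_cost,
              coverage, prebond_test=True):
    """Expected cost per good stack, and defective stacks per million
    assembled, for a simple n-die stack. A toy model in the spirit of
    the cost/DPPM trade-offs discussed above, not the actual tool.
    coverage: fraction of defective dies the prebond test catches."""
    if prebond_test:
        # good dies pass; a (1 - coverage) share of bad dies escape
        pass_rate = die_yield + (1 - die_yield) * (1 - coverage)
        good_die_rate = die_yield / pass_rate
        cost_per_die = (die_cost + test_cost) / pass_rate
    else:
        good_die_rate = die_yield
        cost_per_die = die_cost
    stack_yield = good_die_rate ** n_dies      # all dies must be good
    cost_per_good_stack = (n_dies * cost_per_die + stack_cost) / stack_yield
    dppm = (1 - stack_yield) * 1e6   # escapes before any final stack test
    return cost_per_good_stack, dppm

# Illustrative comparison: 3 dies, 90% die yield, 90% test coverage.
with_test = flow_cost(3, 0.90, die_cost=20, stack_cost=100, test_cost=2,
                      coverage=0.90)
without = flow_cost(3, 0.90, die_cost=20, stack_cost=100, test_cost=2,
                    coverage=0.90, prebond_test=False)
print(f"with prebond test:    cost {with_test[0]:.0f}, dppm {with_test[1]:.0f}")
print(f"without prebond test: cost {without[0]:.0f}, dppm {without[1]:.0f}")
```

With these assumed numbers the prebond flow is cheaper per good stack and ships far fewer defective stacks, but reversing the conclusion only takes cheaper dies or costlier probe cards, which is exactly why a cost model is needed rather than a rule of thumb.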
Conclusion
2.5D and 3D stacked die designs are progressing with new probers, probe needle technology, new standards, and test optimization including cost modeling. As the industry becomes more experienced with chip stacking and testing chiplet-based devices, the list of best practices will grow.
Going forward, experts anticipate more standardization and DFT tool automation to make multi-die chiplets easier and less costly to implement.
“DFT for low power will probably get better integrated,” said Synopsys’ Cron. “We have tools that can predict some power early on and we can leverage that to manipulate the DFT for lower power. But mating the DFT up with power analysis is probably coming. And folks are talking about having spare cores and being able to trim out the bad ones, either at manufacturing or maybe even out in the field. The industry is also migrating to implement more security into the DFT, so implementing safety, security and DFT together is for sure coming.”
References