Advanced Packaging Makes Testing More Complex

Why 2.5D, 3D, and other advanced packaging types are driving new standards and approaches to testing.


The limits of monolithic integration, together with advances in chip interconnect and packaging technologies, have spurred the growth of heterogeneous advanced packaging where multiple dies are co-packaged using 2.5D and 3D approaches. But this also raises complex test challenges, which are driving new standards and approaches to advanced-package testing.

While many of the showstopper issues have been resolved, it is still early days for advanced packaging production. Best practices and lower-cost methodologies will evolve as the technology transitions from use by a select few to broad implementation.

“The advanced packaging market is a very dynamic and high-growth market,” said Subodh Kulkarni, president and CEO of CyberOptics. “While it is a high-end option for specialty applications, we believe it’s well poised to penetrate a number of different applications.”

But it’s not without problems. “Cost of test is the big challenge for production,” said Amy Leong, chief marketing officer at FormFactor. This creates the need for balance between the amount of testing done and the cost of any yield loss.

Several leading-edge applications use advanced packages. “For high-performance computing and high-end gaming processors, including HBM and GPUs, these applications typically include 2.5D and 3D packages,” said Pieter Vandewalle, general manager of KLA’s ICOS Division.

One of the drivers of this technology is the fact that this capability is becoming more widely available through foundries. “Semiconductor foundries are offering these packaging solutions,” said Vivek Chickermane, senior group director for R&D in Cadence’s Digital & Signoff Group. “In the past, only highly-integrated IDMs could do this.”

What is an advanced package?
Advanced packages combine more than one die (and potentially other elements like passives) within a single package. There are, fundamentally, two ways of doing this:

  • Laying dies next to each other on a substrate that supports very fine lines, referred to as 2.5D
  • Stacking dies atop each other, referred to as 3D

2.5D integration involves some kind of substrate, and the specifics of the technology vary by manufacturer. The most commonly discussed substrate is a silicon interposer, because it supports extremely fine lines that can interconnect micro-bumps on a die at 55µm or even 40µm pitch — far closer together than traditional C4 (controlled-collapse chip connection) bumps, which have a typical pitch at or above 100µm.

The challenge with silicon interposers is their cost, because they’re manufactured using typical chip-building fabs or foundries. In addition, they can only be as big as the photolithographic field can expose, although FormFactor noted that TSMC has made some progress using multiple exposures to build a larger interposer. Intel addresses this with its embedded multi-die interconnect bridge (EMIB), and Samsung uses organic “panels.”

3D integration moves up instead of out. Dies are stacked on top of each other. Where stacked face-to-face, micro-bumps or hybrid bonding can make the connection. Where one die attaches to the backside of another, however, through-silicon vias (TSVs) carry the signal from the active area of the die to the backside for connection to the die being stacked atop it. The TSVs carry their own risks: “TSVs have some specific defect mechanisms like cracks, incomplete fill, and pinholes in insulator walls,” said Chickermane.

Fig. 1:  A package with both 2.5D and 3D elements. Source: FormFactor

For a heterogeneous package, one die becomes the test access point for the entire assembly. For a 3D stack that naturally would be the bottom die, because it will contact the package and the outside world. For a 2.5D configuration there isn’t a natural choice, so one die must be given that role. All test signals to all dies will travel through that main die en route to and from the other dies.

The big challenge for testing is to optimize the number of test insertions balanced against the cost of scrapping material further down the line. There is no one right answer to that problem, but it is influenced by many factors.

The challenge of known-good dies
One of the keys to achieving good yields on assembled dies involves the notion of “known-good dies” (KGD) — dies that have passed wafer sort. This idea is driven by a fundamental fact of yields.

“By adding dies into a package, you’re multiplying their yield,” said Rita Horner, senior technical marketing manager at Synopsys. Even with high-yielding dies, if they’re assembled blind, without testing, finished-product yields will be too low to be economic, even though the cost of die testing was saved.
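
The compounding Horner describes is easy to sketch. The numbers below are illustrative assumptions for the example, not figures from the article:

```python
# Illustrative arithmetic only: package yield is the product of the
# yields of the dies assembled into it (assuming independent defects).

def package_yield(die_yields):
    """Multiply per-die yields to get the expected package yield."""
    result = 1.0
    for y in die_yields:
        result *= y
    return result

# Four dies at a seemingly healthy 95% each still leave the
# assembled package at only about 81% yield.
print(round(package_yield([0.95] * 4), 3))  # 0.815
```

This is why assembling untested dies quickly becomes uneconomic as die count grows — the package yield falls geometrically even when each individual die looks high-yielding.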

Fig. 2: The need for pre-bond testing if yields on the die fall below about 86%. Source: FormFactor

Instead, assembly starts with KGD. But it’s not enough to test wafers only at room temperature. “Known-good dies are tested to ensure ‘known good at temperature,’” said Leong.

A bigger challenge for wafer probing comes from the micro-bumps themselves. These are so small and delicate that probing them is difficult. “Probe pins pretty much destroy the micro-bumps,” said Horner.

The micro-bumps also can be hard to access. “The TSVs are like a forest,” said Chickermane. “The peripheral ones are easiest to access.” Even if the micro-bumps are successfully probed, it may be difficult to assemble them reliably afterward.

A common solution to this is to use “sacrificial pads.” These are pads that are larger than the micro-bumps, but they also come with challenges. “The [sacrificial] pads don’t have bumps on them,” said Leong. “So it’s hard to probe through the forest of micro-bumps. As a result, there are design rules establishing about a 50µm keep-out zone around the pads.”

Fig. 3:  Sacrificial pads can be placed among the micro-bumps, keeping enough space around them so that they can be probed. Source: FormFactor

But, of course, you can’t put in a sacrificial pad for each micro-bump. Instead, sacrificial pads are selectively added, using empty space so as not to increase the die area. There are a number of ways to handle this:

  • Focus only on critical functions — particularly those signals that never come outside the package and therefore cannot be tested after assembly.
  • Create a shadow path that duplicates an existing path. The shadow path will be used for testing, under the assumption that it will suffer any failures that the true path will suffer.
  • Use them in a scan configuration or use multiplexers so that you can access multiple micro-bump signals with a single sacrificial pad.

“If you are putting a sacrificial pad there, use it to test as many interfaces as possible,” said Chickermane.

Horner described taking this to the extreme: “The sacrificial pads are mainly used for connections to JTAG interfaces where tests such as memory BiST, logic BiST, scan test, and I/O loopback tests can be performed to validate each of the functionalities in a die.”

If you can’t test every signal using sacrificial pads, then are you challenging the notion of “known good”? This becomes a practical economic issue. “The test costs to guarantee KGD are not typically a viable economic option,” wrote Mike Slessor, president and CEO of FormFactor, in a whitepaper. “Instead, we need economically feasible strategies to ensure ‘Good-Enough Dies.’”

Leong added that KGDs are on a sliding scale. “It always comes down to the balancing act of test coverage to catch higher-probability/impact issues, while accepting risk of lesser issues slipping through to final test.”

If sacrificial pads are used, Leong noted that all micro-bumps should be accessed during characterization. Once the correctness and reliability of the die are known, production can transition to using the sacrificial pads.

It’s also important to consider that dies are deemed “known good” after wafer sort. Even before assembly, dicing the wafer up can introduce cracks and other defects, so it’s important to include those in tests — especially over temperature, which may activate the new faults. In addition, said John O’Donnell, CEO at YieldHUB, “Sometimes the performance of a die will be affected by the others.”

When several KGDs are stacked, they can then be tested. Those that pass are known-good stacks (KGS). Assembling a KGS on a substrate for further 2.5D integration improves the yield of the final unit.

Fig. 4: A typical advanced-packaging test flow. Individual dies are tested first and then again after stacking and assembly. Source: FormFactor

Standards provide known methods
A number of standards help with the challenge of testing multiple dies through the limited interconnect provided by the external package connections. The best known of these is the JTAG (Joint Test Action Group) standard, formally referred to as IEEE 1149.1.

This is a long-standing methodology originally conceived for testing board connections between chips — that is, lines external to the chips themselves. It became popular because it also allowed the testing of internal chip signals through one or more internal scan chains.

Internal testing was formalized in IEEE 1687. IEEE 1500 further supported the testing of blocks within a die by wrapping each block in a test wrapper. That wrapping approach has been further extended in IEEE 1838, which was published in March.

1838 is a combination of JTAG on the “main” die, and die wrappers for the other dies. It includes the notion of a “test elevator” for die stacks. “Use the bottom die to test the middle die and the middle die to test the die above, etc.,” said Chickermane. “A ‘test elevator’ takes test protocols up to the target die.”

Anyone who designs to the IEEE 1838 standard will have guaranteed test access to all of the dies. This makes it easier to get by with very few sacrificial pads. “Through the JTAG interface, it is possible to run loopback tests using a PHY’s internal built-in pattern generator and checker without having access to every I/O pin,” said Horner. “Many PHYs have built-in self-tests, redundancy paths, and on-board scope capabilities that can be accessed through the die’s JTAG interfaces to enable wafer-level tests. Depending on the test methodology used in the die, all blocks could be accessible through the JTAG,” she added. “Test standards such as IEEE 1149, 1500, 1687, and the newly released 1838 enable end-to-end test solutions for a multi-die system in a package.”
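
The daisy-chained access these standards provide can be sketched conceptually. This is a toy model, not an implementation of IEEE 1149.1 or 1838 — the die names and register widths are invented for illustration:

```python
# Greatly simplified model of JTAG-style daisy-chained access: each die
# either sits in a 1-bit BYPASS register or exposes its full data
# register, so the shift length the tester sees depends on which die is
# selected. Conceptual sketch only; names and widths are hypothetical.

def chain_length(dies, selected):
    """Total shift length: the selected die contributes its register
    width; every other die contributes the 1-bit bypass register."""
    return sum(width if name == selected else 1
               for name, width in dies.items())

dies = {"main": 32, "hbm_0": 16, "hbm_1": 16}

# Testing hbm_0: its 16-bit register plus one bypass bit per other die.
print(chain_length(dies, "hbm_0"))  # 18
```

The bypass mechanism is what lets one narrow test port reach any die in the assembly without paying the full shift cost of every register on every access.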

It should be noted that these standards address digital-signal testing, not analog signals. Any analog signals will need special consideration for testing. If sacrificial pads are used, then the pad’s effect on any analog behavior will need to be taken into account.

Advanced packaging design tools and considerations
While standards simplify some of the work for preparing tests, there is still much to consider at design time. Pre-silicon planning and analysis are required to ensure that the post-silicon characterization and testing steps have the best chance at success.

For signals that are not tested in production through sacrificial pads or scan chains, extensive analysis is required to ensure high-quality connections and no electromigration. Output drivers must be analyzed pre-silicon and characterized post-silicon to ensure they’re robust enough to operate reliably in a multi-die package.

For digital test, compressed external vectors are expanded on-die and then generate a signature result that is read out and verified. When preparing single-chip test vectors for a multi-chip test setup, there’s some simple bookkeeping required.

With scan chains, signals in serialized vectors must be positioned so that, once scanned in, all of the signals land in the right position inside the die. Adding other dies to the chain makes that chain longer, and the signals for one die now occupy only part of it. So, at the very least, the test vectors must be “padded out” so that each die’s test bits are scanned into the proper positions.

That seems like it should be a simple process, but if that’s the only accommodation for multiple dies, then each die inside the package will be tested by itself while the other dies wait their turn for testing. Test time and cost could be reduced by testing multiple dies at the same time.

At the very least, that requires merging the vectors from the different dies so that all tests end up in the right positions within all the dies. But in this case, one has to pay close attention to power, noise, thermal issues, and anything else that might make the testing unreliable.
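
The padding and merging bookkeeping described above might be sketched like this — a toy model with invented vectors; real DFT tools automate this:

```python
# Hypothetical sketch of multi-die scan-vector bookkeeping: each die's
# vector is padded with don't-care bits ("X") for the chain positions
# occupied by other dies, and padded vectors can then be merged so
# several dies are tested in the same scan operation.

def pad_vector(die_vector, pad_before, pad_after, fill="X"):
    """Place a per-die vector into the longer package-level chain by
    padding don't-cares on either side."""
    return fill * pad_before + die_vector + fill * pad_after

def merge_vectors(a, b):
    """Merge two equal-length padded vectors: a real bit wins over 'X'."""
    assert len(a) == len(b)
    return "".join(x if x != "X" else y for x, y in zip(a, b))

# Die A occupies the first 3 chain bits, die B the last 4, in a 7-bit chain.
va = pad_vector("101", pad_before=0, pad_after=4)   # '101XXXX'
vb = pad_vector("0110", pad_before=3, pad_after=0)  # 'XXX0110'
print(merge_vectors(va, vb))  # 1010110
```

Padding alone tests one die at a time; merging is what enables simultaneous test — and what raises the power, noise, and thermal questions discussed next.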

Testing often involves the switching of many signals at the same time, so design-time analysis is necessary to make sure the testing of one die doesn’t disturb the simultaneous testing of other dies. “The tools provide information on the tests with the most I/Os switching so that [power/signal-integrity/thermal] analysis can be performed,” said Chickermane. During die design, it may help to play with the clock edges where possible to reduce simultaneous switching.

The test compression used also may matter. “Typically, the architectures of the compression technologies used will dictate whether efficient use of these pattern-porting technologies and top-level resources can be maximized up and down the die stack,” said Adam Cron, principal engineer at Synopsys. “For example, if core-level patterns are being ported to the package top and the compression technology is a streaming-style compression (one requiring continuous data in, while observing continuous data out), then core-level scan ports must be routed directly to top-level resources through pipeline registration. This means that only one core can be tested at a time on a set of top-level scan I/O resources. But a packetized compression scheme can make use of one scan input and one scan output, while testing any number of cores simultaneously.”

Design and DFT tools can help with this process. Some of it has been automated, although the process is still in its infancy, meaning that tools and methodologies are likely to evolve. Some of the companies on the leading edge of this packaging approach have developed internal proprietary ways of doing this. Wider adoption will be aided by opening up these approaches.

One other important consideration is that the different dies in the package may be made by different companies, or their DFT features may come from different EDA companies’ tools with incompatible formats. These are all solvable challenges. There are standard ways of communicating the pinout and test interfaces of different dies. So even if the design specifics remain proprietary, enough information will be available to integrate them into a unified test.

All of this said, these techniques may not work so well for at-speed testing and for analog signals. “People don’t do 100% at-speed testing,” said Leong. Additional manual intervention will be required to handle those considerations.

Trace redundancy and monitoring
One challenge for manufacturing fine-pitch traces is the yield of the traces themselves. The yields will be high, but if the yield is 99% and there are hundreds of thousands of traces on an interposer, then every interposer will have, on average, 1,000 or more failures.
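
The arithmetic behind that estimate is straightforward. The figures below are illustrative, chosen to match the text’s rough numbers:

```python
# With per-trace yield p and n traces, the expected number of failed
# traces is n * (1 - p). Illustrative numbers only.

n_traces = 100_000       # "hundreds of thousands" of interposer traces
trace_yield = 0.99       # 99% per-trace yield
expected_failures = n_traces * (1 - trace_yield)
print(round(expected_failures))  # 1000
```

At these volumes the probability of a fully defect-free interposer is effectively zero, which is why redundancy is designed in rather than treated as an exception.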

The solution to this is to provide redundancy, and this must be considered at design time. There are two fundamental approaches to redundancy:

  • Passive, or bump, redundancy. This provides multiple micro-bumps for a single signal, on the idea that if one micro-bump fails, the others aren’t likely to. “A vast majority of micro-bumps address power/ground or low-density signals like serdes or GP [general-purpose] I/Os,” said Igor Elkanovitch, CTO of Global Unichip, in a proteanTecs webinar. “It’s our practice to duplicate these micro-bumps, usually three to eight for each signal. So the failure of any micro-bump for power, ground, or signals doesn’t cause the chip to fail.” Noam Brousard, vice president of systems at proteanTecs, noted that passive redundancy is likely not possible for PHY signals, which are packed more closely together. “Using three to eight micro-bumps per signal is valid for a power supply, but it is not applicable in the PHY area because of micro-bump congestion limitation. This is a physical limitation that is not related to the [PHY] standard.”
  • Active redundancy. Here a certain number of redundant traces are provided — say, one extra lane for every 16 lanes. If one of the standard lanes fails, the failed signal can be routed to the redundant lane using routing circuitry in the die. In many cases, an entire bank of signals may shift to make this happen. The configuration is then stored in fuses so that the correct routing is achieved for each power-up. “A lot of people put fuses in their chips that are accessible by JTAG,” noted Horner.
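
The shift-based repair described for active redundancy can be sketched as a simple remapping. This is a hypothetical model — the group size, spare position, and shift-by-one behavior are illustrative assumptions, not a specific vendor’s scheme:

```python
# Hypothetical sketch of active (lane) redundancy: one spare physical
# lane per group; when a lane fails, it and every lane after it shift
# onto the next physical lane, and the resulting map would be burned
# into fuses at test (or loaded at boot from stored repair data).

def repair_map(num_lanes, failed_lane):
    """Return the logical-to-physical routing after repairing one
    failed physical lane, with the single spare at index num_lanes."""
    mapping = {}
    for logical in range(num_lanes):
        # Lanes at or past the failure shift up by one physical lane.
        mapping[logical] = logical if logical < failed_lane else logical + 1
    return mapping

# 16 logical lanes, physical lane 5 failed: lanes 5..15 shift up by one,
# and physical lane 16 (the spare) absorbs the last logical lane.
m = repair_map(16, failed_lane=5)
print(m[4], m[5], m[15])  # 4 6 16
```

Shifting a whole bank, rather than remapping a single lane, keeps the routing multiplexers simple — each lane only ever needs to choose between two physical neighbors.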

In many cases, this happens at final test, and the configuration is set by the tester. But monitoring can provide test-like capabilities as the device operates in its application, and such monitoring can detect the degradation of signals over time. It’s therefore possible that a trace could fail not at test, but later down the line.

Because redundancy activations must be stored for future boot-up configuration, it may be necessary to program fuses in the field — and fuses require higher voltages to program. But as Brousard noted, “Lane repair is a stand-alone mechanism. There is no need for external voltage since the voltage converter is already implemented on-die. There is an option to … store the bad lane in an external memory that is located on the system. In this case, the HBM system will load the lane repair data from the external memory each power up.”

In-circuit monitoring can both augment testing during production and maintain an ongoing look at signals once deployed in the field. “We need monitors for the chip/package interface,” said Andy Heinig, group leader for advanced system integration and department head for efficient electronics in Fraunhofer IIS’ Engineering of Adaptive Systems Division. “Copper pillars and C4 bump defects that are present at time zero, but not electrically visible, become reliability failures later. The thermal expansion coefficient differences between silicon, copper, and C4 result in mechanical stress to that interface. So a defect will change during the lifetime.”

This may be hard to catch. “We need something to help identify cracks after production,” said Heinig. “These cracks in the silicon occur from the die-cutting process. These become worse after thermal cycling and during the lifetime of the product. Small cracks in the silicon become larger and can cause a failure, and I view this as a reliability issue that we need to identify sooner. Low-k materials used in advanced process nodes are much more vulnerable to this phenomenon. We see more problems with chips inside the package due to silicon cracking.”

Monitoring can take a wide variety of forms and cover many different parameters. One could, for example, simply look for signal opens, shorts, and bridging. Or one can go further. proteanTecs uses internal monitoring Agents to evaluate the eye diagrams of all signals. If the signal quality starts to degrade, then redundancy can be engaged — even in the field during normal product operation. “We can identify a specific pin that shows marginal performance in mission mode and replace it before it causes system failure,” said Brousard. “We have visibility into degradation per pin and can actually advise what lane to replace with.”

Inspection and traceability
Consideration of the packaging can push back into the silicon processing itself, affecting process control. “Each die must be inspected and tested prior to adding it to the multi-chip package to verify its functionality,” said Vandewalle. “Typical issues can include foreign material, die misalignment during placement, and defects caused by the dicing process.”

Finding these issues is essential. “While certain packaging approaches are gaining more momentum than others, high-precision inspection and metrology is needed for any approach,” said Tim Skunes, vice president of R&D at CyberOptics.

Jim Hoffman, engineering manager at CyberOptics, noted that, “Manufacturers know how warped or deformed a die can be and still mate well with another die. [Inspection can cover] features down to 25 microns, including bump height, ball co-planarity, substrate co-planarity, diameter and shape, relative location, and a variety of other measurements.”

Inspection then becomes yet another part of the economic balancing act. “While adding inspection steps in the process increases the absolute investment, it will reduce the total cost per package, because additional process control will increase the overall yield through the KGD principle, eliminate false rejects, and reduce underkill to avoid customer returns that can result in massive rework and a potential negative impact on the company’s brand,” said Vandewalle.

Some safety-critical applications — notably automotive — require traceability so that if a problem is found in a system after a period of use, the failure can be traced all the way back to the wafer on which a die was built. “What we’re focused on is the ability to track all the parts as they get assembled into a single package,” said Dave Huntley, director of business development for PDF Solutions. Many dies have an ECID (electronic chip ID) that assists with this tracing. The assembly process and test results become part of this trace record. “You need to know every little thing about every little thing that’s been done,” he added.

The SEMI E142 standard ties the position of a die in a package — both x and y locations as well as the z position if stacked — to the x and y locations of that die on its original wafer. This allows wafer test results to be reviewed during any analysis of a field failure — whether or not a chip has an ECID.
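
Conceptually, such a trace record ties each die’s package position to its source wafer and wafer-map coordinates. The sketch below is a hypothetical data structure, not the SEMI E142 schema; the names and IDs are invented:

```python
# Hypothetical sketch of E142-style traceability: record where each die
# sits in the package (x, y, and z if stacked) together with its source
# wafer and wafer-map coordinates, so a field failure can be traced
# back to that die's wafer-sort results even without an ECID.

from dataclasses import dataclass

@dataclass(frozen=True)
class DieRecord:
    package_pos: tuple      # (x, y, z) position inside the package
    wafer_id: str           # source wafer (lot + wafer number, invented)
    wafer_xy: tuple         # die's (x, y) location on that wafer

assembly = {
    "logic": DieRecord((0, 0, 0), "LOT42-W07", (12, 33)),
    "hbm_0": DieRecord((1, 0, 0), "LOT81-W03", (4, 18)),
}

# A field failure on the HBM die points back to wafer LOT81-W03,
# die (4, 18), whose wafer-sort data can then be pulled for analysis.
print(assembly["hbm_0"].wafer_id)  # LOT81-W03
```

The key property is that the lookup works in both directions: a wafer-level excursion can also be mapped forward to every package that received a die from the affected region.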

In the end, the number of tests and inspections performed will depend on the impact on final yield. It’s an optimization process, and, at least for now, each multi-die package project must determine where the economic balance lies for that project. Over time, best practices and experience will make that simpler.

—Anne Meixner contributed to this story.


