Geo-Spatial Outlier Detection

Using position to find defects on wafers.


Comparing die test results with other die on a wafer helps identify outliers, but combining that data with the exact location of an outlier offers a much deeper understanding of what can go wrong and why.

The main idea in outlier detection is to find something in or on a die that is different from all the other dies on a wafer. Doing this in the context of a die’s neighbors has become easier with state-of-the-art yield and test management data analytics platforms, but it still can get complicated. Even the definition of a neighbor can vary.

Wafer spatial variation has been used for some time to identify and decipher yield issues, but primarily for post-mortems of failures in the field. Product and quality engineers increasingly are applying it to pass/fail test decisions, particularly in safety-critical or mission-critical applications, because simple outlier detection techniques based on part average testing (PAT) lack the localization needed to effectively balance the yield/quality/cost triangle.

Techniques beyond PAT rely on multiple measurements, arithmetic operations on test results to focus the pass/fail distinction, and geo-spatial relationships. But geographical location and associated test results also require additional engineering resources, so at advanced nodes and in applications where quality and reliability are essential, costs are going up.

“Each time you make a major shift in technology — such as aluminum-to-copper interconnects, and conventional to high-K metal gates — you have a new set of things crop up that you haven’t seen before,” said Ken Butler, strategic business creation manager at Advantest America. “With each major technology shift and the smaller geometries, it became more and more necessary to move away from simple statistics because they just didn’t cut it anymore.”

Motivated by identifying downstream failures, product engineers have looked to outlier-based detection techniques at wafer test to screen burn-in failures. And when developing screens for test escapes, they naturally have turned to these outlier detection techniques. This is standard practice for automotive chipmakers, which consistently apply outlier test techniques, and they often have pioneered new methods. Now, other industry sectors are following with a more proactive approach to outlier detection techniques.

“While outlier detection may be an effective response to containing a test escape, we see more customers designing outlier detection into their test process earlier in the product lifecycle to achieve higher outgoing quality in many market segments — not just automotive, medical and aerospace,” said Greg Prewitt, director of Exensio solutions at PDF Solutions.

Die position: x, y, and z
With geo-spatial outlier detection techniques, analysis takes place after wafer test because the test results of a die and its neighbors all need to be considered in making the pass/fail decision. This requires additional computation to be performed, and yield/test data analytic solutions support these computations. It may be an IDM’s own system, or a third-party solution that an IDM, foundry, or fabless company would use. Depending on the specific technique, the defined neighborhood varies. Often these make intuitive sense. But the definition of “neighbor” can get complicated at advanced CMOS process nodes, which necessitate a far subtler definition of “neighborhood.”

Despite the industry’s efforts to have uniform device processing across a wafer, the nature of semiconductor manufacturing processes produces geographical patterns that are reflected in multiple manufacturing metrics. Edge dies tend to have lower yield than dies in the center of a wafer. The spinning of photoresist results in radial zones. During the litho process, there can be subtle unevenness of focus across the dies in a multi-die reticle.

With smaller feature sizes, these subtle effects manifest more readily and show up in the patterns of failing die at wafer test and in subsequent manufacturing steps. Engineers now use these patterns to make decisions regarding device pass/fail.

Fig. 1: Concentric circles and radial patterns on a wafer. Source: Semiconductor Engineering/Anne Meixner
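
As a rough illustration of how radial patterns surface in test data, the sketch below bins dies into concentric zones by distance from the wafer center and reports pass yield per zone. The function names, the grid-coordinate convention, the three-zone split, and the synthetic wafer map are all assumptions made for this example, not taken from any particular analytics platform.

```python
import math
from collections import defaultdict

def radial_zone(x, y, center, wafer_radius, n_zones=3):
    """Zone index for a die at grid (x, y): 0 is the center, n_zones - 1 the edge."""
    r = math.hypot(x - center[0], y - center[1])
    return min(int(r * n_zones / wafer_radius), n_zones - 1)

def yield_by_zone(wafer_map, center, wafer_radius, n_zones=3):
    """wafer_map maps (x, y) -> True (pass) or False (fail). Returns {zone: yield}."""
    passed, total = defaultdict(int), defaultdict(int)
    for (x, y), ok in wafer_map.items():
        zone = radial_zone(x, y, center, wafer_radius, n_zones)
        total[zone] += 1
        passed[zone] += int(ok)
    return {zone: passed[zone] / total[zone] for zone in sorted(total)}

# Hypothetical 9 x 9 die grid clipped to a circle of radius 4.5 die pitches.
# Dies farther than 3.5 pitches from the center are made to fail.
CENTER, RADIUS = (4, 4), 4.5
wafer_map = {
    (x, y): math.hypot(x - 4, y - 4) <= 3.5
    for x in range(9) for y in range(9)
    if math.hypot(x - 4, y - 4) <= RADIUS
}
zone_yield = yield_by_zone(wafer_map, CENTER, RADIUS)
print(zone_yield)
```

On this synthetic map the two inner zones yield 100%, while the outermost zone shows the edge loss. Real data would call for zone boundaries chosen from the process and the die size rather than an even split.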

In testing for defective devices, the defects often have been classified into random and systematic. The geographical nature of the semiconductor process and increased management of defect density have shifted the focus to systematic defects, with geo-spatial relationships on the wafer and the manufacturing processes playing a role.

“In general, these sources of defects almost never appear in a truly random manner,” said Dirk de Vries, director of research and development for silicon lifecycle management at Synopsys. “So process variation from the manufacturing exists, and almost everything has a spatial gradient. That’s true for parametric properties like layer thicknesses or line widths. They typically have fairly smooth gradients over the wafer, meaning the measured properties of a die will have a certain level of predictive value for the properties of its neighbors on the wafer. You could say, ‘Yes, but they’ve got random defects, and random has no predictive value.’ It’s not quite as simple as that, because if you look at wafer manufacturing defects, there exists a mechanism of producing the defects. For instance, it can be flaking from the wafer edge, or in the source of the plasma. The point is there are sources of defects, and they almost never appear in a truly random manner.”

The geographical patterns can be viewed in data analytic platforms, and they can be used for yield management to identify problematic tools or tool combinations. A series of engineering studies by Intel engineers (1999 to 2005) used wafer test data, x-y location, and electronic chip identification (ECID) to study the relationship between wafer and reliability defect density. Having ECID facilitated data analysis across multiple test steps. This enabled them to relate a die’s wafer test results, viewed in terms of lot, wafer, x-y location, local region, and neighbors, to that die’s subsequent behavior at final test after burn-in, and to find distinct patterns. For the local region (a.k.a. neighborhood), they looked at the die within a 5 x 5 region and calculated yield numbers for the die marked N, D, and T.

Fig. 2: Neighborhood based upon x-y location, with N, D, and T locations identified. Source: Semiconductor Engineering/Anne Meixner

In their analysis, the Intel engineers noted wafer-to-wafer variation was twice the size of lot-to-lot variation. “Traceability proved to be a powerful tool,” they noted, “for revealing characteristics within wafer patterns of such failures. This was especially true for subtle signals buried within the production burn-in data. Failure analysis established that these sub-populations were invariably differentiated by new systematic failure modes or defect distributions.”

Observed in 1999, this level of localization speaks to the complexity of the CMOS process at 0.25 microns. With today’s advanced process nodes, these systematic failures have increased.

Outlier detection algorithms based upon neighborhood
These outlier detection techniques narrow the scope of comparison by using the geo-spatial relationship between a die and its neighbors. Radial position has an impact on defectivity, and thus yield. For localizing test decisions, the x-y neighborhood has been the predominant concept, followed by relationships in the z direction.

Localizing dynamic PAT to smaller neighborhoods, like 5 x 5 or 7 x 7, enables engineers to detect the subtle differences from these systematic failure modes. By doing so, engineers can reduce both false negatives and false positives.
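
A minimal sketch of the idea follows, assuming robust limits of the form median plus or minus k times a robust sigma (IQR / 1.35), which is one common way to set dynamic PAT limits. The function names, the value of k, the minimum neighbor count, and the synthetic parametric map are assumptions for illustration. The example plants one die that sits inside the wafer-level limits but outside the limits computed from its own 5 x 5 neighborhood.

```python
import statistics

def robust_limits(values, k=6.0):
    """Median +/- k * robust sigma, with robust sigma estimated as IQR / 1.35."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    sigma = (q3 - q1) / 1.35
    center = statistics.median(values)
    return center - k * sigma, center + k * sigma

def neighborhood_outliers(param_map, window=5, k=6.0, min_neighbors=8):
    """Flag dies that fall outside limits computed from their window x window neighbors."""
    half = window // 2
    flagged = []
    for (x, y), value in param_map.items():
        neighbors = [
            param_map[(x + dx, y + dy)]
            for dx in range(-half, half + 1)
            for dy in range(-half, half + 1)
            if (dx, dy) != (0, 0) and (x + dx, y + dy) in param_map
        ]
        if len(neighbors) < min_neighbors:
            continue  # too few neighbors for a meaningful local limit
        low, high = robust_limits(neighbors, k)
        if not low <= value <= high:
            flagged.append((x, y))
    return flagged

# Hypothetical 10 x 10 map: a gentle left-to-right gradient plus a small ripple.
param_map = {
    (x, y): 10.0 + 0.1 * x + 0.05 * ((x + 2 * y) % 3 - 1)
    for x in range(10) for y in range(10)
}
param_map[(2, 5)] = 12.0  # planted die: unusual for its region, not for the wafer

wafer_low, wafer_high = robust_limits(list(param_map.values()))
inside_wafer_limits = wafer_low <= param_map[(2, 5)] <= wafer_high
local_flags = neighborhood_outliers(param_map)
print(inside_wafer_limits, (2, 5) in local_flags)
```

Because the wafer-level spread includes the gradient, its limits are wide enough to pass the planted die. The 5 x 5 window sees far less of the gradient, so its limits are tighter and the die stands out.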

There have been two approaches to comparing a die to its neighbors — good die in a bad neighborhood (GDBN), and bad die in a good neighborhood. Over the past two decades, engineers from multiple companies, including LSI Logic, Intel, and TI, have published case studies that justify these seemingly draconian decisions.

Fig. 3: Good die in a bad neighborhood based upon x-y location. Source: Semiconductor Engineering/Anne Meixner

GDBN is straightforward. If a die passes all tests, yet a number of its neighbors have been marked bad, then the good die becomes suspect. Such dies can be flagged as outliers and then either failed at wafer sort or marked for additional testing that other good dies might not receive.

Fig. 4: Good die in a bad neighborhood. Source: National Instruments
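
The GDBN rule can be sketched in a few lines. The version below counts failing dies among the eight adjacent positions and flags any passing die whose count reaches a threshold. The function name, the threshold of four, and the example map are hypothetical, and production rules often weight neighbors or use larger windows.

```python
def gdbn_suspects(bin_map, threshold=4):
    """Return passing dies with at least `threshold` failing dies among their 8 neighbors.

    bin_map maps (x, y) -> True (pass) or False (fail). Positions missing from
    the map (off the wafer) are not counted as failures.
    """
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    suspects = []
    for (x, y), passed in bin_map.items():
        if not passed:
            continue  # only good dies can be "good die in a bad neighborhood"
        bad = sum(1 for dx, dy in offsets if bin_map.get((x + dx, y + dy)) is False)
        if bad >= threshold:
            suspects.append((x, y))
    return sorted(suspects)

# Hypothetical 5 x 5 map with a cluster of failures around one passing die.
bin_map = {(x, y): True for x in range(5) for y in range(5)}
for pos in [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1)]:
    bin_map[pos] = False

suspects = gdbn_suspects(bin_map)
print(suspects)  # [(2, 2)]
```

The die at (2, 2) passes its own tests but has six failing neighbors, so it is the only one flagged. What happens next (fail it, or route it to extra testing) is a policy decision outside this sketch.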

Bad die in a good neighborhood is a confusing term. Technically, the die is not bad, because it passed its tests, but it’s parametrically different from its neighbors.

“More customers are trying to get a higher-level quality, so they’re looking at outliers with respect to a die’s neighborhood. When you look at all the die around it, they’re expected to be parametrically similar,” said Carl Moore, yield management specialist at yieldHUB. “But sometimes a die’s parametric measurement may be off by a couple sigma, maybe still within the overall distribution. ‘Something is just not right about this, because it’s showing a different parametric value to everything around it.’”

In addition to looking in the x and y direction for a neighborhood, product engineers can look at specific die location across all wafers. “There are also methods such as ZPAT, where you can analyze in the z axis on a group of wafers. This is very useful in finding defects in masks where a single die may always fail or be an outlier,” said Moore.

Fig. 5: Multiple wafer maps stacked and failures in Z-direction. Source: Galaxy Semiconductor

The impact of a defective mask on yield seems obvious. The parametric outlier application is localized based on a sample size of 25 dies, one per wafer at a given x-y position (a lot typically has 25 wafers). Note that a 5 x 5 neighborhood in the x-y direction also offers 25 dies in which to look for an outlier. Basically, you’re stacking the wafers and looking for patterns in the z direction.
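
To make the stacking idea concrete, the sketch below handles the simpler pass/fail case: it computes, for each x-y position, the fraction of wafers in a lot on which that position failed, then lists the positions above a cutoff. A parametric variant would instead apply PAT-style limits to the values collected at each position. The function names, the 50% cutoff, and the synthetic lot are assumptions for illustration, not a description of any vendor's Z-PAT implementation.

```python
from collections import defaultdict

def z_stack_fail_fraction(wafer_maps):
    """Fraction of wafers on which each (x, y) position failed.

    wafer_maps is a list of dicts mapping (x, y) -> True (pass) or False (fail).
    """
    total, fails = defaultdict(int), defaultdict(int)
    for wafer_map in wafer_maps:
        for pos, passed in wafer_map.items():
            total[pos] += 1
            fails[pos] += 0 if passed else 1
    return {pos: fails[pos] / total[pos] for pos in total}

def z_stack_suspects(wafer_maps, max_fail_fraction=0.5):
    """Positions that fail on more than max_fail_fraction of the stacked wafers."""
    fractions = z_stack_fail_fraction(wafer_maps)
    return sorted(pos for pos, frac in fractions.items() if frac > max_fail_fraction)

# Hypothetical 25-wafer lot on a 4 x 4 grid: position (1, 2) fails on 20 wafers,
# and one unrelated failure appears at (0, 0) on a single wafer.
lot = []
for w in range(25):
    wafer_map = {(x, y): True for x in range(4) for y in range(4)}
    if w % 5 != 0:
        wafer_map[(1, 2)] = False
    lot.append(wafer_map)
lot[3][(0, 0)] = False

suspect_positions = z_stack_suspects(lot)
# Dies that passed at a suspect position are candidates for extra screening.
to_rescreen = [(w, pos) for w, m in enumerate(lot) for pos in suspect_positions if m[pos]]
print(suspect_positions, len(to_rescreen))  # [(1, 2)] 5
```

The isolated failure at (0, 0) does not trip the rule, while the repeating failure at (1, 2) does. The five dies that passed at that position are the ones a z-direction screen would treat as suspect.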

“We had Z-PAT in the marketplace as early as 2005,” said Wes Smith, CEO of Galaxy Semiconductor. “It was developed for a European tier-1 automotive supplier. This company was interested in exploring outlier techniques beyond traditional (even then) DPAT, and they were looking at various geo-spatial relationships, among them Z-PAT.”

Defining an electrical neighbor
The geo-spatial techniques discussed so far have strictly considered physical relationships. Yet even two decades ago, engineering teams pioneering these techniques recognized that the systematic nature of CMOS semiconductor manufacturing influences the definition of neighborhood. Instead of using a physical neighborhood, they recommended selecting a neighborhood based on electrical test data.

“A fixed neighborhood selection such as the eight die closest to the position x, y works well in practice when the data pattern is smoothly varying,” wrote an engineering team from Portland State University and LSI Logic in a 2001 International Test Conference paper. “Smooth contours have been observed many times. However, stepper patterns have also been observed and they are not smooth but systematic. They impose a checkerboard effect across the wafer.”

Fig. 6: Stepping pattern checkerboard effect. Source: Semiconductor Engineering/Anne Meixner

After recognizing this pattern in IDDQ tests, the team recommended using a data-driven approach to defining the neighborhood and the limits to be applied, which they called location averaging. They illustrated its effectiveness with IDDQ measurements. Their technique also included using residuals of measurements. Nearest neighbor residual (NNR) combines geo-spatial relationships with an arithmetic modification of the test parameter. NNR essentially defines a neighborhood based upon similar value distributions, and those distributions are built from residuals rather than raw test measurements.
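
The sketch below is written in the spirit of a nearest neighbor residual and is not the published algorithm. It assumes neighbors are chosen as the nearby positions whose values track the target position most closely across a few reference wafers, and that the residual is the measured value minus the median of those neighbors. The function names, the window size, the neighbor count, and the synthetic checkerboard data are all hypothetical.

```python
import statistics

def select_neighbors(reference_wafers, target, window=2, k=4):
    """Pick the k positions near `target` whose values track it best on reference wafers."""
    tx, ty = target
    scored = []
    for pos in reference_wafers[0]:
        dx, dy = abs(pos[0] - tx), abs(pos[1] - ty)
        if pos == target or dx > window or dy > window:
            continue
        mismatch = statistics.fmean(abs(w[pos] - w[target]) for w in reference_wafers)
        scored.append((mismatch, dx + dy, pos))  # ties go to the closer position
    scored.sort()
    return [pos for _, _, pos in scored[:k]]

def residual(wafer, target, neighbors):
    """Measured value minus the median of the chosen neighbors."""
    return wafer[target] - statistics.median(wafer[p] for p in neighbors)

def checkerboard_wafer(offset):
    """Synthetic stepper pattern: alternating dies differ by 1.0 (arbitrary units)."""
    return {(x, y): 5.0 + (x + y) % 2 + offset for x in range(5) for y in range(5)}

reference_wafers = [checkerboard_wafer(o) for o in (0.0, 0.1, 0.2)]
target = (2, 2)
data_driven = select_neighbors(reference_wafers, target)
fixed_eight = [(x, y) for x in (1, 2, 3) for y in (1, 2, 3) if (x, y) != target]

wafer = checkerboard_wafer(0.05)
wafer[target] += 0.4  # planted excess on the die under test

nnr_data_driven = residual(wafer, target, data_driven)
nnr_fixed = residual(wafer, target, fixed_eight)
print(round(nnr_data_driven, 3), round(nnr_fixed, 3))
```

On this synthetic checkerboard, the data-driven neighbors all come from the same phase of the pattern as the target, so the residual recovers the planted excess of about 0.4. The fixed eight-die neighborhood straddles both phases, which pulls its median up and turns the residual slightly negative, masking the die.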

Another source of data on a die’s parametric performance is on-die measurements, and these too can be used to define an electrical neighborhood within a die. That enables refinement of the neighborhood.

“For the geo-spatial techniques to be effective, there is a strong assumption on the process variation in both the x-y and z directions,” said Alex Burlak, vice president of test and analytics at proteanTecs. “In advanced process nodes, process variation within the chip might be significant and enhanced further across ICs (neighborhood) or wafers (z direction), making the geo-spatial techniques less effective. Therefore, a more effective technique is to create an expected base line per chip (i.e. taking a ‘personalized medicine’ approach), using machine learning and advanced analytics applied to parametric data generated by on-die universal chip telemetry (UCT) monitors. You can look at it as PAT per chip rather than lot, wafer, neighborhood.”

Adoption of test screens based upon wafer position by the wider product engineering community has increased significantly in the last decade. The availability of third-party yield/test management systems makes it easier for fabless companies and small IDMs to use such techniques.

“The semiconductor industry always emphasizes the importance for quality and reliability of devices,” said Prasad Bachiraju, director for sales and customer solutions at Onto Innovation. “Analytics platforms with supply chain integration infrastructure have enabled fabs to perform rules and statistical die binning based on final test data. Utilizing the wafer context and neighborhood of the die with respect to the source wafer has helped to detect test escapes and improve overall reliability of the chips.”

Geo-spatial based outlier detection techniques enable engineers to localize the game of “one of these things is not like the others.”

“You have one set of performance targets that you want all die to meet, for example 100 micro amps of leakage. But because of spatial variation, they don’t always hit the mark,” said Advantest’s Butler. “So, in test you ask the question ‘When are they different and by how much are they different from nearby die?’ All these techniques we are talking about are premised on wafer position context. That’s why they work so well.”

Related Stories

Chasing Test Escapes In IC Manufacturing

Part Average Tests For Auto ICs Not Good Enough

Using Analytics To Reduce Burn-in

Adaptive Test Gains Ground
