Systematic Yield Issues Now Top Priority At Advanced Nodes

Pattern recognition, machine learning, and real-time analysis are needed to root out systematic defects.


Systematic yield issues are supplanting random defects as the dominant concern in semiconductor manufacturing at the most advanced process nodes, requiring more time, effort, and cost to achieve sufficient yield.

Yield is the ultimate hush-hush topic in semiconductor manufacturing, but it’s also the most critical because it determines how many chips can be profitably sold. “At older nodes, by the time you would start volume production, the only yield issues you had to worry about were random defects,” said Andrzej Strojwas, CTO of PDF Solutions. “Most of the systematics would be eliminated and the parametric issues would be under control.”

That’s no longer the case. Strojwas explained that systematic defects and parametric variations are now present in early production, requiring aggressive yield management strategies to drive product yield to acceptable levels. “Sometimes there are problems with particular layout patterns,” he said. “For example, on interconnect levels, you may have issues with metal islands that cause catastrophic failures like shorts or opens. Unfortunately, these are happening despite the fact that a lot of attention is paid to optical proximity correction (OPC) to make sure that the structures are printed very close to the design intent.”

Fig. 1: Analysis to address design systematics for yield ramp. Source: ASMC

GlobalFoundries recently implemented a yield improvement methodology during volume ramp of its 14nm finFET mobile products with a two- to three-year lifetime. [1] The company said that systematic defects arising from interactions among process, design, and layout dominated early yield loss. The methodology (see figure 1) combines customer GDS files with an adapted design-to-silicon flow that assesses yield loss based on weak points or defects from historic learning. Design features provide input to pattern matching that counts the occurrences of weak points (WPs) or pinch points captured during process development. The design schematics rank WPs based on yield impact and feed that back into the new product introduction (NPI) process setup. Most critically, comprehensive process characterization on systems is performed during NPI setup using in-line CD, brightfield inspection, e-beam scan, automatic pattern inspection, and process window qualification. Fixes are implemented based on these findings, then validated on wafers.
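The counting and ranking step can be pictured with a minimal Python sketch. It assumes a pattern matcher has already emitted one name per weak-point hit in the layout. The pattern names, the per-occurrence impact values, and the `rank_weak_points` helper are all hypothetical and are not taken from the GlobalFoundries paper.

```python
from collections import Counter

# Hypothetical weak-point library: pattern name -> assumed yield loss per occurrence.
# These names and numbers are illustrative only.
WEAK_POINT_IMPACT = {
    "ca_ca_tight_space": 0.004,
    "metal_island": 0.002,
    "via_pinch": 0.001,
}

def rank_weak_points(matched_patterns, impact=WEAK_POINT_IMPACT):
    """Count weak-point hits in a design and rank them by estimated yield impact.

    matched_patterns: iterable of pattern names emitted by a pattern matcher.
    Returns (name, count, score) tuples sorted by score, highest first.
    Names that are not in the weak-point library are ignored.
    """
    counts = Counter(p for p in matched_patterns if p in impact)
    ranked = [(name, n, n * impact[name]) for name, n in counts.items()]
    return sorted(ranked, key=lambda row: row[2], reverse=True)

# Example: hits reported for one (made-up) product layout.
hits = (["via_pinch"] * 10 + ["ca_ca_tight_space"] * 4
        + ["metal_island"] * 3 + ["not_in_library"])
ranking = rank_weak_points(hits)
for name, count, score in ranking:
    print(f"{name:20s} count={count:3d} score={score:.3f}")
```

In this toy scoring, a rare pattern with a high per-occurrence impact can outrank a common but benign one, which is the kind of prioritization the ranking step is meant to provide.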

In one example, GlobalFoundries engineers identified a middle-of-line contact-contact short weak point, “which is related to one 7.5T standard cell with marginal small space between double patterning contact (CA) – CA1 and CA2 with wafer edge die loss (>5%)” (see figure 2). Based on the design layout analysis and pattern matching, an OPC fix was implemented to widen process margins and boost yield, as shown in figure 2g. “CD check confirmed the full process window.”

Fig. 2: Contact-contact short confirmed by optical (a), layout (b, e), and e-beam (d) was rectified with OPC and validated with larger spacing (f) and wafer edge yield gain (g). Source: IEEE ASMC

3D metrology and yield learning
One of the most significant recent changes has been the increase in 3D inspection and metrology learning. The adoption of finFET and nanosheet transistors is happening on the front end, while advanced packaging requires 3D metrology on the back end.

“The key challenge that we see is really the 3D complexity,” said Shay Wolfling, CTO of Nova. “So this started in 3D NAND with hundreds of layers — and not one deck, but two and three decks. Customers are interested in not just CD, but top CD, middle CD and bottom CD — multiple parameters along the profile. The same is true for logic with stacked nanosheets.”

Three-dimensional measurement and process control of nanosheet transistors, for instance, gave rise to a new tool, a vertical traveling spectrometer, which Wolfling describes as adding interferometry on top of traditional OCD capabilities (scatterometry). The new technique adds phase information to reflectivity measurements, which significantly improves the accuracy of cavity and spacing measurements in nanosheet transistors.

In FEOL, both optical and e-beam techniques are being used. “With gate-all-around at the transistor level, you’re introducing more complexity in front-end-of-line defects,” said Matt Knowles, director of product management for Siemens EDA’s Tessent Group. “Optical inspection is done and e-beam inspection is being applied, but the three-dimensional nature of the gate is introducing significant challenges.”

Specifically, Knowles highlights the increasing impact from layout pattern systematic defects. “This is an area that people have been challenged with for a number of years, but at the advanced nodes, pattern complexity is higher and these defects can make up a few percent of yield. So we’re combining the machine learning from our YieldInsights YMS with PDF Solutions’ pattern engine to solve some of those problems.”

Pattern signatures
During production, engineering teams look for actionable data they can use to quickly improve yield or reduce the impact of yield limiting events. Tool and processing problems often are captured in wafer-level patterns, which can be modeled and automatically identified by software programs.

Machine learning in yield management systems can analyze wafer-level spatial patterns. For example, engineers at SkyWater Technology and Onto Innovation implemented an ML-based spatial pattern recognition (SPR) engine to root out systematic yield issues due to process or tool marginalities. The engine proactively generates Paretos of high-impact steps to more efficiently identify causes of yield-limiting events. [2] “The semiconductor industry is not new to adopting SPR,” wrote David Gross of SkyWater Technology (since moved to Siemens Digital Industries Software). “However, efficient use of SPR results to quicken determination of root cause and corrective actions is still a challenge.”
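A Pareto of high-impact steps is, at its core, a sorted tally with a running cumulative share. The sketch below assumes each flagged wafer has already been attributed to a single process step. The step names and the `pareto` helper are hypothetical, and the attribution itself (the hard part) is outside its scope.

```python
from collections import Counter

def pareto(step_per_flagged_wafer):
    """Build Pareto rows from one process-step label per flagged wafer.

    Returns (step, count, cumulative_percent) tuples, largest contributor first.
    """
    counts = Counter(step_per_flagged_wafer)
    total = sum(counts.values())
    rows, running = [], 0
    for step, n in counts.most_common():
        running += n
        rows.append((step, n, 100.0 * running / total))
    return rows

# Example: ten flagged wafers attributed to made-up step names.
flagged = ["etch_3"] * 6 + ["litho_2"] * 3 + ["cmp_1"]
rows = pareto(flagged)
for step, n, cum in rows:
    print(f"{step:8s} {n:2d} wafers  cumulative {cum:5.1f}%")
```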

Fig. 3: Results of unknown pattern learning over three months. Source: IEEE ASMC

The yield improvement methodology begins with ML-based auto discovery of patterns based on months of production data (see figure 3). Engineers add pattern samples to a library, while library and recipe setting combinations are used to improve pattern recognition across multiple products and layers. In production, as wafer inspection, metrology, and probe data are fed into the yield management system, the SPR engine detects and classifies wafers with spatial signatures. Based on the engineer’s monitoring criteria, automatic actions are taken, such as e-mail alerts, automatic reports, etc.

The SPR engine identified three unknown patterns, including lithography striping, two edge bands, and a center cluster. “Litho striping could be due to reticle contamination or an inspection recipe sensitivity issue. However, performing a drill-down analysis such as Repeater and Event Reports isolated the root cause quickly.” Dashboards of yield-limiting patterns in production (see figure 4) were found to increase engineering productivity (+25%) by highlighting process tools that contribute to yield-limiting alarm conditions, enabling rapid reaction and recovery.
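To make the idea of a spatial signature concrete, here is a deliberately simple, rule-based stand-in for the ML classifier. It labels a wafer by where its failing dies sit radially. The thresholds, the label names, and the `classify_signature` function are assumptions for illustration. The SPR engine described in the paper learns patterns from production data rather than using fixed rules like these.

```python
import math

def classify_signature(fail_xy, wafer_radius,
                       edge_frac=0.8, center_frac=0.3, min_share=0.6):
    """Label a wafer by the radial position of its failing dies.

    fail_xy: list of (x, y) failing-die centers, with the wafer center at (0, 0).
    A wafer is "edge_band" if at least min_share of failures lie beyond
    edge_frac of the radius, and "center_cluster" if at least min_share lie
    within center_frac of the radius. All thresholds are assumed values.
    """
    if not fail_xy:
        return "none"
    radii = [math.hypot(x, y) / wafer_radius for x, y in fail_xy]
    edge_share = sum(r >= edge_frac for r in radii) / len(radii)
    center_share = sum(r <= center_frac for r in radii) / len(radii)
    if edge_share >= min_share:
        return "edge_band"
    if center_share >= min_share:
        return "center_cluster"
    return "unclassified"

# Example on a 300mm wafer (radius 150mm) with made-up failing-die positions.
edge_wafer = [(140, 0), (0, 145), (-138, 10), (10, 10)]
center_wafer = [(5, 5), (-10, 0), (0, 20), (100, 100)]
print(classify_signature(edge_wafer, 150))
print(classify_signature(center_wafer, 150))
```

Signatures such as litho striping or pinwheels are not radially symmetric, so they would need other features (for example, reticle-field position or angle) that this sketch does not attempt.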

Fig. 4: Regular monitoring of defect patterns using dashboards improved engineering productivity by an estimated 25% by combining equipment study information with AOI images. Source: IEEE ASMC

Other case studies by SkyWater identified all wafers impacted by a particular spatial pattern, eliminating the need for time-consuming manual classification. The SPR engine also catches known failure modes before full impact, such as etch flakes on the wafer in a pinwheel pattern, whose real-time alarm notifies engineers that a component needs replacement. The team concluded the SPR engine helps outline excursion scope and correlate inline signatures to yield-limiting defects.

Actionable data
Better data analysis programs contribute to greater utilization of defect and fault data. “One of the mega-trends we’re seeing is that in order to have defect data more readily available and actionable, people are running more volume diagnosis in production,” said Knowles. “In the past, only a couple of big companies collected and analyzed all their failure data. Other customers did it on an ad hoc basis for new product introductions or certain yield issues. But now, all customers are being more proactive. They have to be, because they cannot suffer yield excursions going on for a week or longer.”

Defect isolation is an ongoing challenge, particularly when defects are hidden by overlying process layers (see figure 5). Typically, defect inspection tools scan wafers rapidly for defects, then a separate review tool verifies each defect and classifies it by type. Failure analysis additionally confirms the failure mode.

Frank Chen, director of applications and product management at Bruker, points out that sampling routines are changing. Old routines are insufficient at catching all killer defects, particularly in automotive and server chips, with sub-ppm acceptable defectivity levels. As a result, companies must invest in tools to drive greater yield. He refers to a paradigm shift from the old sampling strategies that yielded ppm defect rates on a handful of chips to modern automotive systems with thousands of devices with compounded failure rates. “In server units or high-performance AI chips, the return-on-investment is more compelling. But with outsourced manufacturing, anything that impacts productivity, like additional inspection steps, requires a holistic industry solution.”
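The compounding effect is simple arithmetic. If each device independently has defect probability p, a system built from n devices contains at least one defective device with probability 1 - (1 - p)^n. The sketch below assumes independent failures and a system of 5,000 devices. Both are assumptions chosen for illustration, not figures from Bruker.

```python
import math

def system_failure_probability(device_defect_rate, device_count):
    """Probability that at least one of device_count independent devices is defective."""
    # 1 - (1 - p)^n, computed with log1p/expm1 so it stays accurate for tiny p.
    return -math.expm1(device_count * math.log1p(-device_defect_rate))

ASSUMED_DEVICE_COUNT = 5000  # hypothetical number of devices per system

for label, rate in [("1 ppm", 1e-6), ("100 ppb", 1e-7)]:
    p_sys = system_failure_probability(rate, ASSUMED_DEVICE_COUNT)
    print(f"{label} per device -> {p_sys * 100:.3f}% of systems affected")
```

Under these assumptions, a per-device rate of 1 ppm translates into roughly half a percent of systems containing a defective device, which is why per-device targets move toward the ppb range as device counts grow.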

Fig. 5: 100% X-ray defect inspection (left row of images) is followed by review (middle) and cross-sectional verification (right), to isolate hidden defects in HBM that optical techniques can miss. Source: Bruker

Bruker conducted a study of sampling rates and automated feedback for X-ray inspection and review after die-attach to determine optimal sampling rates, yield gains, and excursion duration (see figure 6). “Our analysis for memory controller failures showed you could dramatically reduce excursion time by going to 30% sampling, which produced about a 7-day excursion. Going all the way to 100% sampling improves yield by 1.7% and shortens the excursion to 2 days.”
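Why higher sampling shortens an excursion, and why the benefit tapers off, can be seen in a toy detection-delay model. This is not Bruker's analysis and does not reproduce its numbers. It assumes that, during an excursion, each unit is defective with a fixed probability and is inspected with probability equal to the sampling rate, independently. The defect rate used below is an arbitrary assumption.

```python
def expected_units_to_detect(sampling_rate, defect_rate):
    """Mean number of units processed until a defective unit is first inspected.

    Each unit is both defective and sampled with probability
    sampling_rate * defect_rate, so the wait is geometric with that parameter.
    """
    if not (0 < sampling_rate <= 1 and 0 < defect_rate <= 1):
        raise ValueError("rates must be in (0, 1]")
    return 1.0 / (sampling_rate * defect_rate)

ASSUMED_DEFECT_RATE = 0.01  # hypothetical fraction of bad units during an excursion

delays = {s: expected_units_to_detect(s, ASSUMED_DEFECT_RATE)
          for s in (0.05, 0.30, 1.00)}
for s, units in delays.items():
    print(f"sampling {s:4.0%}: about {units:6.0f} units before first detection")
```

In this model the delay scales as one over the sampling rate, so the step from low sampling to 30% removes far more delay than the step from 30% to 100%.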

Rapid metrology process feedback is key to driving defect rates to the 100 ppb level. “Excursion monitoring with high sampling rate isn’t enough,” Chen said. “You actually need to give active process control feedback to achieve that level of yield and reliability.”

Fig. 6: Increasing sampling rate to 30% significantly reduces excursion time from many weeks to 7 days. Eventually, sampling rate reaches a point of diminishing returns. Source: Bruker

One of the ways engineers identify actionable data is through fault detection and classification (FDC) programs. FDC uses sensor and monitor data on process tools to classify faults automatically. “We’re doing more and more FDC, on process tools and on metrology systems,” said Mike McIntyre, director of software product management at Onto Innovation. “For instance, there’s a best-known method of putting FDC on your metrology tools, which enables fleet matching, so the ability to make sure they’re behaving consistently day to day. And then you don’t have to rely on calibration standards to make sure the metrology measurement is under control. So we’re bringing in that learning, and we can start looking at the inherent signals on the tools.”
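One simple way to picture fleet matching is a robust outlier check: compare each tool's average sensor reading with the fleet median and flag tools that sit too far away. The sketch below uses a median-absolute-deviation score. The tool names, readings, threshold, and the `unmatched_tools` helper are hypothetical, and this is not a description of Onto Innovation's method.

```python
from statistics import mean, median

def unmatched_tools(readings_by_tool, threshold=3.5):
    """Flag tools whose mean sensor reading is a robust outlier against the fleet.

    readings_by_tool: dict mapping tool ID to a list of readings of one sensor.
    Uses the median absolute deviation (MAD) so a single bad tool does not
    distort the reference the others are compared against.
    """
    tool_means = {tool: mean(vals) for tool, vals in readings_by_tool.items()}
    fleet_median = median(tool_means.values())
    mad = median(abs(m - fleet_median) for m in tool_means.values())
    if mad == 0:
        # Degenerate case: most tools agree exactly, so any deviation stands out.
        return [t for t, m in tool_means.items() if m != fleet_median]
    # 0.6745 scales the MAD so the score is comparable to a z-score for normal data.
    return [t for t, m in tool_means.items()
            if 0.6745 * abs(m - fleet_median) / mad > threshold]

# Example: five made-up metrology tools reporting the same sensor signal.
fleet = {
    "CD-SEM-1": [10.0, 10.1, 9.9],
    "CD-SEM-2": [10.2, 10.1, 10.3],
    "CD-SEM-3": [9.9, 10.0, 9.8],
    "CD-SEM-4": [12.0, 12.1, 11.9],
    "CD-SEM-5": [10.1, 10.0, 10.2],
}
print(unmatched_tools(fleet))
```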

But the number of sensors on tools is growing, and so is the cost of maintaining and analyzing all the data from process monitors. There is also a need to track defects from inception throughout the product lifecycle.

Data analytics and lifecycle management
There are two aspects to data management in fabs and in assembly and testing facilities: the historical data and the live data produced from everyday operations. “At a macro level, with the advanced nodes, there’s really a step function change in the amount of data that needs to be analyzed,” said Guy Cortez, senior staff product marketing manager of Silicon Lifecycle Management at Synopsys. “The tools have to be able to handle the architecture well enough to perform volume analysis, tracking live and historical data to understand yield issues to empower real time action.”

The data management challenges in semiconductor processing can be simplified, especially when viewed in terms of what really matters — the semiconductor material, or chips. “At Onto we look at data and we basically find it all fits into one of three vectors,” McIntyre said. “It’s either a vector associated with the material that’s being produced, a vector associated with the equipment that’s executing a function on the material, or it’s associated with the process being applied to a tool that’s impacting the material. So all data fits into one of those three buckets. That helps us organize the data for analysis.”
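The three-bucket view maps naturally onto a tagged record. The sketch below is one possible way to express it in Python. The `Vector` enum, the `Record` fields, and the sample values are all hypothetical and do not describe Onto Innovation's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum

class Vector(Enum):
    MATERIAL = "material"    # lots, wafers, dies being produced
    EQUIPMENT = "equipment"  # tools executing a function on the material
    PROCESS = "process"      # recipes or steps applied on a tool

@dataclass(frozen=True)
class Record:
    vector: Vector
    key: str      # wafer ID, tool ID, or recipe name (illustrative)
    name: str     # what was measured or logged
    value: float

def bucket(records):
    """Group records by the vector they belong to."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.vector].append(rec)
    return grouped

# Example with made-up records from one processing step.
records = [
    Record(Vector.MATERIAL, "W01", "probe_yield", 0.93),
    Record(Vector.MATERIAL, "W02", "probe_yield", 0.88),
    Record(Vector.EQUIPMENT, "ETCH-07", "chamber_pressure", 12.4),
    Record(Vector.PROCESS, "poly_etch_v3", "etch_time_s", 42.0),
]
grouped = bucket(records)
for vector, recs in grouped.items():
    print(vector.value, [r.key for r in recs])
```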

From the design side, memory design and manufacturing have benefited from self-repair mechanisms. Now, to some extent, similar techniques are being applied to logic devices. “In design for manufacturing, we’re expanding BiST into self-repair, as we did for memories, and then for other blocks. Now we’re doing it for interface IP, reconfiguring, calibration, streaming, and so on into field improvement techniques,” said Yervant Zorian, chief architect and fellow at Synopsys. “But linking tools to each other is something newer that we’re doing in lifecycle management, because with the sensor monitors, repairing systems inside the chips can extend all the way to the analytics in the cloud. So we are not looking at the analytics separately from on-chip resources. We are connecting and correlating them with each other, and optimizing.”

Systematic defects clearly dominate at the most recent process nodes, driving the need for more sophisticated yield management programs involving spatial pattern recognition, real-time reporting and defect identification, and automated recommendations for tool problems. Machine learning and data analysis programs are helping to speed root cause analysis in yield ramp at new nodes and during production.


  1. J. Yin, et al., “Yield Improvement Methodology with Addressing Design Schematics during Production Ramp-up,” 2022 IEEE ASMC, 2022.
  2. D. Gross, K. Gramling, P. Bachiraju, “Fab Fingerprint for Proactive Yield Management,” 32nd Annual SEMI Advanced Semiconductor Manufacturing Conference (ASMC), 2021, doi: 10.1109/ASMC51741.2021.9435686.
