AI Models Transform Defect Inspection And Review, But Can Fail To Scale

Majority of AI initiatives failing; synthetic data gaining traction due to limited real-world data.

Key Takeaways:

  • AI plays a role in improving defect capture rate and distinguishing between yield-killing and nuisance defects.
  • New developments in wafer edge inspection are proving essential to bonded wafer yields.
  • More than 70% of AI initiatives stall after pilot implementation, but some pitfalls can be avoided.

One of the brightest spots in AI use today is the industry’s ability to better capture the massive number of defect types occurring across hundreds of process steps, from lithography and patterning to assembly of multichip packages.

Engineers are targeting the highest-ROI projects, many of which are centered on yield improvement. Using these models, they can more easily distinguish real from nuisance defects and catch defects that were previously invisible. But scaling AI solutions from the pilot level to factory and enterprise levels is proving particularly tough, requiring higher data quality, stronger correlation, and more robust infrastructure to ensure all data points connect properly.

With AI models, it’s easier to account for new defect mechanisms. “Traditional algorithms are based on statistics, so you gather a lot of data and then, based on the distribution, set the nominal value and tolerance. If a value is outside that range, you call it a defect,” said Charlie Zhu, vice president of R&D at Nordson Test & Inspection. “But often there are multiple parameters in the programming. If you have five parameters and then add one, AI simply does a point-to-point addition where you identify the bad defects, and if you have enough variation in your training data it will learn by itself. Without the AI model, adding that parameter manually is a tedious, time-consuming process that often requires domain expertise to implement.”

Zhu noted that today’s AI algorithms are greatly improved over earlier models. “They are far more robust than the traditional machine vision algorithm, even compared to sophisticated machine learning algorithms like a support vector machine (SVM) or principal component analysis (PCA).”
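
The traditional statistical approach Zhu describes can be sketched in a few lines. This is a minimal illustration, not Nordson's implementation: the bump-height readings, the function names, and the 3-sigma multiplier are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def fit_tolerance(good_values, k=3.0):
    """Return (low, high) limits as nominal +/- k sample standard deviations."""
    nominal = mean(good_values)
    sigma = stdev(good_values)
    return nominal - k * sigma, nominal + k * sigma

def is_defect(value, limits):
    """Flag a measurement that falls outside the tolerance band."""
    low, high = limits
    return value < low or value > high

# Hypothetical bump-height readings (micrometers) from known-good parts.
reference_heights = [50.1, 49.8, 50.3, 49.9, 50.0, 50.2, 49.7, 50.0]
limits = fit_tolerance(reference_heights)

# Check three new readings against the band.
flags = [is_defect(h, limits) for h in [50.1, 51.5, 49.0]]
print(limits, flags)
```

Each additional parameter needs its own nominal value and tolerance, plus rules for how the parameters interact, which is the manual effort a learned model is meant to reduce.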

New processes, such as hybrid bonding, are emerging with their own set of defect signatures, particularly at the wafer edge. “AI is being used to detect and classify residual tungsten CMP defects. These particular tungsten defects were on the edge of the wafer and impossible to detect through traditional techniques, but AI for macro-level defect inspection captures them,” said Errol Akomer, applications director at Microtronic.

The reason wafer edge defects are a challenge goes back to how inspection has been performed on patterned wafers. “Traditional inspection techniques often depend on highly repetitive patterns and complete die references (i.e., die-to-die comparisons), making edge and partial die inspection challenging,” explained Woo Young Han, product marketing director at Onto Innovation. “Deep learning models can better generalize across incomplete or irregular structures, improving defect detection accuracy in partial die regions for wafer-to-wafer bonding applications.”

It’s important to note that machine learning models are often combined with physics-based models in semiconductor manufacturing environments. “Hybrid ML/physics models are superior to ML only because they are inherently more robust and can extrapolate beyond the training set well (like a physics-based model),” said Nick Keller, senior principal applications engineer, Onto Innovation.

In scalable analytics, one of the keys is the ability to handle “both the measurements, these millions of measurements per die, as well as the volume in high-volume production, which is parts, wafers, or lots,” said Ken Harris, senior director of product management at PDF Solutions, noting that the move to multi-die packaging is further contributing to data overload. “New products are being constructed from chiplets using advanced assembly. This leads to a massive data explosion, and it’s where scalable analytics is really required.”

AI for metrology vs. AI for inspection
AI and ML models have so far proven more useful in defect inspection than in metrology. “AI is more mature and more widely used in inspection compared to metrology for multiple reasons,” said Nordson’s Zhu. “When it comes to detecting voids, particles, mouse bites, bridges, and other issues, AI models shine.”

This goes back to what AI is best at — pattern recognition. How one pattern compares to another determines its classification, which is trained by humans. When provided with enough data, AI excels at defect classification and separating nuisance from actual defects.
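
How well a classifier separates real from nuisance defects is usually summarized with two numbers. The sketch below assumes one common pair of definitions (capture rate as the share of real defects that were flagged, nuisance rate as the share of flagged events that were not real). Definitions vary between fabs and tool vendors, and the labels here are invented for illustration.

```python
def capture_and_nuisance_rates(truth, flagged):
    """Compute (capture_rate, nuisance_rate) from parallel boolean lists.

    truth[i]   -- True if event i is a real defect (e.g. confirmed at review)
    flagged[i] -- True if the inspection step reported event i as a defect
    """
    real_total = sum(truth)
    flagged_total = sum(flagged)
    captured = sum(1 for t, f in zip(truth, flagged) if t and f)
    nuisance = flagged_total - captured
    capture_rate = captured / real_total if real_total else 0.0
    nuisance_rate = nuisance / flagged_total if flagged_total else 0.0
    return capture_rate, nuisance_rate

# Hypothetical review results for eight candidate events.
truth = [True, True, False, False, True, False, False, True]
flagged = [True, True, True, False, False, False, True, True]

capture_rate, nuisance_rate = capture_and_nuisance_rates(truth, flagged)
print(capture_rate, nuisance_rate)
```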

Metrology is a physics-based exercise to find ground truth accuracy or, more often, precision. Capturing TSV profiles, thin-film thickness, critical dimensions, or overlay is not about pattern recognition, but about quantifying a feature and making traceable measurements with low measurement error. Nonetheless, there are cases where metrology is being improved using AI models.

“In metrology, deep learning is increasingly used for signal reconstruction, contour extraction, and process modeling,” said Onto Innovation’s Han. Using inference, the models can boost performance. “AI-based models can infer dimensional measurements, overlay errors, and profile characteristics from noisy or incomplete optical signals, improving measurement precision and throughput. These approaches are especially valuable at advanced nodes where conventional physics-based models become computationally expensive or difficult to scale.”

Such models are being employed more today than in the past partly because of the speed at which inferences can be run and changes can be implemented on the fab floor. For example, AI models can make X-ray inspection of voids in solder bumps faster and more consistent. “Traditionally, we’ve used binarization techniques, for example, setting a contrast tolerance for the black solder, the good areas versus the white areas, the voids,” explained Zhu. “The problem is that the contrast is not always consistent because it depends on how many X-rays penetrate the bump and reach the detector. Traditionally, the operator monitors the SPC and makes manual adjustments. In this case, we’ve consolidated bump data from many customers using a generic model, and a single X-ray image is processed in milliseconds with 90% accuracy. So processing is quick, and they don’t need to collect a large amount of data from their own production environment in order to use our AI.”


Fig. 1: An AI model applied across multiple customers improves X-ray void detection in solder bumps even when customer data is scarce. Source: Nordson Test & Inspection
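
The weakness of fixed-threshold binarization is easy to reproduce on toy data. In the sketch below, the 4x4 pixel values, the threshold of 128, and the uniform 70-level brightness drop are all made-up assumptions. Real X-ray contrast changes are not a simple uniform shift, but the effect on a fixed threshold is similar.

```python
def void_fraction(image, threshold):
    """Fraction of pixels brighter than the threshold (treated as void)."""
    pixels = [p for row in image for p in row]
    return sum(1 for p in pixels if p > threshold) / len(pixels)

# Hypothetical 4x4 grayscale crop of one bump: dark solder, bright void.
bump = [
    [40, 42, 41, 39],
    [43, 180, 175, 40],
    [41, 178, 182, 42],
    [40, 41, 43, 39],
]

# The same bump with lower overall brightness (crude stand-in for
# fewer X-rays reaching the detector).
dim_bump = [[max(0, p - 70) for p in row] for row in bump]

threshold = 128
normal_result = void_fraction(bump, threshold)
dim_result = void_fraction(dim_bump, threshold)
print(normal_result, dim_result)
```

With the threshold unchanged, the void that covers a quarter of the first image disappears entirely from the second. That is the kind of drift an operator would otherwise have to correct by hand.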

ADC: Automated Defect Classification
In the semiconductor production line, optical tools often identify defects that are then reviewed using SEM, AFM or other metrology tools. AI and machine learning models can improve the classification (ADC) process. “Deep learning is transforming defect classification and review,” said Onto Innovation’s Han. “Instead of relying on manually engineered features, convolutional neural networks can automatically learn complex defect characteristics from image data, enabling faster and more accurate classification of particles, scratches, voids, pattern collapses, and other defect types. This significantly reduces the manual review effort and accelerates yield learning.”


Fig. 2: Harmless defects often present similarly to actual defects (left images). ML-based automatic defect classification removes nuisance defects (right wafer maps). Source: Onto Innovation

When data is limited
Because wafer processes are so well controlled, there are often too few defect examples to train ML models. Fortunately, engineers can create synthetic data.

“In terms of challenges, AI detection and classification requires a sufficient amount of defects that are labeled correctly,” said Microtronic’s Akomer. “Because semiconductor wafer defects are generally rare, they are difficult to gather. Data augmentation (simulation) plays an important role in overcoming the problem of limited real-world defect images and is often essential to train a stable detection model.”
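
The simplest form of augmentation is geometric: flipping and rotating the defect images that do exist. The sketch below shows only that basic idea on an invented 3x3 patch. It is not the simulation-based approach Akomer refers to, and it assumes the defect class does not depend on orientation, which is not true for every defect type.

```python
def flip_horizontal(patch):
    """Mirror a patch left to right."""
    return [list(reversed(row)) for row in patch]

def rotate_90(patch):
    """Rotate a patch clockwise by 90 degrees."""
    return [list(row) for row in zip(*reversed(patch))]

def augment(patch):
    """Return the distinct flipped and rotated variants of a patch (up to 8)."""
    variants = []
    current = [list(row) for row in patch]
    for _ in range(4):
        for candidate in (current, flip_horizontal(current)):
            if candidate not in variants:
                variants.append(candidate)
        current = rotate_90(current)
    return variants

# Hypothetical binary defect patch: 1 marks a defective pixel.
defect_patch = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]

augmented = augment(defect_patch)
print(len(augmented))
```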

Synthetic data can help companies get a jump-start on inspecting next-node devices, taking new scaling parameters into account. “Generative and simulation-based AI models can create synthetic defect samples that emulate realistic defect behaviors, geometries, and process interactions,” said Onto Innovation’s Han. “These simulated datasets can be used to train and validate inspection systems before sufficient real defect data exists, particularly for new process nodes, advanced packaging technologies, or rare defect mechanisms. By augmenting sparse datasets with realistic synthetic defects, AI helps improve model robustness, inspection sensitivity, and overall defect detection performance.”

Nordson’s Zhu concurs. “Many datasets we got from our customers are biased, which is a particular challenge in semiconductor inspection. Customers have very good processes, so a lot more good data than bad data, and there are many types of defects.”

Zhu emphasized that actual data is best, but when it’s not available, simulation using generative approaches is increasingly being evaluated and applied.
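
When good parts vastly outnumber defective ones, one widely used mitigation is to weight rare classes more heavily during training. The following sketch computes inverse-frequency class weights. The class names and counts are invented, and this is a generic technique rather than anything the quoted companies have described using.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by total / (number_of_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}

# Hypothetical, heavily imbalanced training labels.
labels = ["good"] * 90 + ["void"] * 8 + ["bridge"] * 2

weights = inverse_frequency_weights(labels)
for cls, weight in sorted(weights.items()):
    print(cls, round(weight, 3))
```

With these weights, every class contributes equally to the total loss, so the two bridge examples are not drowned out by the 90 good ones.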

Scaling AI
Data management at the fab or enterprise level is where many companies are struggling today. “The greatest value an analytic provider can offer is the ability to collect, align, and normalize data, and then deploy models wherever they are needed,” said Marc Jacobs, senior director for solutions architecture, fabless solutions at PDF Solutions. “The model itself matters, but the data engineering platform underneath it matters more.”

PDF Solutions estimates that most AI initiatives fail to scale, with over 70% stalling after pilots due to fragmented data, legacy factory systems, limited subject matter expert (SME) bandwidth, and a lack of clear operating models for enterprise AI deployment.

“To execute AI at scale, companies need a blueprint for translating AI ambition into sustained, high-value impact across the semiconductor lifecycle,” said Jon Holt, worldwide fab applications solutions manager at PDF Solutions.

The company suggests eight pillars for AI scale-up:

  1. Physical equipment and sensors that are SEMI standard-compliant;
  2. Fault detection and classification (FDC) and run-to-run control;
  3. Data integration using computer-integrated manufacturing (CIM)/MES, including an AI-ready data repository;
  4. Digital twins for factory planning, scheduling, and dispatching;
  5. Knowledge Hub with role-based access control;
  6. Enterprise AI platform for model training and deployment;
  7. Multi-agent systems with agentic AI and human expert feedback, and
  8. Autonomous engineering with human oversight.

One pitfall in this organizational process is failing to ensure that accurate data is available across the test and inspection supply chain. “When a human analyst is doing exploratory analytics, they can navigate around imperfect metadata through intuition,” Jacobs said. “They notice when data is disjointed, recognize the anomaly, and course correct. That tolerance disappears the moment automation is introduced. If the metadata is not aligned, the downstream operation simply never receives the context it was designed to use.”

Data quality is highest at the point of access, whether that is in the process fab line or at an OSAT assembly line. Cross-referencing incoming data with ground truth in MES or ERP systems enables data augmentation and correction when data is missing or incorrect. For instance, metadata consistency is an ongoing issue. And in today’s world of mergers and acquisitions, companies run into different data standards, lot naming conventions, and identifier/labeling methods.
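
A small example shows why naming conventions matter once automation takes over. The sketch below normalizes lot identifiers and cross-references them against an MES lookup. The ID formats, the "LOT" prefix rule, the product names, and the record fields are all hypothetical. A real deployment would follow each site's documented conventions.

```python
import re

def normalize_lot_id(raw):
    """Uppercase, drop separators, and strip an optional leading 'LOT'."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    return cleaned[3:] if cleaned.startswith("LOT") else cleaned

def reconcile(records, mes_lots):
    """Split records into those matched to an MES lot and those that are not."""
    matched, unmatched = [], []
    for record in records:
        lot = normalize_lot_id(record["lot"])
        if lot in mes_lots:
            matched.append({**record, "lot": lot, "product": mes_lots[lot]})
        else:
            unmatched.append(record)
    return matched, unmatched

# Hypothetical MES ground truth and inspection records from two sites.
mes_lots = {"A12301": "chiplet-x", "B77702": "chiplet-y"}
inspection = [
    {"lot": "lot-a123.01", "defects": 4},
    {"lot": "B777_02", "defects": 1},
    {"lot": "C999-03", "defects": 7},
]

matched, unmatched = reconcile(inspection, mes_lots)
print(matched)
print(unmatched)
```

Keeping the unmatched records in a separate list, rather than dropping them silently, gives engineers a queue to review. That replaces the intuition a human analyst would otherwise apply.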

Conclusion
Companies are at various stages of their data integration and AI model journey, but it’s clear that models are becoming more intuitive, which will help reduce current programming needs.

“For a given patterned die, traditionally we need to ask the model to find features one at a time and train it for TSVs, RDL, or bumps,” said Zhu. “But now, with the more powerful models, we can say ‘locate all the RDL for me,’ or ‘locate all the bumps,’ and it can potentially eliminate a lot of programming needs that we are doing.”

Existing models are particularly good at detecting subtle changes in processes and distinguishing between real and nuisance defects. In outlier identification, AI models can detect subtle deviations from normal process behavior by learning patterns across large volumes of inspection, metrology, and process data. “Unlike conventional threshold-based approaches, machine learning models can identify previously unseen or low-frequency anomalies that may indicate emerging process issues, equipment drift, or latent defects,” said Onto Innovation’s Han. “This capability helps reduce false positives while improving sensitivity to meaningful abnormalities.”

And when the fab infrastructure for connected data is in place, root cause analysis of failures is greatly simplified. Han added, “For yield learning, AI accelerates root-cause analysis by correlating defect patterns, process parameters, tool signatures, and electrical test results across multiple stages of manufacturing. By uncovering hidden relationships within high-dimensional datasets, AI models enable engineers to identify yield-limiting factors more quickly and optimize process conditions with greater precision.”

AI models are in their element with defect inspection because they rely on pattern recognition. However, their use in metrology is still emerging because the output depends on exact values. Even so, to the extent that complex mathematics is involved in certain methods — like rigorous coupled-wave analysis (RCWA) in scatterometry — such models can help accelerate tedious computations. And other projects, such as tool-to-tool and chamber-to-chamber matching, can also be improved using AI models by training on a “golden” tool or chamber.
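
Matching a tool to a golden reference can be illustrated with the simplest possible model, a least-squares gain and offset. This is a deliberately basic stand-in for the AI-based matching mentioned above. The readings are invented and constructed so that the tool reads 2% high with a 0.3-unit offset.

```python
def fit_gain_offset(tool_readings, golden_readings):
    """Least-squares fit so that golden is approximately gain * tool + offset."""
    n = len(tool_readings)
    mean_x = sum(tool_readings) / n
    mean_y = sum(golden_readings) / n
    sxx = sum((x - mean_x) ** 2 for x in tool_readings)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(tool_readings, golden_readings)
    )
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset

# Hypothetical measurements of the same four reference features.
golden = [10.0, 20.0, 30.0, 40.0]
tool = [10.5, 20.7, 30.9, 41.1]  # constructed as 1.02 * golden + 0.3

gain, offset = fit_gain_offset(tool, golden)
corrected = [gain * t + offset for t in tool]
print(gain, offset, corrected)
```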

The larger challenge is not with the glamorous AI or ML models themselves. It is with the far less attractive, tedious work of connecting data from one process silo to another, from inspection to test to assembly, and even out into the field. When engineers can observe a defect on a new part, trace its root cause within a reasonable period to a specific CMP tool run on a particular date, and quickly correct the process problem, AI will truly be showing its promise.
