Wafer image interpretation can impact yield and throughput.
Advanced machine learning is beginning to make inroads into yield enhancement methodology as fabs and equipment makers seek to identify defectivity patterns in wafer images with greater accuracy and speed.
Each month a wafer fabrication factory produces tens of millions of wafer-level images from inspection, metrology, and test. Engineers must analyze that data to improve yield and to reject wafers that are not worth processing further, and for decades they have relied on computer vision algorithms to support this analysis. But these early implementations of ML are not keeping up with today’s complex chips and rising demand for reliability over longer lifetimes. Misclassification rates are high, resulting in false positives — identifying good wafers as bad — and requiring humans in the loop for final assessment.
That misclassification also slows throughput in the fab, which increases manufacturing costs. In addition, the human review of borderline images results in inconsistent decisions by operators or technicians assigned the disposition task. This is particularly evident as transistor density increases, both horizontally and vertically, creating subtle patterns that may be hard for existing equipment to discern.
To improve accuracy, wafer test maps need to be correctly classified in the context of spatial patterns for different process steps. This, in turn, requires computational analysis similar to that applied to wafer inspection images. Improvements in accuracy and speed are proving significant as fab engineers start to leverage state-of-the-art deep learning approaches. Misclassification is declining where those techniques are used, and the need for humans in the loop is shrinking.
A number of presentations at the 2021 Advanced Semiconductor Manufacturing Conference (ASMC 2021) showcased engineers using advanced machine learning, including deep learning techniques, on wafer-level images to rapidly respond to yield-limiting events, as well as to increase product quality and reliability. It helps that these kinds of approaches are more widely available than in the past, and that the underlying compute hardware — GPUs specifically designed for deep learning — is able to process data in massively parallel configurations.
“There’s a whole push to use advanced machine learning for defect classification and wafer disposition,” said Anjaneya Thakar, director of product marketing at Synopsys. “This entire push to use more machine learning is enabled by much better hardware, but also by improved software algorithms. We have had the ability to use image processing and computer vision for these detection and disposition tasks. But advanced machine learning enables finding a new trend.”
Alongside this, there is growing demand in some markets — particularly automotive, medical, and mil/aero — to weed out latent defects. Here, too, more advanced methods of analyzing wafer images are seeing rising adoption.
“Working closely with the leading car manufacturers and their suppliers has sharpened KLA’s focus on Zero Defect, leading to new solutions like I-PAT that help them reach their part-per-billion goals,” said Jay Rathert, senior director of strategic collaborations at KLA. “We expect to keep innovating in this market to enable the connectivity, autonomy and electrification trends driving this growing market.”
Wafer images and computer vision
All of these factors — new applications, greater complexity and density, and new approaches — add to the time and cost of processing a wafer. To manage those costs, engineers use wafer images to identify the sources of low yield. For example, they can use wafer images to proactively scrap wafers, to identify wafers for rework, and to flag problematic equipment.
For the past couple of decades, semiconductor manufacturers have relied on computer vision, one of the earliest applications of machine learning in semiconductor manufacturing. Referred to as automated optical inspection (AOI), these systems use signal-processing algorithms to identify macro- and micro-scale physical deformations.
Defect detection provides a feedback loop for fab processing steps. Wafer test results produce bin maps (good or bad die), which also can be analyzed as images. Their data granularity is far coarser than the pixelated data from an optical inspection tool. Yet wafer test maps can reveal signatures, such as resist splatters generated during lithography and scratches produced by handling, that AOI systems can miss. Thus, wafer test maps give useful feedback to the fab.
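For illustration, here is a minimal sketch of how pass/fail bin results might be rasterized into a coarse image for spatial-pattern analysis. The die coordinates, map size, and bin codes are hypothetical placeholders, not any tool's actual format.

```python
import numpy as np

# Hypothetical wafer test results: (x, y) die coordinates and pass/fail bins.
die_results = [(3, 5, "pass"), (4, 5, "fail"), (5, 5, "fail"), (3, 6, "pass")]

# Rasterize into a coarse 2D map: 0 = no die, 1 = good die, 2 = bad die.
wafer_map = np.zeros((10, 10), dtype=np.uint8)
for x, y, bin_code in die_results:
    wafer_map[y, x] = 1 if bin_code == "pass" else 2

# The resulting array can be treated as a low-resolution image and fed to
# the same spatial-pattern classifiers used on optical inspection data.
print(wafer_map)
```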
Training a computer vision machine learning model entails three process steps (see figure 1).
Fig. 1: Computer vision machine learning training steps. Source: A. Meixner/Semiconductor Engineering
The resulting model is based upon good and bad wafer images. Data pre-processing can enhance an image prior to feature extraction or image labeling. With AOI images, for example, engineers can apply filters to improve image quality. In contrast, wafer test map-based images do not benefit from such filtering, because each die is simply marked good or bad.
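As a rough illustration of this kind of pre-processing, the following sketch applies two common enhancement filters with OpenCV. The file path is hypothetical, and the filter choices and parameters are generic examples, not any vendor's recipe.

```python
import cv2

# Load a grayscale AOI image (file path is hypothetical).
img = cv2.imread("aoi_wafer_scan.png", cv2.IMREAD_GRAYSCALE)

# Median filter suppresses speckle noise without smearing defect edges.
denoised = cv2.medianBlur(img, 5)

# Unsharp masking accentuates faint edges ahead of feature extraction.
blurred = cv2.GaussianBlur(denoised, (0, 0), sigmaX=3)
sharpened = cv2.addWeighted(denoised, 1.5, blurred, -0.5, 0)

cv2.imwrite("aoi_wafer_scan_enhanced.png", sharpened)
```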
Feature extraction requires engineers to decide which image characteristics the model should consider. With image labeling, for example, engineers can name a spatial pattern for the model to learn.
A machine learning model starts with a training set of image data. The algorithm then needs to be checked to make sure it correctly identifies similar images. With wafer test maps, classification is based on wafer spatial patterns. With AOI wafer images, the focus is on identifying the defects. Both good and defective wafer images are required to train the models.
“When the inspection tool captures the images, is the defect image alone sufficient for understanding the details about the defect? For the computational image classification, there are two approaches — reference-based and non-reference-based,” said Prasad Bachiraju, director for sales and customer solutions at Onto Innovation. “A reference-based approach will provide higher accuracy in classifying the defect than a non-reference-based one, because the defect image is compared to a reference point on the same wafer. This reduces the challenges that wafer-to-wafer variability, or lot-to-lot variability, can give to defect classification when using non-reference-based classification. Implementing a reference-based design is not without challenges. Most systems use a non-referenced approach, so people are now choosing to use deep learning.”
Current AOI systems use conventional computer vision machine learning. Because these systems generate too many false positives, detecting a defect where there is none, wafers flagged as defective need human review. False positive rates on the order of 10% to 15% are not uncommon. The human review is both time-consuming and subjective, and thus error-prone. In a 2007 paper, AMD and Rudolph [now Onto] engineers reported agreement among experienced operators to be 43%, and operator repeatability to be 93%.
AOI systems also cannot find all the defects that fab engineers care about. That is driving the shift toward advanced machine learning techniques to build better detection and classification methods.
“Most systems use conventional machine learning algorithms,” said Subodh Kulkarni, CEO of CyberOptics. “Once you look at an image, you can see there is a problem with certain areas that conventional ML doesn’t see. When you go to deep learning kinds of algorithms, they can detect all those things. But it takes them a week to program and detect them, and that’s not practical. So a lot of the innovation in our area is starting to happen in machine learning with faster deep-learning algorithms that can be more easily programmed.”
Being successful at this requires a deep understanding of what exactly you’re trying to accomplish with machine learning.
Understanding the problem and whittling down the data needed to solve it becomes even more critical with deep learning, but fabs and equipment vendors are making progress. At this year’s ASMC, several engineering teams reported on successful applications of advanced machine learning and deep learning techniques, along with the engineering work needed to support the iterative learning process. While deep learning techniques can easily distinguish between a cat and a dog, the large variety of defect patterns and the wide range of image sizes create challenges in the learning process. The same approaches are being applied to wafer test bin maps and wafer inspection images alike, although the two differ vastly in data granularity.
Machine learning and wafer test maps
Since around the turn of the millennium, engineering teams have used wafer test results to identify spatial patterns that provide feedback on problematic equipment and process steps. Applying advanced machine learning methods, engineers can correctly identify patterns using a library of yield signature patterns built from test data. That information, in turn, can be looped back to fab equipment.
In their ASMC 2021 paper, GlobalFoundries engineers compared a support vector machine (SVM) technique, which is common in computer vision applications, with a 4-level deep convolutional neural network (CNN) for wafer test map classification. Their goal was to increase low-yield classification accuracy.
The SVM required feature engineering to facilitate training. The CNN was trained directly on existing image sets over 120 epochs (learning cycles). Both models were trained on 12 distinct wafer map signatures, each represented by 300 to 500 images manually labeled by engineers.
Fig. 2: A dozen wafer spatial patterns classification results for SVM (top image) and CNN (bottom image). Source: GlobalFoundries
Averaging correct-classification rates across the 12 distinct signatures, the CNN outperformed the SVM. The SVM solution showed an overall accuracy of 59%, with high sensitivity to pattern position and density but low sensitivity to pattern shape. In contrast, the 4-level CNN demonstrated an overall accuracy of 90% and high sensitivity to pattern shape.
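The paper does not include code, but a 4-level CNN of the kind it describes can be sketched in a few lines of PyTorch. This version assumes 64×64 single-channel wafer maps and the 12 signature classes; the layer widths are illustrative guesses, not the authors' architecture.

```python
import torch.nn as nn

class WaferMapCNN(nn.Module):
    """Illustrative 4-level CNN for wafer map classification (not the paper's code)."""
    def __init__(self, num_classes=12):
        super().__init__()
        blocks = []
        channels = [1, 16, 32, 64, 128]  # four convolutional levels
        for c_in, c_out in zip(channels, channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),  # halve spatial resolution at each level
            ]
        self.features = nn.Sequential(*blocks)
        # A 64x64 input shrinks to 4x4 after four 2x poolings.
        self.classifier = nn.Linear(128 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```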
Low-yielding wafers have specific spatial patterns that often can be traced back to a specific process step. Combining wafer map pattern classification with a wafer’s equipment genealogy (i.e., the specific equipment that processed the wafer) helps engineers and technicians pinpoint a root cause. Manufacturing facilities store these wafer map patterns in a pattern detection library, but with today’s complex processes new patterns can still emerge.
Proactively detecting previously unknown patterns enables swifter response to process issues. This prompted SkyWater Technology and Onto Innovation to jointly develop a solution. They implemented an inline spatial signature monitoring solution, creating a more systematic way of identifying the 4% of wafers with new spatial pattern groupings — the unknown patterns.
Fig. 3: Data inputs to the spatial recognition engine. Source: Onto Innovation
“We started by adopting machine learning techniques to perform auto-discovery on these unknown patterns,” wrote SkyWater’s David Gross and Katherine Gramling, and Onto’s Prasad Bachiraju, in their ASMC 2021 paper. “This auto-discovery process generates a pattern pareto report by grouping wafers with similar patterns based on hundreds of feature vectors generated by the SPR Engine. As a result, we end up with top-n, high-impacting, auto-discovered patterns to help us understand patterns that are new, starting to emerge, or going unnoticed. This process helped us to efficiently maintain a comprehensive pattern library that enables proactive response to production issues.”
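The internals of the SPR Engine are not published. As a generic illustration of the underlying idea — grouping wafers by feature-vector similarity and ranking the discovered groups — the sketch below uses scikit-learn's DBSCAN on synthetic placeholder features. The real engine's features, algorithm, and thresholds may differ.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Placeholder feature vectors: one row per wafer. In production these would
# be the hundreds of spatial features emitted by a signature engine.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=4.0 * k, scale=0.5, size=(100, 200)) for k in range(4)
])

# Density-based clustering groups wafers with similar signatures; the label
# -1 marks wafers matching no group, i.e. candidate new/unknown patterns.
labels = DBSCAN(eps=12.0, min_samples=5).fit_predict(features)

# Pareto-style ranking of discovered pattern groups, largest first.
groups, counts = np.unique(labels[labels >= 0], return_counts=True)
for g, n in sorted(zip(groups, counts), key=lambda t: -t[1]):
    print(f"pattern group {g}: {n} wafers")
```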
Machine learning and wafer AOI images
This is a non-trivial endeavor. Successfully implementing deep learning models on AOI images requires domain expertise in the actual images (i.e., looking for discoloration or pattern shapes) as well as expertise in developing machine learning algorithms. Wafer image data presents unique detection and classification challenges due to the wide range of defect sizes and the large variety of image classifications. Training advanced machine learning models eventually takes hundreds of thousands of images. The results are then checked between learning cycles by engineers and technicians who understand the imaging process and the defects being detected.
In two ASMC 2021 papers, the authors described in detail the upfront investments to create their models. In both cases, though, those investments proved worthwhile. The resulting models significantly improved detection and classification.
A GlobalFoundries engineering team shared results from applying advanced machine learning to lithography processes. For inline control of lithography, fabs use AOI after photoresist develop (a step called after-develop inspection, or ADI) to detect spot defects and coating defects. Once detected, coating defects can be fixed by stripping all resist from the affected wafer and repeating the lithography step prior to etch. If missed, the yield impact is quite evident at wafer test.
Fig. 4: Wafer test result maps (green is good) depicting patterns associated with photoresist development defects. Source: GlobalFoundries
Performed as a 100% inspection, ADI looks for macro-level changes at resolutions greater than 30 microns. Inspection recipes for this detection rely upon color differences, but these lack sensitivity to faint colors. While commercially available computer vision ML models can be trained and adjusted to increase sensitivity and selectivity, they have a high false positive rate.
GlobalFoundries developed a new approach to increase detection of faint images and reduce the false positive results. First, it used image equalization to increase visibility of faint defect regions.
Fig. 5: ADI image of faint coating defect a) original image, b) after image equalization. Source: GlobalFoundries
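GlobalFoundries did not publish its implementation, but global histogram equalization is a standard technique. Here is a minimal sketch of the general idea with OpenCV, using a hypothetical file name:

```python
import cv2

# Hypothetical after-develop-inspect (ADI) image path.
img = cv2.imread("adi_edge_die.png", cv2.IMREAD_GRAYSCALE)

# Histogram equalization spreads the intensity range, making faint
# coating-defect regions stand out against the background.
equalized = cv2.equalizeHist(img)

cv2.imwrite("adi_edge_die_equalized.png", equalized)
```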
Next, it used advanced machine learning with an explainable artificial intelligence (XAI) approach between learning cycles. This provided essential insights into why initial prediction results failed for images at the wafer edge, including both false positives and false negatives.
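The paper does not name the specific explainability method. One widely used option for CNN image models is Grad-CAM, which highlights the image regions that most influenced a prediction. Below is a minimal hand-rolled sketch in PyTorch, using a stock ResNet as a stand-in for a trained defect classifier; it is not GlobalFoundries' implementation.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Stand-in classifier; the real model would be trained on ADI images.
model = models.resnet18(weights=None).eval()

feats, grads = {}, {}
target = model.layer4  # last conv block of the network

target.register_forward_hook(lambda m, inp, out: feats.update(v=out))
target.register_full_backward_hook(lambda m, gin, gout: grads.update(v=gout[0]))

x = torch.randn(1, 3, 224, 224)   # placeholder wafer image tensor
model(x)[0].max().backward()      # backprop the top class score

# Grad-CAM: weight each feature map by its mean gradient, sum, then ReLU.
w = grads["v"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * feats["v"]).sum(dim=1, keepdim=True))
heatmap = F.interpolate(cam, size=x.shape[2:], mode="bilinear")
# High heatmap values mark the regions that drove the prediction.
```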
“Our attention was drawn to the dimension of our ADI images and how the ML system managed those dimensions,” GlobalFoundries researchers wrote. “Investigating the image dimensions revealed that the ADI delivered a wide range of image dimensions (many as high as 4,256 pixels, but most below 2,240), and that the ML system in use crops images to a max x or y dimension of 2,240 pixels. This cropping was a problem, because images in the training set, or images sent to the model for a prediction, could have the defect removed if the dimension was too large and/or the defect was close to the edge.”
They fixed the cropping problem by proportionally scaling all images so that neither the x nor the y dimension exceeded the 2,240-pixel maximum.
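A minimal sketch of that fix, shrinking any image proportionally so neither axis exceeds the 2,240-pixel limit (OpenCV, hypothetical file name; the actual implementation may differ):

```python
import cv2

MAX_DIM = 2240  # the ML system's crop limit on either axis

def scale_to_fit(img, max_dim=MAX_DIM):
    """Shrink proportionally so no axis exceeds max_dim; never upscale."""
    h, w = img.shape[:2]
    scale = min(1.0, max_dim / max(h, w))
    if scale < 1.0:
        img = cv2.resize(img, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_AREA)
    return img

img = cv2.imread("adi_large_scan.png")   # hypothetical oversized ADI image
img = scale_to_fit(img)                  # defect near the edge is preserved
```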
Wafer image classification and more advanced machine learning algorithms are not limited to sub-micron process technologies, though. Engineers from Hitachi ABB Power Grids also shared their efforts at ASMC 2021. Likewise driven to reduce false positives and increase detection, they developed a sophisticated deep learning approach for defects found in wafer images from five different power devices, including IGBTs and power diodes for high-voltage applications (1.2kV to 6.5kV).
Due to the range of defect types and the rare or unique occurrence of some of them, they chose an object detection approach instead of an image classification approach. The defects they needed to detect ranged from tens of pixels to several hundred thousand pixels in size. The smallest presented the greatest detection challenge, covering just 0.01% to 0.1% of the total image, while the very largest exceeded the image size, leading to cropping by the AOI tool.
By selecting smaller images for analysis, they reported, the model more easily learned the image backgrounds, which reduced false positives. With their object detection approach, they combined region-based CNNs with active learning and transfer learning to enable detection of the small defects using training sets of only 500 to 2,500 examples. After six learning cycles and a total of 2,431 training examples, the classification results had a precision of 0.98 and a recall of 0.80. Precision is the ratio of true positives to predicted positives, while recall is the ratio of true positives to actual positives.
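To make those two definitions concrete, here is a small arithmetic check using hypothetical confusion counts chosen to be consistent with the reported figures:

```python
# Hypothetical confusion counts consistent with the reported metrics.
tp = 98   # real defects correctly flagged
fp = 2    # false alarms
fn = 24   # real defects missed

precision = tp / (tp + fp)   # 0.98: share of flagged defects that are real
recall    = tp / (tp + fn)   # ~0.80: share of real defects that were flagged
print(f"precision={precision:.2f}, recall={recall:.2f}")
```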
More to come
Whether the input is wafer inspection images or wafer bin maps, the engineering goal is fundamentally the same: interpret the images and take proactive actions that improve yield and quality.
“The amount of data that you’re getting back from inspection and measurement tools in the fab is massive. Therefore, you need machine learning techniques to go through that data, find a trend and raise the flag if there is a problem,” said Synopsys’s Thakar. “Images become a great training tool for these models. Machine learning addresses a very critical problem in silicon manufacturing: How to look at all this data and figure out what are yield effects versus what is not. Answering this question pushes the use of machine learning and deep learning to perform the defect analysis.”
Conclusion
Fab engineering teams are adopting more complex machine learning algorithms for wafer image review because these methods achieve better classification and detection metrics. Computational hardware designed to accelerate neural networks, open-source image libraries, and growing experience with CNNs all contribute to the adoption of these methods.
Expect to see fab engineering teams developing advanced machine learning models in the future.