New Strategies For Interpreting Data Variability

Engineers are using multiple visual and statistical methods to separate anomalies from critical data.


Every measurement counts at the nanoscopic scale of modern semiconductor processes, but with each new process node the number of measurements and the need for accuracy escalate dramatically.

Petabytes of new data are being generated and used in every aspect of the manufacturing process for informed decision-making, process optimization, and the continuous pursuit of quality and yield. Most fabs have turned to machine learning models to handle this increasing data load, because of their ability to spot patterns and anomalies across large data sets. Still, no matter how well those models are tuned, the end results are only as accurate as the data coming in.

“Data is key in all analytics,” says Melvin Lee Wei Heng, field applications senior manager for software at Onto Innovation. “If garbage is put in, garbage will come out. Data set quality will influence how accurate your data analytics results will be.”

The amount, quality, and consistency of data have been topics of concern for years in chip manufacturing. But managing those factors is becoming more complex at new nodes, and more critical as chipmakers seek to leverage that data to improve quality and yield. As a result, the focus has expanded from basic analysis to data management practices that recognize the need to understand and deal with variability in the data, rather than trying to avoid it.

“The aim isn’t to eliminate these variations entirely, but to understand and capitalize on them,” says Michael Keleher, technical sales director at PDF Solutions. “Maintaining data fidelity is pivotal for anticipating issues and ensuring that final products meet the industry’s stringent standards.”

Data accuracy is a nuanced blend of data quality and quantity, as well as its interpretation. Data collection systems in manufacturing environments can capture thousands of readings per second, which is a testament to the high-velocity nature of silicon wafer testing. But as the amount of data increases, so does the challenge of discerning valuable insights. Numerous factors, from sensor quality to analytical techniques like univariate and multivariate analyses, affect the accuracy and relevance of the conclusions drawn.

“You can never make a measurement tool completely accurate,” says Keleher. “Its data is only useful to the bounds of the parameters it’s given. For example, if you have a ruler with markings down to an eighth of an inch, any measurement smaller than that can only be an estimate. The same is true for sensors, test equipment, and model parameters. Part of what we have to account for in the model is the accuracy of the incoming measurement data.”

To ensure data quality, semiconductor fabs increasingly are focusing on robust data management practices. This includes data cleansing and normalization, which play a pivotal role in ensuring the data used for analysis is accurate, consistent, and free of anomalies.

Data normalization
Precision is essential for high-volume data collection, and normalization techniques can help ensure that data sets are uniform and devoid of irregularities that could skew analytical outcomes. By employing methods like outlier detection, missing value imputation, and standardization, fabs can streamline their data sets, which can make them more useful for subsequent analyses.

“Traditional methods use box plots or scatter plots to visualize the abnormal or outlier data points,” says Lee Wei Heng. “Mathematical methods would use Z-score to check how far the data points are from the standard deviation. Outliers should not be removed, unless the reason for a data point being an outlier has an explainable root cause to it. For example, if an outlier data point was due to an error in measurement or a wrong location being measured, it can be removed.”
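As a rough illustration of the Z-score check, the sketch below expresses each reading as its distance from the mean in units of standard deviation and flags those beyond a threshold. The function name, the readings, and the threshold are assumptions made for the example. Flagged points are returned for review rather than deleted, in line with the advice above.

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=3.0):
    """Return (index, value, z) for points whose |z| exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged

# Hypothetical film-thickness readings in nm; the last one looks suspicious.
readings = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 50.1, 49.7, 50.0, 58.0]

# With only ten points, a single reading cannot exceed |z| of about 2.85
# when the sample standard deviation is used, so a lower threshold is chosen.
suspects = zscore_outliers(readings, threshold=2.5)
for index, value, z in suspects:
    print(f"reading {index}: {value} nm, z = {z:.2f} (review before removing)")
```

The small-sample limit in the comment is one reason a fixed threshold of 3 may not suit every data set.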

Data normalization can improve the quality and usability of semiconductor manufacturing data, but fabs must tread carefully, mindful of the potential tradeoffs. The compression or elimination of extreme values and outliers during normalization can significantly impact the integrity of the data set, potentially skewing analytical outcomes and undermining the reliability of subsequent analyses. Consequently, fabs must approach normalization with a discerning eye, weighing the benefits against the potential drawbacks and taking proactive measures to mitigate any adverse effects on data quality and interpretation.

“While normalizing is a crucial step for specific ML algorithms, you first need to determine whether the data variability and scale itself contains information about your target variable,” says Constantinos Xanthopoulos, senior data scientist at Advantest. “This is often done through exploratory data analysis, statistical comparisons, and visual comparisons of the independent and dependent variables.”

Fabs also must consider outlier detection techniques tailored to diverse data types, particularly for time series and multi-variate structural data. This matters most when identifying outliers, the data points that fall outside the expected range. Outliers can indicate a significant shift in process conditions, potentially flagging issues that merit immediate attention, or they can be mere statistical deviations that contribute little to actual process insights.

“For outlier detection techniques, we need different methods for different data types,” notes Keleher. “For time series data (like tool sensor data), there are some anomaly detection methods. The basic idea is to transform the time series and then apply some detection algorithms on the transformed time series. For multi-variate structural data, we also have multi-variate outlier detector algorithms to use, too. The idea is to summarize an ‘outlier score’ for each multi-variate data, and use a threshold to determine if the data is an outlier or not.”
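One way to picture the transform-then-detect idea for time series is sketched below. It is an assumed example, not a method named in the quote. The series is transformed into residuals from a rolling median, which removes slow drift, and a threshold is then applied to those residuals. The window size, the multiplier `k`, and the pressure values are all hypothetical.

```python
from statistics import median

def rolling_median_residuals(series, window=5):
    """Transform a series into residuals from a centered rolling median."""
    half = window // 2
    residuals = []
    for i, x in enumerate(series):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        residuals.append(x - median(series[lo:hi]))
    return residuals

def flag_anomalies(series, window=5, k=5.0, min_scale=1e-9):
    """Flag indices whose residual exceeds k times the median absolute residual."""
    residuals = rolling_median_residuals(series, window)
    scale = max(median([abs(r) for r in residuals]), min_scale)
    return [i for i, r in enumerate(residuals) if abs(r) > k * scale]

# Hypothetical chamber-pressure trace: a slow upward drift with one spike.
pressure = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 13.0, 10.7, 10.8, 10.9, 11.0]
print(flag_anomalies(pressure))
```

Because the drift is absorbed by the rolling median, only the spike stands out. A plain Z-score on the raw trace would mix the drift and the spike together.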

Once outliers are assessed, normalization follows, which includes processing the data to ensure consistency and comparability across various manufacturing stages and equipment. Normalization becomes vital in instances where running every possible test scenario, known as ‘skew corners,’ is unfeasible. This step ensures that each piece of data fits within the established frameworks for analysis, allowing for accurate process verification.

Fig. 1: Outlier detection in a T-squared plot. Source: DR Yield

“Various normalization methods are used when a facility will not be able to run all various skew corners needed to verify windows of a process,” adds Lee Wei Heng. “Some examples are min-max scaling, or linear normalization, used when there is the need to preserve the relationships between data values, and Z-score normalization, used when there are a few outliers in the data set and when data clipping is needed. How the data is to be distributed will also determine the techniques used for normalizing data.”
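The two methods named here can be sketched in a few lines. The function names and sample values below are assumptions for illustration. Min-max scaling maps values linearly onto the range 0 to 1, and Z-score scaling centers them on the mean in units of standard deviation, with optional clipping.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Linearly rescale values to [0, 1], preserving their relative spacing."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values, clip=None):
    """Center on the mean and divide by the standard deviation; optionally clip."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    scaled = [(v - mu) / sigma for v in values]
    if clip is not None:
        scaled = [max(-clip, min(clip, z)) for z in scaled]
    return scaled

# Hypothetical thickness readings with one extreme value.
thickness = [50.0, 51.0, 52.0, 53.0, 90.0]
print(min_max_scale(thickness))          # the extreme value squeezes the rest toward 0
print(z_score_scale(thickness, clip=1.5))
```

The printed output shows the tradeoff discussed earlier. With min-max scaling, a single extreme reading compresses the remaining values into a narrow band, while clipped Z-scores limit its influence at the cost of flattening it.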

“Z-score could be our most commonly used normalization,” adds Jian Lu, software engineer at Onto Innovation. “Especially when there are spec limits defined of signal parameters. There may be some other pre-process/normalization for special cases. For example, we may need a Quantile Transformer to map the original values to a more uniform distribution or for categorical signal parameters, we may need some encoder techniques (e.g. OneHot/Ordinal encoder).”
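The special cases mentioned here can be approximated with plain Python, as in the hedged sketch below. The names in the quote resemble scikit-learn’s QuantileTransformer, OneHotEncoder, and OrdinalEncoder classes, though the source does not say which library is used. This version is a simplified rank-based stand-in with hypothetical data, not that library’s implementation.

```python
def quantile_uniform(values):
    """Map each value to its empirical quantile in [0, 1]; ties share an average rank."""
    n = len(values)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]

def one_hot(labels):
    """Encode categorical labels as 0/1 indicator rows, one column per category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows

# Hypothetical leakage currents spanning several decades.
leakage = [1.0, 10.0, 100.0, 1000.0, 10000.0]
print(quantile_uniform(leakage))

# Hypothetical categorical signal: which tool type processed each wafer.
categories, encoded = one_hot(["etch", "litho", "etch"])
print(categories, encoded)
```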

One term that frequently surfaces when discussing data variability is “heteroskedasticity,” which occurs when the expected variance in measurements does not remain consistent across all data.

Consider voltage measurements within a chip, for example. If those measurements are prone to variability, they may display a higher range of fluctuation at either end of the voltage spectrum, which is heteroskedasticity. That inconsistency poses a challenge. It makes it difficult to predict the measurement’s outcome because the model can’t apply a uniform rule across all levels of data.

Recognizing this inconsistency is only the first step. Addressing it requires adaptive modeling, where one could either develop non-linear models specifically tuned to understand and accommodate the variances, or segment the data into ranges and treat each uniquely. Each approach aims to normalize or minimize offsets — the differences from what we expect to measure versus what we actually measure. The value in offsets thus becomes apparent, serving as indicators and guideposts to inform the models about where to focus and what to adjust.
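The segmenting approach can be sketched as follows, under the assumption that offsets (measured minus expected values) are already available. The data is grouped into voltage ranges and the variance of the offsets is computed per range. The range edges and the numbers are hypothetical.

```python
from bisect import bisect_right
from statistics import pvariance

def variance_by_range(x_values, offsets, edges):
    """Group offsets by the x-range they fall in; return one variance per range."""
    buckets = [[] for _ in range(len(edges) - 1)]
    for x, offset in zip(x_values, offsets):
        if x < edges[0] or x > edges[-1]:
            continue  # outside every range
        # bisect_right gives the range index; clamp so x == edges[-1] lands in the last range
        idx = min(bisect_right(edges, x) - 1, len(buckets) - 1)
        buckets[idx].append(offset)
    return [pvariance(b) if b else None for b in buckets]

# Hypothetical voltages (V) and offsets that widen at both ends of the range.
voltage = [0.2, 0.3, 0.4, 0.9, 1.0, 1.1, 1.6, 1.7, 1.8]
offsets = [0.05, -0.05, 0.05, 0.01, -0.01, 0.01, 0.06, -0.06, 0.06]

low, mid, high = variance_by_range(voltage, offsets, edges=[0.0, 0.6, 1.2, 1.8])
print(low, mid, high)
```

Markedly different variances across ranges are a sign of heteroskedasticity, and each range can then be modeled or weighted separately.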

“Transformer and aggregation are the methods we use most often to line up different data sources,” says Onto’s Lu. “For example, we transform time series data to fit the wafer-level metrology, or aggregate event data by time period and match it with other data types like metrology. Data interpolation techniques are used to fill the missing points between two data sources. Also, for high-dimensional data, we need feature selection techniques to reduce the number of signals. Before data analysis, we pick the ‘important’ signals by examining the data instead of using all signals.”
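A minimal sketch of that alignment step appears below, assuming sensor traces keyed by a wafer ID and one metrology value per wafer. The field names, such as `cd_nm`, and the sample rows are hypothetical. Each trace is summarized to wafer level and then matched to the metrology record.

```python
from collections import defaultdict
from statistics import mean

def aggregate_by_wafer(sensor_rows):
    """Summarize (wafer_id, timestamp, value) rows into per-wafer features."""
    grouped = defaultdict(list)
    for wafer_id, _timestamp, value in sensor_rows:
        grouped[wafer_id].append(value)
    return {
        wafer: {"mean": mean(vals), "max": max(vals), "n": len(vals)}
        for wafer, vals in grouped.items()
    }

def join_with_metrology(features, metrology):
    """Inner join on wafer ID; wafers missing from either source are dropped."""
    return {
        wafer: {**features[wafer], "cd_nm": metrology[wafer]}
        for wafer in features
        if wafer in metrology
    }

# Hypothetical chuck-temperature trace and critical-dimension metrology.
trace = [
    ("W01", 0, 200.0), ("W01", 1, 202.0),
    ("W02", 0, 210.0), ("W02", 1, 214.0),
    ("W03", 0, 205.0),
]
metrology = {"W01": 45.1, "W02": 46.3}

joined = join_with_metrology(aggregate_by_wafer(trace), metrology)
print(joined)
```

Wafer W03 has sensor data but no metrology, so the inner join drops it. Whether to drop, interpolate, or flag such gaps is a choice that depends on the analysis.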

However, offsets aren’t just nuisances to be smoothed out without a second thought. Understanding their origin can be as valuable as the measurement itself. With techniques such as feature engineering, you can combine existing measurements into new variables. For example, if two separate tests on similar devices give different numbers, you can subtract one from the other. The difference is the offset. It’s also another data point that may not be directly useful for the original measurement purpose, but which becomes invaluable in understanding the system’s behavior.

“We use a ‘classic’ statistical approach here,” says Dieter Rathei, CEO of DR Yield. “Our Event Control Module takes the data of, for example, the 20 lots prior to and following the event and calculates the ANOVA (analysis of variance) statistics. The software flags any statistically significant shifts that are detected. We can use any available data of the lots or wafers for this kind of monitoring. If, for example, end-of-the-line electrical test data are used to monitor a front-end implant operation, there will naturally be a time lag until the warning is raised. However, once the data are available and processed, the software shows you the causal link between the data and the problematic tool at the earliest possible time.”
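The general idea of a before-and-after ANOVA can be sketched as below. This is an assumed illustration of the textbook one-way F-statistic, not DR Yield’s implementation, and the lot yields are hypothetical. A larger F means the shift between groups is large relative to the scatter within them.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual points around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical lot yields (%) before and after a maintenance event.
before = [91.0, 92.0, 93.0]
after = [94.0, 95.0, 96.0]
f_stat, df_b, df_w = one_way_anova_f(before, after)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

Turning F into a significance decision requires the F distribution. If SciPy is available, `scipy.stats.f_oneway` returns both the statistic and a p-value.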

Further refining detection strategies, statistical tests assess whether new data continues to match established patterns or has branched into unpredictable territories, which is akin to an alarm system that sounds off when parameters exit the safe zone. Novel frameworks, like single-class classification, erect hypothetical boundaries around ‘normal’ operational data. Samples falling outside this boundary are marked as outliers, warranting immediate attention.

In practical terms, imagine having an “anchor” — a measurement or a set of data known for its stability. This anchor can serve as a reference point, allowing you to calibrate your analysis by eliminating these troublesome offsets and aligning the data sets more faithfully to the true values. In the end, it isn’t merely about treating offsets as errors to correct, but as instrumental insights enabling more advanced, precise models that enhance the overall quality and reliability of semiconductor manufacturing processes.
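A minimal sketch of the anchor idea follows, assuming a reference device with an accepted value that each tester measures periodically. The tester’s offset is estimated from its anchor readings and subtracted from its other measurements. All names and numbers are hypothetical.

```python
from statistics import mean

def offset_from_anchor(anchor_readings, anchor_value):
    """Estimate a tester's offset as its mean anchor reading minus the accepted value."""
    return mean(anchor_readings) - anchor_value

def remove_offset(measurements, offset):
    """Shift measurements so they line up with the anchor's scale."""
    return [m - offset for m in measurements]

# Hypothetical: a 1.000 V reference measured three times on one tester.
tester_offset = offset_from_anchor([1.02, 1.03, 1.01], anchor_value=1.000)

# Apply the estimated offset to that tester's production measurements.
corrected = remove_offset([0.85, 0.95], tester_offset)
print(round(tester_offset, 3), [round(v, 3) for v in corrected])
```

Keeping the estimated offset as its own tracked value, rather than discarding it after correction, preserves it as a signal about tester behavior over time.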

“There is always a stage, right after I do my EDA pass, where I have a sit down with the owners of the data, the people that understand data more in terms of what it represents,” adds Xanthopoulos. “I run by them everything that I have, as conclusions from my findings, because a lot of those things might not be accurate. Or you can end up having a lot of models based on spurious correlations, where there’s no underlying causation between your input and target features.”

The overfitting problem
One of the critical shortcomings in present-day machine learning is the tendency for models to overfit. These models, often lacking a self-regulating mechanism, fail to recognize the point at which further refinements cease to add predictive value. Instead, they persist in tweaking and tuning their parameters, striving to sharpen the correspondence between their internal representation and the data they’re fed until an externally imposed boundary, such as a time limit or iteration cap, is reached.

“In most cases, provided that you avoid the curse of dimensionality, integrating several sources often improves the model performance,” says Xanthopoulos. “But there is no free lunch, and that means that with more data sources, EDA, data pre-processing (e.g., feature scaling, outliers screening, etc.) become more complex. This is specifically to avoid overfitting and negatively impacting the model’s robustness.”

In the context of machine learning, when the data pool is relatively shallow, models are prone to over-committing to the specific details within that data. Consequently, the algorithm becomes well-versed in the nuances of the data sample at hand, but loses its capability to apply those learnings more broadly. Instead of uncovering underlying principles that hold across varied data sets, the model simply echoes the peculiarities of its training data. It’s an echo chamber effect, where the algorithm effectively recites from memory rather than flexing its generalization muscle.

The crux of overfitting happens when a model or analysis adapts too closely to a particular set of data, absorbing the noise and random fluctuations as if they were significant. Imagine an overly tailored suit, contoured meticulously to every curve and corner of the form it was cast on. When the form changes ever so slightly, the suit no longer fits. In semiconductor testing, this translates to a model capturing the peculiarities of the specific data set it was trained on, including its imperfections and transient anomalies, while failing to generalize to new data.

The implications are particularly critical due to the highly detailed nature of semiconductor devices and the precision demanded in their fabrication. Engineers may employ a plethora of parameters to pinpoint the characteristics of a good wafer, yet every addition to the model brings the risk of including irrelevant variables that don’t echo consistent patterns. Should a model be overfit, it becomes less predictive and less reliable when faced with new batches of data. This can manifest in misleading yield predictions or erroneous quality assessments, ultimately impacting the chipmakers’ ability to accurately detect faults or predict failures. Mitigating overfitting is crucial. It often involves pruning the number of variables, applying cross-validation techniques, or using simpler models to ensure that analysis remains robust and predictive across multiple data sets, not just the one it evolved from.
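Cross-validation makes the effect visible. The sketch below is an assumed toy example with synthetic data. It compares a simple least-squares line with a nearest-neighbor model that memorizes its training points. The memorizer has zero error on data it has seen, but does worse on held-out folds.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares line; a deliberately simple model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest(xs, ys):
    """1-nearest-neighbor lookup; it memorizes the training data, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_mse(fit, xs, ys, k=5):
    """Average held-out error over k interleaved folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse(model, [xs[i] for i in held], [ys[i] for i in held]))
    return sum(scores) / k

# Synthetic data: a linear trend plus Gaussian noise (seeded for repeatability).
random.seed(0)
xs = [float(i) for i in range(30)]
ys = [2.0 * x + random.gauss(0, 0.5) for x in xs]

print("memorizer, training error:", mse(fit_nearest(xs, ys), xs, ys))
print("memorizer, held-out error:", cross_val_mse(fit_nearest, xs, ys))
print("line,      held-out error:", cross_val_mse(fit_line, xs, ys))
```

The gap between training error and held-out error, not the training error alone, is what signals an overfit model.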

Contextual understanding of how data is acquired becomes imperative for navigating the complexities of data variability, ensuring the insights extracted are both accurate and actionable. Thus, while the proliferation of data presents unparalleled opportunities, maintaining the accuracy and quality of that data remains a formidable task.

Achieving this consistency requires establishing seamless traceability between the data captured at each insertion and determining whether an aberration is merely an anomaly or a reproducible error that demands attention. The challenge is twofold: ensuring the fidelity of the data across a series of varied yet interconnected measurements, and then applying this pooled information to discern potential faults and their origins.

“For the data coming from either a wafer tester circuit probe or from final tests, we start out with typically univariate analysis of each one of those parameters,” says Keleher. “Looking at that data, in the case of a wafer, is a little bit easier because you can look at the die next to it. In the case of final test, it becomes a little more difficult. If you have the SEMI E142 traceability standard applied, you can look at that data and know, if you’re in a different assembly or even a different assembly line, it’s most likely a different lot. But with traceability, you also can go back and see if you’re getting similar results, because spatially you should be getting similar results.”

Cut through the sea of data and you find a pattern, which is a litmus test for the quality of the process from which it sprang. Whether it’s data accrued from an ultra-precise metrology station upstream, or a package-level test down the line, marrying these data sets reveals a story, although one that isn’t always easy to understand. For instance, two fabrication facilities in the same company can have disparities in testing equipment and recipes. Add to this the diversity in wafer processing, and the complexity of pinpointing data variability’s root causes escalates exponentially. Without a robust data repository, linking discrepancies back to a particular tester or piece of equipment becomes guesswork.

“Collecting and storing all data in one central data warehouse is essential for correlating upstream metrology or sensor data with downstream test results to pinpoint a failure cause,” says DR Yield’s Rathei. “This enables you to monitor each sensor signal and to understand its individual variability. Then we can either look at various signals and see whether a deviation from the normal variability happens at multiple signals at the same time, or from the same wafers being processed, or we apply multivariate methods like Hotelling’s T-squared statistics to find outliers in a multi-dimensional space.”
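For two signals, the T-squared statistic can be written out by hand, as in the sketch below. It is an illustration with hypothetical data, limited to two dimensions so that the covariance matrix can be inverted without a linear algebra library. The statistic measures how far a point lies from the reference mean, scaled by the reference covariance.

```python
from statistics import mean

def hotelling_t2_2d(reference, point):
    """T-squared distance of a 2-D point from the reference mean, scaled by covariance."""
    n = len(reference)
    mx = mean(p[0] for p in reference)
    my = mean(p[1] for p in reference)
    # Sample covariance matrix entries.
    sxx = sum((x - mx) ** 2 for x, _ in reference) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in reference) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^T S^-1 d, using the closed-form inverse of a 2x2 matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical deviations of two correlated sensor signals on known-good wafers.
reference = [(-2.0, -1.0), (-1.0, -2.0), (1.0, 2.0), (2.0, 1.0)]

with_trend = hotelling_t2_2d(reference, (1.5, 1.5))      # follows the correlation
against_trend = hotelling_t2_2d(reference, (1.5, -1.5))  # breaks the correlation
print(with_trend, against_trend)
```

Both test points sit inside the normal range of each signal taken alone, yet the one that breaks the correlation scores nine times higher. That is the kind of outlier univariate monitoring tends to miss.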

With enough data, engineers can spot patterns, trace irregularities, and ensure that a tester’s eccentricity doesn’t become a snare for the entire batch. This is where the true mastery in managing data variability becomes evident. It’s all about the interpretation of that variability and the subsequent actions taken, not just the detection.

“Various methods can be used. Simple correlation techniques, one-way ANOVA (analysis of variance) techniques, PLS (partial least squares), or LCA (load case analysis) can all be used to pinpoint a failure cause,” says Lee Wei Heng. “Which technique to use will depend on the problem statement and how the data is structured. In 95% of situations, correlating upstream metrology data to downstream test results would require some data cleansing to be done.”

Prior to modeling, engineers typically analyze visual patterns. “It’s important to begin with a very robust exploratory analysis to understand what your data actually shows, and run through some statistical comparisons before you touch any machine learning model or training,” says Xanthopoulos. “There is no substitute for doing visual comparisons, plotting data, and looking at different patterns that can be identified by comparing the independent variables of a machine learning model’s dependent variable, and seeing whether the offsets are something you care about or something you don’t care about. In other words, is the data variability just noise that can be removed through some technique? Or is it something that actually carries information that is actionable? There is no standard solution for any of that. You have to go through the data.”

AI’s capacity to parse massive data sets for patterns, errors, and data variances is a quantum leap for identifying these abnormalities. Confidence metrics become invaluable here, serving as gauges for reliability. When a model predicts a voltage, say six volts, it also can report how confident it is in that value, which tells engineers how much trust to place in the prediction. As with a weather forecast, where certainty varies, confidence metrics let engineers estimate how solid a prediction is.

“A classical approach in machine learning is to divide your data into training data, test data, and validation data,” says Ira Leventhal, vice president of Applied Research and Technology for Advantest. “The validation data is very important because that’s the data you’re using to tune the hyperparameters of your model, and to specifically make sure you’re not overfitting the model based on various biases that may be in the data set.”
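A three-way split of this kind might look like the sketch below. The fractions, the seed, and the function name are assumptions. Rows are shuffled once, then divided so that hyperparameters are tuned on the validation set and the test set is kept for the final check.

```python
import random

def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows reproducibly and split into (train, validation, test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    validation = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, validation, test

# Hypothetical data set of 100 die IDs.
die_ids = list(range(100))
train, validation, test = three_way_split(die_ids)
print(len(train), len(validation), len(test))
```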

Confidence metrics also are paving the way to ensemble modeling, which is a revolutionary methodology for addressing data variability. Rather than placing all bets on one model, ensemble modeling crowd-sources insights from a group of models, each bringing a different perspective to the table. This strategy, which is like getting multiple opinions before making a critical decision, makes AI more resilient. Divergences in these models’ predictions are investigated, not just flagged, creating a stronger basis for decisions. Through this, every prediction becomes a point of collective agreement among multiple AI models, each potentially compensating for others’ weaknesses.
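The ensemble idea can be illustrated with a toy sketch in which the spread among model predictions serves as a rough confidence proxy. The three models, the input values, and the review threshold are all hypothetical, and spread is only one of several possible confidence measures.

```python
from statistics import mean, pstdev

def ensemble_predict(models, x, max_spread):
    """Combine predictions and flag the result for review when the models disagree."""
    predictions = [model(x) for model in models]
    spread = pstdev(predictions)
    return {
        "prediction": mean(predictions),
        "spread": spread,
        "needs_review": spread > max_spread,
    }

# Three hypothetical models that agree near x = 2 and diverge elsewhere.
models = [
    lambda x: 2.0 * x,
    lambda x: x + 2.0,
    lambda x: x * x,
]

agree = ensemble_predict(models, 2.0, max_spread=1.0)
diverge = ensemble_predict(models, 5.0, max_spread=1.0)
print(agree)
print(diverge)
```

When the models diverge, the averaged prediction is still reported, but the flag routes it to an engineer for investigation rather than treating it as settled.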

“Let’s say you’re collecting data over time,” Leventhal explains. “As that data is coming out, you expect a certain type of pattern to that data. Then, for some reason, you see a break in that pattern. Maybe it’s nothing. Maybe it’s just an anomaly. Maybe it’s truly random. Or maybe there is something to it. This is where a variety of different unsupervised learning techniques, when applied to the same data you may be using supervised learning on, can tell you things that supervised learning approaches are not telling you because they are only telling you about the things that you were looking for.”

Managing data variability in semiconductor manufacturing is a multi-faceted endeavor, crucial for ensuring product quality, yield optimization, and process efficiency. Through robust data management practices, including data cleansing, normalization, outlier detection, and traceability, fabs strive to harmonize variance and extract actionable insights from vast data sets.

Still, there is room for improvement. The integration of AI/ML holds promise in augmenting traditional approaches, offering new avenues for identifying abnormalities and enhancing predictive capabilities. And as the semiconductor industry continues to push the boundaries of innovation, the mastery of data variability becomes not just a technical imperative but a strategic advantage. By embracing a holistic approach to data management and leveraging cutting-edge technologies, semiconductor manufacturers can unlock the full potential of their data, paving the way for greater insights into the intricacies of their processes.
