AI In Inspection, Metrology, And Test

AI systems are making inroads into IC manufacturing and assembly, but it’s slow going — on purpose.


AI/ML is creeping into multiple processes within the fab and packaging houses, although not necessarily for the purposes for which it was originally intended. The chip industry is just beginning to learn where AI makes sense and where it doesn’t.

In general, AI works best as a tool in the hands of someone with deep domain expertise. AI can do certain things well, particularly when it comes to pattern matching across broad data sets. In areas such as metrology, test and inspection, it’s not as effective as engineers with years of experience. But together, AI plus experienced people often produce better results than each is capable of achieving individually.

“The human eye can see things that no amount of machine learning can,” said Subodh Kulkarni, CEO of CyberOptics. “That’s where some of the sophistication is starting to happen now. Our current systems use a primitive kind of AI technology. Once you look at the image, you can see a problem. And our AI machine doesn’t see that. But then you go to the deep learning kind of algorithms, where you have very serious Ph.D.-level people programming one algorithm for a week, and they can detect all those things. But it takes them a week to program those things, which today is not practical.”

That’s beginning to change. “We’re seeing faster deep-learning algorithms that can be more easily programmed,” Kulkarni said. “But the defects also are getting harder to catch by a machine, so there is still a gap. The biggest bang for the buck is not going to come from improving cameras or projectors or any of the equipment that we use to generate optical images. It’s going to be interpreting optical images.”

That same kind of thinking is starting to be applied across other traditional manufacturing processes. But for metrology, this still requires fine-tuning of when and how to use AI/ML.

“You want to be able to control the process as it occurs,” said David Fried, vice president of computational products at Lam Research. “You want real-time active control of the process to get the right results. The only data that you see in real-time is the data that the tool is producing. If you have the right sensors to do the right control, that’s great. But if you don’t, all you have is ex-situ metrology. You’re measuring the wafer after it processes. That’s great information, but unfortunately the process is completed at that point. You can’t go back and control it.”

On the test side, AI/ML is seen as a potential way to improve coverage and efficiency. For example, FormFactor is using machine learning to match designs that are similar for testing. “Machine learning has entered into our MEMS process as well as our design process,” said Alan Liao, director of product marketing at FormFactor. “So each of those chips, and each of those customer-ordered probe cards, are custom made for their particular application or chip. Being able to optimize the probe card to meet their needs is the key to getting high yield on the wafer. We apply a large customer database for that type of application or design, and then they can tap into our library, which is designed to run a machine learning algorithm to find, for example, 10 designs that matched this one. That’s now automatically done behind the scenes with our design tool.”

Where machine learning works, and where it doesn’t
Still, the benefits of using AI/ML everywhere are not always obvious. Some of the new chips and systems being tested have an element of AI/ML in their logic, which makes it more challenging to identify consistent patterns.

“If you think about system-level testing, there’s a custom interface you have to develop to perform system-level tests, but there’s not a lot of algorithmic smarts behind that,” said Keith Schaub, vice president of technology and strategy at Advantest America. “It booted, it ran a bunch of tests on its own, and then it told us, ‘It’s good,’ or ‘It’s not good.’ If now it’s embedded with some sort of AI model, and that AI model changes over time, how do you test for that? So one of the things they want to test for is whether all the high-speed buses are working. They need to make sure the memory is working. They need to make sure the CPU is talking to the memory, and it can transmit video, and save it properly, and that all of this works while A, B, and C are happening. And they will have pre-loaded on there some specific expectations based on certain percentage accuracies that they expect. And if it passes, that AI is good. If it doesn’t, they will have full traceability through their AI vectors.”

Exactly where AI/ML fits into all of these processes, and how to best utilize it, are works in progress.

Some basic uses for machine learning now are:

  • Wafer inspection
  • Defect detection, classification, and prediction
  • Scanning electron microscope (SEM) image de-noising

There are many more in R&D, but how effective they will be for the chip industry remains to be proven over time.

“We’re trying to understand advanced correlations between some of the ex-situ metrology data generated after the process has completed, and results obtained from machine learning and AI algorithms that use data from the sensors and in-process signals,” said Lam’s Fried. “Maybe there’s no reason that the sensor data would correlate or be a good surrogate for the ex-situ metrology data. But with machine learning and AI, we can find hidden signals. We might determine that some sensor in a given chamber, which really shouldn’t have any bearing on the process results, actually is measuring the final results. We’re learning how to interpret the complex signals coming from different sensors so that we can perform real-time in-situ process control, even though on paper we don’t have a closed-form expression explaining why we’d do so.”

Defining the problem
AI and machine learning are often-misused terms, which adds another level of confusion to their adoption. There is a huge amount of hype surrounding these labels, often driven by marketing departments that exaggerate exactly what they’re including in their products.

“Everybody wants to use the phrase machine learning these days,” said Dennis Ciplickas, vice president of advanced solutions at PDF Solutions. “Some people will say machine learning, and all they’re really doing is ax = b under the hood. There’s nothing involving neural networks or AI algorithms, but they’re marketing it as machine learning.”

AI is an umbrella term covering everything that makes a system mimic some form of human intelligence. Machine learning and its subsets — neural networks, deep learning neural networks — are parts of an AI system. As the brain of the system, the machine learning model has to be trained to look at data and make a classification, decision, or recommendation — an inference. Machine learning also is dynamic. Once trained, it can continue to adjust its inferencing on its own. The machine can learn without being explicitly programmed.

“If you think about the definition of machine learning, it encompasses a lot of things that people wouldn’t necessarily call machine learning right off the top of their heads,” said Jeff David, vice president of AI solutions at PDF Solutions. “For example, linear regression. That’s machine learning. I was doing linear regression 20 years ago. I wasn’t calling it machine learning then. That’s a simple example, of course. Machine learning gets much more complicated.”

Linear regression also crosses disciplines. It is commonly used in machine learning, but it belongs to statistics, as well.
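The point is easy to see in code. Fitting a line by ordinary least squares, the "Ax = b" solve mentioned above, already amounts to training a simple predictive model. A minimal sketch in pure Python, with illustrative data:

```python
# Simple linear regression: the "Ax = b" solve that is often marketed
# as machine learning. Fitting y = slope*x + intercept by ordinary
# least squares is itself a (very simple) learned model.

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

Whether this counts as "machine learning" is precisely the marketing question Ciplickas and David raise: the mechanics are a century-old statistical fit.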

Many of the systems shown on the eBeam Initiative’s recently published list of deep learning and machine learning initiatives in semiconductor manufacturing are using deep convolutional neural networks (DCNNs). ‘Deep’ implies multiple layers. CNNs (convolutional neural networks), RNNs (recurrent neural networks), and GANs (generative adversarial networks) also show up on the list.
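‘Deep’ is less mysterious than it sounds. The sketch below (pure Python, hand-picked kernels, no training) shows only the structural idea a DCNN repeats many times over: convolve, apply a nonlinearity, feed the result to the next layer. A real network learns its kernels from labeled images instead of having them hand-picked:

```python
# A two-layer 1-D "convolutional network" with fixed kernels, just to
# illustrate that "deep" means layers stacked on layers.

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution (really cross-correlation, as in CNNs)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(xs):
    """Rectified linear unit: the standard nonlinearity between layers."""
    return [max(0.0, x) for x in xs]

# Layer 1 responds to rising edges; layer 2 smooths that response.
# In a trained DCNN these kernels would be learned, not hand-picked.
signal = [0, 0, 1, 1, 1, 0, 0]
layer1 = relu(conv1d(signal, [-1, 1]))
layer2 = relu(conv1d(layer1, [0.5, 0.5]))
```

Stacking dozens of such layers, with learned 2-D kernels over pixel grids, is what gives the SEM de-noising and defect-classification systems on the list their sensitivity.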

For example, Hitachi High-Tech Corp. is using DCNNs to enhance image quality for high-sensitivity defect detection in SEM reviews. NuFlare Technology has an SEM defect classifier that uses a DCNN. Imec improved classification for OPC metrology using a deep learning autoencoder neural network. TASMIT uses an RNN and GANs in its semiconductor wafer metrology and inspection system. And Siemens EDA uses vector-driven neural networks for layout analysis and hotspot detection in its Calibre Wafer Defect Engineering with Deep Learning, which speeds up test chip development and improves yield and reliability in the fab by detecting yield limiters.

KLA’s first process control system with AI is the eSL10 e-beam patterned wafer defect inspector. The eSL10 uses deep learning algorithms to find subtle defect signals amid pattern and process noise, according to the company.

Fig. 1: Optical critical dimension (OCD) combined with AI physics-based modeling to improve metrology performance. Source: Onto Innovation

Likewise, CyberOptics has been using machine learning for defect detection in wafer-level and advanced packaging inspection and metrology. “Not all machine learning systems are equal,” said Tim Skunes, vice president of R&D at CyberOptics. “You want your machine learning algorithm to be effective. You want to get good performance really quickly. For example, machine learning algorithms such as AI2, where you teach by showing good/defect-free images or images of defects, can improve processes and yields. The operator can quickly teach, then monitor, learn from the results, and adapt by updating the training sets if required. We design our machine learning algorithms to be biased with the goal of no escapes, so no bad product leaves the factory.”
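A classifier “biased toward no escapes” can be sketched as a decision threshold set well below the balanced point, so the system tolerates extra false alarms rather than letting a real defect ship. The scores and threshold below are illustrative, not CyberOptics’ actual algorithm:

```python
# Escape-averse classification: instead of the usual balanced 0.5
# cutoff, flag anything whose defect score clears a deliberately low
# threshold. More false alarms, but no real defect slips through.
# (Hypothetical scores; the threshold value is illustrative only.)

ESCAPE_AVERSE_THRESHOLD = 0.2   # far below the "balanced" 0.5

def classify(defect_score, threshold=ESCAPE_AVERSE_THRESHOLD):
    """Return 'defect' if the model's score clears the low threshold."""
    return "defect" if defect_score >= threshold else "good"

scores = [0.05, 0.25, 0.6, 0.9]       # model outputs for four parts
labels = [classify(s) for s in scores]
```

The flagged parts then go to review, where a human or a second, finer model separates true defects from false alarms.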

Slow going to get data and inferencing right
To avoid expensive mistakes, the semiconductor equipment industry and its customers are moving cautiously. Building training sets and getting clean data off a variety of equipment is the most important first step to get right before using machine learning on that data.

“Automation can be labor intensive. Today, machine learning algorithms still need to be trained with labeled data,” said Mark Shirey, vice president of marketing and applications at KLA. “With respect to inspection, initially, it takes an investment of time to build up classified defect libraries. However, once this is done the algorithms work well in terms of accuracy and purity. And ultimately, by producing better quality data, they reduce the time needed to identify defect sources and take corrective action. It is exciting to see today’s machine learning algorithms helping to build tomorrow’s AI chips, and we are optimistic that unsupervised machine learning applications will continue to grow throughout the semiconductor ecosystem.”

That requires an understanding of whether the machine learning is working properly, too.

“The industry uses training data and verification data sets,” said Advantest’s Schaub. “The verification data sets are used to sanity check that the ML is working properly. We will likely have to establish a continuous training and monitoring process. Processes drift, which means the data drifts, which means you need to continuously monitor your data and trigger retraining as the process drifts. We know how to do this. The challenge is to know how much to train and how often to re-train. How much drift before I trigger retraining?”

In some cases, building training data and getting clean data have to happen simultaneously in order to make progress. In effect, you need to trust the data being generated.

“The user should always be validating and monitoring what it is that’s going into making those predictions. This leads into the explainability of the model, and why that’s important,” said PDF’s David. “Once you train a model, many times — almost always — you have to understand what it is that model is using to make its decisions. After you train it, certain types of models will tell you certain types of information about what it’s using when it gets predictions. So the user has to look at that and say, ‘Oh, this makes sense.’ There’s got to be some user intuition there. That’s vitally important in a lot of stuff that we do. We actually spend a lot of time making sure that we can explain our predictions, and also tons and tons of drill downs underneath that.”

When it comes down to using models to predict where a failure will be in order to avoid further expensive, unnecessary testing, “it’s so important for our customers to believe that model,” said David. “If they don’t believe it, they’re not going to deploy it and trust it. And if you’re in a situation where you’re constantly retraining your model, it’s going to come up with new inputs that go into that model. The user is going to look at that and make sure they continually agree that that makes sense.”

Because AI-specific chips may be powering the system, continually collecting ground truth is important. “Our deep data approach is to continuously monitor the performance by tracking the timing margins on a large number of paths, and alert when they become smaller than a pre-defined threshold,” said Nir Sever, senior director of product at proteanTecs. “In addition, we continuously monitor the environment of the chip, including effects of voltage, temperature, power, and clock network integrity, and of course stress.”
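The margin-tracking idea can be sketched in a few lines: watch the timing margin reported for each monitored path and alert when any falls below a pre-defined threshold. The path names and threshold here are hypothetical, not proteanTecs’ implementation:

```python
# Timing-margin watchdog: flag any monitored path whose slack has
# degraded below a pre-defined threshold. Path names and the threshold
# value are made up for illustration.

MARGIN_THRESHOLD_PS = 50.0  # hypothetical minimum slack, in picoseconds

def margin_alerts(path_margins_ps, threshold=MARGIN_THRESHOLD_PS):
    """Return the paths whose timing margin has shrunk below threshold."""
    return sorted(path for path, margin in path_margins_ps.items()
                  if margin < threshold)

# Hypothetical in-field readings from on-chip monitors.
readings = {"cpu_to_mem": 120.0, "clk_tree_a": 42.5, "serdes_rx": 75.0}
flagged = margin_alerts(readings)
```

Tracking how each path’s margin trends over time, rather than just its instantaneous value, is what separates aging-induced degradation from environmental noise.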

Making sure the AI chip that runs the inferencing works correctly is not trivial. “Indeed, this is a real challenge and I think the industry is still struggling to find good test methodologies for these devices,” said Sever. “A lot of it is done using simulations and the assumption that if your device was manufactured correctly, it will still function correctly over time. Aging can conflict with this assumption, so you need to find ways to monitor timing degradation separately from functional validation.”

Targeting AI/ML
Still, for certain processes in the fab and assembly houses, this technology is proving to be useful.

“AI learns some things really well,” said Mike Kelly, vice president of advanced packaging development and integration at Amkor Technology. “At a very simple level, it’s like least-squares optimization. But it’s statistical. It’s only as good as the statistics and the data set used to drive it. It still looks like it will be a really great optimization tool for managing things like hotspots or phase shift in the clock — things that can be optimized on the fly.”

What isn’t clear yet is where to draw the boundaries.

“System-level design is all about understanding the performance of the individual pieces, understanding the variability of that, what envelope are they in, and making sure they all hook together in a way that’s compatible with the final goal,” PDF’s Ciplickas said. “This determines what you put in the chip, how you measure that, what’s the capability of your measurement tool, and what type of data is produced. And what’s the capability of your analytic system to process all of that to get to the final characterization result? You have to look at every individual component. What does it do? What’s its spec and its variation? And then, do they all fit together to build your big picture?”

And, of course, the more good data, the better the results.

“Often in bigger companies you have a lot of data with failures. You also see failures that occur only one or two times,” said Andy Heinig, head of the Efficient Electronics department at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “To find correlations at this point, you really could use AI. But the problem is you may find very good correlations, but often you don’t find the root cause, because AI doesn’t help you to find the root cause. You only have the correlation. This is helpful, but you need a huge amount of data. Where you have failures, and you have thousands of pictures from the same failure or from the same device, then you see some correlations.”

Help from yield management data dashboards
In the end, it all comes down to clean data. And the cleaner the data, the better.

“Realistically speaking this is one of the main reasons that things can go wrong in your entire AI system,” said David. “It’s not just the machine learning component. You have got to make sure your data is actually there.”

Dashboards for test data promise to make life easier for yield engineers and project managers if the data is clean enough. These dashboards also could provide a forced clean-data point as they ingest data from all points in manufacturing and test, making actionable machine learning possible.

Some of these involve yield and lifecycle management, which are expected to incorporate increasing amounts of machine learning technology. Synopsys has a flexible analytics dashboard tool that takes all the data from equipment involved in manufacturing and test and presents it visually. This helps because the human brain can spot patterns quickly in visual data.

The use of AI is spreading to other areas, as well. Lavorro Inc. is working on an AI/ML-driven Smart-Bot for semiconductor manufacturing, which uses PDF Solutions’ platform to fill its dashboard with data. Cimetrix Sapience is the common data collection interface and data distribution network that supports the mix of equipment on a factory floor.

Making sense of that data is still challenging, though. Some question whether data coming off semiconductor manufacturing processes will need to be condensed into models.

“Construction of the types of models described requires four things — the ability to handle high-density data, equipment know-how, data science expertise to reduce the massive amounts of data to smaller sets, and software capability to construct models which can be correlated to end of line metrics,” said Jason Shields, vice president of equipment intelligence at Lam Research. “The opportunity of big data and machine learning requires close collaboration between process equipment suppliers and device manufacturers to implement solutions to deliver these results. Process equipment suppliers provide big data management from their equipment, and the know-how to condition data and achieve data reduction. Device manufacturers provide end of line results. The software to implement the correlation is usually provided by the equipment manufacturers, as they will optimize the performance of their solution for their equipment. To date, third-party or customer-driven solutions have not been able to achieve similar levels of performance as equipment supplier models, as they lack sufficiently large equipment data sets or the know-how to effectively reduce the data.”

AI systems using machine learning techniques are being used now in inspection, metrology, and test of semiconductors. In the end, as the chip industry becomes more familiar with how to apply, adjust and maintain machine learning, these intelligent systems likely will help to speed up the inspection, metrology, and test workloads.

“Device manufacturers are increasing their investment in big data management and data science teams to develop in-line models, which assess quality for every wafer,” said Shields. “Process equipment suppliers are collaborating with device manufacturers to enable predictive equipment models that can deliver the required wafer quality at the same or lower risk and cost than traditional metrology approaches.”

Alongside of that, engineers will need to learn how best to utilize and apply this technology. That may prove to be the bigger problem.

— Anne Meixner contributed to this report.
