Dealing with a deluge of data in IC inspection, metrology, and test.
Modern inspection, metrology, and test equipment produces a flood of data during the manufacturing and testing of semiconductors. Now the question is what to do with all of that data.
Image resolutions in inspection and metrology have been improving for some time to deal with increased density and smaller features, creating a downstream effect that has largely gone unmanaged. Higher resolutions produce more image data and generate larger file sizes. Whether that data comes from inspection, metrology, or even various test processes, some of it has to be stored and made accessible, while other data can be discarded. It’s up to the end customer, the foundry, or the OSAT to determine what needs to be kept and for how long, but it’s not always clear who is the gatekeeper and who is responsible for maintaining all of that data.
Sorting out these kinds of issues is emerging as a pressing business issue across the supply chain. Data from various manufacturing processes is increasingly valuable. It’s essential for training machine learning algorithms, which will help make sense of the voluminous data generated by various types of equipment, and it can be leveraged to improve various processes and future generations of equipment. In addition, it can be used to increase yield, to improve reliability in the field, and even to make design decisions about which IP blocks, memories, or architectures will create the fewest issues during manufacturing and for specific applications.
Every step in the manufacturing flow is affected by these decisions. Just looking at the data produced through inspection and metrology, the volume has skyrocketed. “Ten years ago it was a couple hundred megabytes, and now it’s somewhere around 10 to 50 gigabytes per wafer,” said Ben Miehack, product manager for defect inspection and metrology at Onto Innovation. “That’s the level of processing, and that’s a function of increasing the resolution. It can be easily modeled. Everybody can model their data set and the number of images and/or content they’re using to provide the measurement, whether it’s inspection or metrology. It’s considerable — maybe 100 times or 1,000 times more data now.”
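The scaling Miehack describes can be modeled with simple arithmetic: raw image data grows with the square of the resolution improvement. A minimal sketch, using illustrative pixel sizes and bit depth rather than any vendor's actual numbers:

```python
# Rough estimate of raw image data produced by one inspection pass of one
# wafer. All parameters are illustrative assumptions, not vendor figures.
import math

def image_data_per_wafer_gb(wafer_diameter_mm=300, pixel_size_um=2.0,
                            coverage_fraction=1.0, bytes_per_pixel=2):
    """Return estimated raw image volume in gigabytes."""
    wafer_area_um2 = math.pi * (wafer_diameter_mm * 1000 / 2) ** 2
    pixels = wafer_area_um2 * coverage_fraction / pixel_size_um ** 2
    return pixels * bytes_per_pixel / 1e9

# Halving the pixel size quadruples the data volume.
for px in (5.0, 2.0, 1.0, 0.5):
    print(f"{px:4.1f} um pixels -> {image_data_per_wafer_gb(pixel_size_um=px):7.1f} GB/wafer")
```

The point is the quadratic sensitivity: each step-down in pixel size multiplies the per-wafer image volume by roughly four, which is how a few hundred megabytes a decade ago becomes tens of gigabytes today.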
Other stages in the manufacturing flow report similar increases. “The amount of data is increasing,” said Jens Klatenhoff, vice president and general manager for systems at FormFactor. “Our measurements usually are not really time-dependent, but our customers do want to gather as much data as possible, and this is mass production so it needs to happen quickly. The main goal is to have high-quality data. That relies on automation, which is the loader and the automated features we have within the Contact Intelligence systems.”
Advanced packaging adds even more data, although the amount varies depending on the type of package, the configuration, and the number of components in those packages. Multiple chiplet layers add inspection steps, and with them cost. “I don’t think it doubles,” said Miehack. “Maybe you could say it doubles or triples based on the number of inspection layers, and what you’re collecting. But from front end to back end, I don’t think that changes too much. You might not have nearly as much data because you’re not quite at that resolution. In a wafer-level package versus a fan-out chip-scale package, that’s going to be different, but not considerably. The I/O densities may be a direct driver there, more than scaling or the dimensional aspect.”
There are other variables here, as well. Some of this is based on how much can be done in one pass. But single-pass inspection, metrology, and test become increasingly difficult at each new process node and in densely-packed packages, and the amount of data collected can vary greatly by market segment. In a consumer device, there may be less value in testing every die or wafer than for safety- or mission-critical applications, where failures can have disastrous consequences.
“When you do multiple passes, you clearly are sacrificing throughput,” said Subodh Kulkarni, CEO of CyberOptics. “So when we see data that looks marginal to our AI technology, if we have time and a part is still there, we quickly go back and collect high-resolution data while the rest of the data is still being churned. We already take advantage of any slack time in our existing technology. And sophisticated advanced packaging customers are okay with sacrificing the throughput to achieve reliability if we can. That’s where multiple passes make sense.”
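Kulkarni's use of slack time for selective high-resolution re-scans amounts to a simple scheduling decision. A minimal sketch, with hypothetical confidence thresholds and timing values that are not CyberOptics parameters:

```python
# Decide which regions to re-scan at high resolution, based on the
# inspection model's confidence and the slack time left in the cycle.
# Thresholds and timings are hypothetical, for illustration only.

def schedule_rescans(regions, slack_s, rescan_cost_s=0.8,
                     low_conf=0.4, high_conf=0.9):
    """regions: list of (region_id, defect_confidence) from the fast pass."""
    # Marginal calls are those the model can neither accept nor reject.
    marginal = [r for r in regions if low_conf <= r[1] <= high_conf]
    # Re-scan the most ambiguous regions first (closest to 0.5),
    # until the available slack time is used up.
    marginal.sort(key=lambda r: abs(r[1] - 0.5))
    budget = int(slack_s // rescan_cost_s)
    return [region_id for region_id, _ in marginal[:budget]]

fast_pass = [("die_03", 0.97), ("die_07", 0.55), ("die_11", 0.48),
             ("die_12", 0.12), ("die_19", 0.72)]
print(schedule_rescans(fast_pass, slack_s=2.0))  # -> ['die_11', 'die_07']
```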
But going back for extra passes also adds to the data collected beyond what a single quick scan would produce. A single scan at higher resolution increases the volume of data as well, particularly when it captures multiple dimensions.
“One of the flexibilities about white light is not only do you get the Z [axis] data, but you also get the intensity data,” said Robert Cid, product manager at Bruker Nano Surfaces. “For an application like overlay, where we have two different layers, then what we’re able to do is not only see what the differences are from the top and bottom layer, but we also can judge what the registration is for that overlay. With that one scan, we’re able to get a lot of data. That’s very important for the process engineer to make a determination as to what changes are needed for them to really get very good overlay.”
Fig. 1: Nanometer-level metrology. Source: Bruker
Results depend on the inspection, metrology, and test tools used. Different tools have different strengths and weaknesses. For some tools, such as a white light profilometer, repeatability is critical.
“Because it’s used as a very repeatable means of doing measurements in a process environment, the customers are definitely interested in any variability that may come out from wafer to wafer, or from location to location, and here it’s mostly statistics that we’re looking at,” said Cid. “The data of the actual 3D images and information is captured, and if there is further investigation needed, that data is accessible to the customer. What they do with it would really depend on what they’re looking for. But that data is available, and there is a lot of data that could be captured within an environment. They’re basically looking at statistics and trying to understand variability from location to location, from wafer to wafer.”
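The statistics Cid describes, variability from location to location and from wafer to wafer, reduce to summary metrics over the measurement sites. A minimal sketch using fabricated step-height numbers rather than real profilometer output:

```python
# Summarize within-wafer (site-to-site) and wafer-to-wafer variability
# for a repeated metrology measurement. Values below are fabricated.
import statistics

# measurements[wafer_id] = step heights in nm at each measured site
measurements = {
    "W01": [101.2, 100.8, 101.5, 100.9, 101.1],
    "W02": [100.4, 100.6, 100.2, 100.7, 100.5],
    "W03": [102.0, 101.7, 101.9, 102.2, 101.8],
}

wafer_means = {}
for wafer, sites in measurements.items():
    mean = statistics.mean(sites)
    sigma = statistics.stdev(sites)
    wafer_means[wafer] = mean
    print(f"{wafer}: mean={mean:6.2f} nm  site-to-site sigma={sigma:4.2f} nm")

w2w = statistics.stdev(wafer_means.values())
print(f"wafer-to-wafer sigma of means: {w2w:4.2f} nm")
```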
What can you do with the data?
Collecting data from the fab is essential for developing algorithms to feed into AI/ML systems, so they can find defects faster and more cost-effectively. The process of setting up an AI system for a fab, however, is very hands-on. It’s not possible for the AI system to make all the decisions without verifying at least part of its output.
Setting up an AI system to improve yield requires collecting data from the inspection and metrology equipment, cleaning and structuring that data, training the system, and then checking the training against ground-truth data from the traditional process. In effect, bringing up an AI system in a fab or assembly house is a process of checking its outcomes against the data from the established flow.
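That checking step, comparing the AI system's calls against ground truth from the established flow, is essentially a holdout evaluation. A minimal sketch of the comparison, with fabricated labels standing in for real inspection and review results:

```python
# Compare an AI defect classifier's predictions against ground-truth
# dispositions from the traditional inspection/review flow.
# Labels and predictions below are fabricated for illustration.

def confusion_counts(ground_truth, predictions):
    """ground_truth/predictions: lists of 0 (good) or 1 (defect)."""
    tp = sum(1 for g, p in zip(ground_truth, predictions) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(ground_truth, predictions) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(ground_truth, predictions) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(ground_truth, predictions) if g == 0 and p == 0)
    return tp, fp, fn, tn

ground_truth = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]   # from manual review
predictions  = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]   # from the AI system
tp, fp, fn, tn = confusion_counts(ground_truth, predictions)
capture_rate = tp / (tp + fn)        # defects the AI actually caught
false_alarm  = fp / (fp + tn)        # good sites flagged as defective
print(f"capture rate {capture_rate:.0%}, false-alarm rate {false_alarm:.0%}")
```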
Once the AI system is working, you still have to be sure the data is solid for the AI system to predict flaws and act on predictions. “You should constantly be validating your system,” said Jeff David, vice president of AI Solutions at PDF Solutions.
No system can be trusted to do everything itself. “Data integrity is one of the main reasons that things can go wrong in your AI system,” said David. “You have to make sure your data is actually there, that it’s not corrupt, and that you don’t have big holes in your data. This is a huge problem a lot of people deal with, and so you should be constantly monitoring. Some customer sites really keep good tabs on what percent of their data is coming in and arriving at the system at the time it needs to make the prediction work. You should always expect some amount of data that’s missing. The question is, what’s palatable in terms of the amount of data that can be missing? And what do you do when it is missing?”
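The monitoring David describes, tracking what fraction of expected data actually arrives and deciding when the gap is too large to trust a prediction, can be expressed as a simple completeness check. A minimal sketch with an arbitrary threshold, not a PDF Solutions policy:

```python
# Check data completeness before letting a model make a prediction.
# The 95% threshold is an arbitrary illustration, not a recommended value.

def completeness_report(expected_keys, received_records, min_fraction=0.95):
    """expected_keys: IDs we expect (e.g. die or test-step IDs).
    received_records: dict of ID -> value actually received (None = corrupt)."""
    missing = [k for k in expected_keys if k not in received_records]
    corrupt = [k for k, v in received_records.items() if v is None]
    usable = len(expected_keys) - len(missing) - len(corrupt)
    fraction = usable / len(expected_keys)
    return {"fraction_usable": fraction, "missing": missing,
            "corrupt": corrupt, "prediction_allowed": fraction >= min_fraction}

expected = [f"die_{i:02d}" for i in range(10)]
received = {f"die_{i:02d}": 1.0 for i in range(9)}   # die_09 never arrived
received["die_04"] = None                            # arrived but corrupt
print(completeness_report(expected, received))
```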
Using data more effectively
Equipment vendors and their customers also are looking at what else that data can be used for.
“We’re developing techniques that allow us to utilize data in the fab more for predicting reliability failures in the field,” said Chet Lenox, senior director for industry and customer collaboration at KLA. “In the past, inspection was used for process control only in the fab. So it’s controlling excursions, figuring out what your defect paretos look like, driving down defects, and keeping CDs in control so parametrics at the end of line are good. The final arbiter was sort yield in final package test. What we’re seeing now is those things don’t catch latent reliability failures, so a part can pass and a defect mode can activate later. That’s one hole. Another hole is that you have huge test coverage gaps.”
The key is combining and applying all the data gathered in-line so it can be applied throughout the lifetime of a device to fill in those gaps. “It’s not enough to say if a part is good or bad,” Lenox said. “It’s whether it will remain good or bad throughout its lifecycle, which in the case of a car may be 15 or 20 years.”
Silicon lifecycle management is emerging as a significant focus for multiple vertical industries, such as automotive, industrial, and medical, as well as for data centers.
“To be able to predict changes in reliability and in performance, you need to have a baseline understanding of electronics and semiconductors, and be able to extrapolate and update those extrapolations with ongoing data so you can maintain an accurate prediction of when things are going to start failing,” said Steve Pateras, senior director of marketing for hardware analytics and test at Synopsys. “The ultimate goal is for the data analytics to be automated. Today you’ll see some human intervention, but with reaction times in microseconds and milliseconds, that has to be automated.”
This requires shifting the starting point for collecting and analyzing the data into the design cycle, and then connecting that with the output from various types of equipment. “It’s actually starting at design and going through manufacturing,” said Pateras. “Ultimately you want to do this in-field, but there’s a lot of value that can be added early on. You start amassing data with the understanding that it can be used in the later stages. As you get to traceability, you can make correlations and develop baselines for prediction if you have the design and manufacturing data analyzed.”
Part of that involves planning for how to collect the data and access it at the appropriate time, which is a challenging coordination effort across multiple manufacturing processes. “For manufacturing test, we really have to think about how the data flows from the tester to the target core of the design, because in current designs we have complex hierarchical structures with cores inside of cores, and getting access set up correctly is a challenge,” said Geir Eide, director of DFT product management at Siemens Digital Industries Software. “But getting large amounts of data into the chip and around the chip for test is not just a manufacturing test problem. It’s an in-system test problem, as well.”
This is non-trivial. “If you want to make good choices, you want to take into account things like how much test data you need per core,” said Eide. “So you may want to allocate more test resources to the bigger cores and kind of balance things out. One of the challenges is you have to make these decisions at the same time, but very often you don’t have all the data you need to make those decisions until after you’ve generated patterns, and then it’s too late. So there’s a tradeoff between an efficient implementation, where you do this once and it’s done, and test time, where you might go back and change things as you get more data.”
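Eide's tradeoff can be illustrated with a toy allocation: split a fixed test-data bandwidth budget across cores in proportion to their estimated pattern volume, then revisit once real pattern counts are known. A minimal sketch with hypothetical core sizes; it is not how any DFT tool actually balances resources:

```python
# Toy proportional allocation of a fixed number of test channels across
# cores, based on estimated pattern data volume. Numbers are hypothetical.

def allocate_channels(cores, total_channels):
    """cores: dict of core name -> estimated pattern data volume (e.g. Mbits)."""
    total_size = sum(cores.values())

    def deficit(name, alloc):
        # How far the core is below its proportional share of the budget.
        return total_channels * cores[name] / total_size - alloc[name]

    # Start with one channel per core so small cores are never starved,
    # then hand out the rest one at a time to the neediest core.
    alloc = {name: 1 for name in cores}
    for _ in range(total_channels - len(cores)):
        neediest = max(cores, key=lambda n: deficit(n, alloc))
        alloc[neediest] += 1
    return alloc

estimated_mbits = {"cpu_cluster": 900, "gpu": 600, "modem": 300, "always_on": 50}
print(allocate_channels(estimated_mbits, total_channels=16))
# -> e.g. {'cpu_cluster': 8, 'gpu': 5, 'modem': 2, 'always_on': 1}
```

In practice, as Eide notes, the estimates used for this kind of split often aren't firm until after patterns are generated, which is why the allocation may need a second pass.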
Another piece of the puzzle is in-circuit data. As with any system, that data needs to be accessible so it can be monitored. This becomes particularly important in advanced packaging and at advanced process nodes, due to density and smaller features. But there is more data to collect and analyze, and someone needs to determine how much data to move versus how much to process locally, and which data to leverage for different steps in a domain-specific process.
Consider traditional oven testing, for example, which is used to determine how a device will behave under controlled thermal conditions.
“You can model the expected degradation in the oven over time, but now you also get actual measurements from the test,” said Uzi Baruch, chief strategy officer at proteanTecs. “That way you can compare what you thought the outcome would be, versus what you’re actually seeing from within the chip. Normally, you would have to run many chips in the oven before you could understand which one would fail and under what circumstances. With on-die monitors, like our Agents, you can see the actual behavior and where the thresholds are. That gives you parametric visibility into degradation trends, not just pass-fail, and much earlier.”
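Comparing a modeled degradation curve against what the on-die monitors actually report reduces to flagging parts whose measured drift departs from expectation. A minimal sketch with a made-up linear drift model and fabricated readings; it does not represent proteanTecs' Agents or their analytics:

```python
# Compare expected parametric drift during an oven/burn-in run against
# measured drift reported by on-die monitors. Model and readings are
# fabricated for illustration.

def expected_drift_mv(hours, rate_mv_per_hour=0.05):
    """Toy linear degradation model for a monitored voltage margin."""
    return rate_mv_per_hour * hours

def flag_outliers(readings, hours, tolerance_mv=1.0):
    """readings: dict of unit ID -> measured margin loss (mV) after `hours`."""
    expected = expected_drift_mv(hours)
    return {unit: drift for unit, drift in readings.items()
            if drift - expected > tolerance_mv}

after_168h = {"U01": 8.6, "U02": 9.1, "U03": 14.3, "U04": 8.9}  # fabricated
print("expected drift:", expected_drift_mv(168), "mV")
print("degrading faster than the model:", flag_outliers(after_168h, 168))
```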
Siemens and Synopsys both have had on-chip monitoring capabilities since acquiring UltraSoC and Moortec, respectively. National Instruments bought OptimalPlus for similar reasons. PDF Solutions has monitors that are used in the fab to view the process with greater granularity, and Advantest has made a strategic investment in PDF Solutions to leverage that capability in its test equipment. These approaches can quickly identify problems in early silicon using in-circuit data, but they also can be used to identify future issues, such as circuit aging, unusual electrical activity due to cyberattacks, and latent defects that can cause problems months or years after manufacturing.
What data to keep, and who should keep it
Who ultimately owns all of this data is another issue that needs to be fully worked out, and it has business implications because there are costs for storing and maintaining data. And those costs are growing as chip complexity continues to increase.
“In the case of fan-out processors, it’s overlay-related issues,” said Samuel Lesko, senior applications development manager at Bruker. “You’re sampling a certain number of die over the wafers, but it’s now in the magnitude of 10 or 20 dies. Once the layer is done, the next layer has nothing to do with the rest, so it’s like first-in, first-out data. But you may have to stop the process because the next layer won’t help or won’t be successful.”
Fig. 2: Metrology in advanced packaging. Source: Bruker
All of that data is important in the case of a field failure, but in a complex supply chain some may benefit more than others. The key here is to narrow down what’s essential to keep, and to verify the accuracy of that data.
“A customer would archive data for a specific chiplet where we saw extraordinary offsets or misalignments between two layers,” said Lesko. “But rather than getting pressure on data, we see pressure on the absolute accuracy, tool-to-tool matching, and more frequent and regular quality control that’s assessed against a real baseline, so we are sure whatever data we output on the process wafer, we know the value exactly. It’s all about accuracy and long-term reproducibility.”
Related
AI In Inspection, Metrology, And Test
AI systems are making inroads into IC manufacturing and assembly, but it’s slow going — on purpose.
Hunting For Open Defects In Advanced Packages
No single screening method will show all the possible defects that create opens.
Cloud Vs. On-Premise Analytics
Not all data analytics will move to the cloud, but the very thought of it represents a radical change.
Data Issues Mount In Chip Manufacturing
Master data practices enable product engineers and factory IT engineers to deal with a variety of data types and quality.