
Using Sensor Data To Improve Yield And Uptime

Deeper understanding of equipment behavior and market needs will have broad impact across the semiconductor supply chain.


Semiconductor equipment vendors are starting to add more sensors into their tools in an effort to improve fab uptime and wafer yield, and to reduce cost of ownership and chip failures.

Massive amounts of data gleaned from those tools are expected to provide far more detail than in the past about multiple types and sources of variation, including when and where that variation occurred, and how, when and why equipment failures occur. Combined with data about device failures in the field, along with such things as design layout and verification, it’s becoming possible to create a detailed timeline of how chips were designed and manufactured, and what went wrong along the way. That, in turn, can be used to improve quality, identify potential sources of defects, and make processes more efficient.

“We have the opportunity to use data from our own laboratories, and in cooperation with our customers, to combine our data with their data to make our tools more reliable and to make the dream of predictive maintenance in semiconductor capital equipment a reality,” said Rick Gottscho, CTO of Lam Research. “It’s been promised and talked about for decades, and you see it happening in other industries. There are reasons why in our industry it may be more of a struggle, but it will happen.”
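In code terms, the simplest version of the predictive maintenance idea is trend extrapolation: watch a slowly drifting tool health metric and estimate when it will cross a service threshold. The sketch below is a bare-bones illustration of that concept; the metric, units and numbers are entirely hypothetical.

```python
# Bare-bones predictive-maintenance sketch: extrapolate a drifting health
# metric to estimate when it will cross a service threshold. The metric
# (readings as (hour, value) pairs) and all numbers are hypothetical.
def hours_until_threshold(readings, threshold):
    """Estimate hours until `threshold` is crossed, using the most
    recent drift rate between the last two readings."""
    (t0, v0), (t1, v1) = readings[-2], readings[-1]
    rate = (v1 - v0) / (t1 - t0)
    return (threshold - v1) / rate

drift = [(0, 1.0), (100, 1.2), (200, 1.4), (300, 1.6)]   # units arbitrary
lead_time = hours_until_threshold(drift, 2.0)
```

A real system would fit over many noisy channels at once and account for step changes after maintenance events, but the payoff is the same: a lead time number that lets service be scheduled before the tool drifts out of spec.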

This amounts to a seismic shift in semiconductor manufacturing. Rather than just inspecting or testing die, for example, all of the data from millions of die can be collected and mined for patterns and aberrations. But it also requires a rethinking of how and when various operations are done and why they sometimes result in defects, which can affect long-standing ways of doing things. And while there is a potential windfall on the business side, it could cause some upheaval to get there.

“The way that equipment is operated in the fab, and the way that maintenance is done in the fab, will get disrupted,” Gottscho said. “Customers clearly will put more emphasis on what a company’s data offerings are, what their data strategy is, and how it aligns with their own strategy. This is a big challenge for the industry. You keep hearing that data is everything, and whoever owns the data has a big advantage. That’s somewhat simplistic, but people take that literally so there is a tendency to try to hoard data. That is counterproductive for everyone. We have to find ways to share data in such a way that companies can preserve their competitive advantage. The data are very valuable, but what’s just as valuable is the domain knowledge.  The challenge is conditioning the data and knowing how to filter it, massage it and transform it into something that’s useful for a given application, and that’s all about the domain knowledge.”

So while the upside of gathering data from more places on a continuous basis is significant, it may take some time before the full benefits are realized.

“The equipment has sensors that are analyzing data from operations of the tool and monitoring of the wafer process,” said David Fried, CTO at Coventor. “For instance, sensors and data logs are picking up information about which wafer went to which chamber, where the robot arm is at any point in time, etc. All that data has to go into a system where it can be harvested and analyzed in real-time. And that’s just from one piece of equipment. In a fab you have fleets of that kind of equipment, and then you have all sorts of other equipment and different processes. This is a massive big-data challenge, and what you really want to start doing is learning on that data.”
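A minimal sketch of the kind of unified record Fried describes, where every tool log entry, whatever its native format, lands in one schema so fleet-wide queries become possible. The field names and log format here are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical unified event record: every tool log entry is mapped onto
# one schema so queries can span a whole fleet of equipment.
@dataclass(frozen=True)
class ToolEvent:
    timestamp: float          # epoch seconds
    tool_id: str              # e.g. "etch-07"
    wafer_id: Optional[str]   # None for tool-state events with no wafer
    channel: str              # e.g. "chamber_load", "robot_position"
    value: str

def parse_transfer_log(line: str) -> ToolEvent:
    """Parse one line of a made-up robot transfer log:
    '1718000000.5,etch-07,W123,chamber_load,ch2'"""
    ts, tool, wafer, channel, value = line.strip().split(",")
    return ToolEvent(float(ts), tool, wafer or None, channel, value)

events = [parse_transfer_log("1718000000.5,etch-07,W123,chamber_load,ch2"),
          parse_transfer_log("1718000031.2,etch-07,W124,chamber_load,ch3")]

# Which chamber did wafer W123 go to?
w123_chambers = [e.value for e in events
                 if e.wafer_id == "W123" and e.channel == "chamber_load"]
```

The hard part in practice is not the record shape but the volume: doing this for every sensor on every tool in real time is the "massive big-data challenge" Fried refers to.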

One of the main targets for this kind of analysis is variation, which is an increasingly problematic issue throughout the supply chain and one that often is hard to pinpoint. Variation always has existed in semiconductor manufacturing, but understanding its sources is becoming more important as tolerances tighten at advanced nodes and in advanced packaging, and as chips increasingly are used in safety-critical markets such as automotive, medical and industrial. The variation can involve everything from which EUV scanner is used to the purity of materials in thin films, or it can be a completely random error. Results also may vary depending upon how and where manufacturing occurs, and how good the controls are across the materials supply chain used to develop those chips.

“All process tools add to the total variation,” said Regina Freed, managing director of semiconductor technology and strategy at Applied Materials. “Key components are critical dimension uniformity (CDU), overlay and line-edge roughness (LER) from lithography, as well as CDU and loading from deposition and etch. Etch and material modification techniques can be used to reduce LER, which is very helpful for customers that move to EUV. But at 3nm this will not be enough to yield. All these variations will translate into an edge-placement error problem. Because we are dealing with both systematic and non-systematic variations, accounting for this in design for manufacturing (DFM) will not be sufficient. That is why we are working very closely with our customers to enable new and innovative patterning schemes, with the goal to eliminate EPE errors using materials engineering.”

These issues are particularly difficult to track down without enough data to establish a tight distribution and explain what causes the outlier data points.

“One of the larger obstacles to zero-defect success is the so-called latent defect,” said Rob Cappel, senior director at KLA. “These defects may be of a size or location that does not initially kill the die, or they may lie in an untested area of the die, which is an increasing problem with complex SoCs. As a result, the at-risk die passes electrical test and escapes into the supply chain. The demanding automotive environment of high heat, humidity and vibration can sometimes activate these latent defects, causing a premature failure. The industry has long relied on electrical testing as the method to cull bad die, but latent defects pass electrical testing, so other methods are required to stop escapes near the source where costs are lower. Industry estimates have the cost of an escape increasing 10X for every layer of integration it passes, creating a strong push to find the underlying latent defects in the fab. Variability is a large source of latent reliability defects – especially any type of lithography patterning related variability, such as CD, overlay, line-edge roughness, and localized lithography variability. These sources of variation can and do cause partial voids or bridges, which then can break down in the extreme automotive operating environment. The same can be said of any patterning issues with etch (partial etch) and CMP (CMP dishing).”
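Cappel’s 10X rule of thumb is easy to make concrete. A short illustration, where the stage names and the unit base cost are hypothetical:

```python
# Illustrative only: the 10X-per-integration-level cost escalation quoted
# above, with made-up stage names and a unit cost at the source.
STAGES = ["wafer_test", "package_test", "board_test", "system_test", "field"]

def escape_cost(caught_at: str, base_cost: float = 1.0,
                factor: float = 10.0) -> float:
    """Relative cost of a defect caught at `caught_at`, versus at the source."""
    return base_cost * factor ** STAGES.index(caught_at)

# A latent defect that only fails in the field costs four orders of
# magnitude more than one screened out at wafer test.
field_cost = escape_cost("field")
```

Under this rule, even an expensive extra inline inspection step can pay for itself if it stops a handful of escapes from reaching the field.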

The growing value of shared data
Using more sensor data from tools within the fab can make a big difference for many of these issues, but that data also needs to be put into the context of other data, such as chip layout and structures, design and materials. In effect, it has to include a much wider swath of the supply chain. While processes may be mature enough for some devices, new process nodes and newer technologies, such as printed electronics, require more and different kinds of data than has been available in the past. In effect, the industry isn’t standing still, and new tools and data sources need to be developed for everything.

“With metal-composite structures, which may be inks, the failure is often more mechanical,” said Ryan Giedd, director of device engineering and product development at Brewer Science. “The worst problem is long-term drift. There are two parts to that. One involves quality control. Basically, you need to test every sensor the same way at the same time. The second part is understanding how devices perform in the field. Most people we work with are very collaborative because they want to make sure everything works. But we also want to figure out how to make it better, and sometimes that requires us to go back further than anyone wants to address drift issues. Drift can be caused by everything from impurities in the environment to the substrate or barrier layers.”

Data from all of these steps needs to be put in context, as well. Things need to break to understand how and why they break.

“We start with non-destructive testing because we have X-rays, so we can look through modules and components to see if what we believe went wrong did go wrong,” said Gert Jørgensen, senior vice president of sales and marketing for Delta’s ASIC Division. “And then we try to do destructive testing of assembly modules or components to find the root cause. It’s normally different kinds of failures. It can be electronic, electrostatic discharge, lightning, too much voltage—the different failure mechanisms that we’ve seen with electronic components.”

Toward predictive analytics
In the past, data about manufacturing and field failures was in short supply for two main reasons. First, some of the large data mining companies scoffed at the amount of data generated by the semiconductor industry, particularly in comparison to data mining in the cloud. And second, fabs were highly secretive about the data they gleaned from their operations for competitive reasons.

Much has changed in the past couple of years, and those changes are continuing to evolve. For one thing, fewer foundries are competing at the leading edge nodes, and their processes are now significantly different. In addition, foundries have recognized that to get high-volume chips out the door, they need to work more closely with their EDA and IP vendors and their top customers. So data has begun flowing far more freely than in the past, and that data stream is going to increase as more sensors are added into tools and even into the chips themselves.

“It’s not just a quality issue,” said Michael Schuldenfrei, corporate technology fellow at Optimal Plus. “It’s also a manufacturing cost issue. You can improve yields and reduce scrap. In automotive, we’re using AI to do inspection. But we’re also using data to predict failures in electronics assembly, particularly around welding.”

The ultimate goal here is predictive analytics, where tools and processes can be adjusted before they cause problems. But that adds another complexity into the mix. While it’s good to have more data, not all data is good.

“If you don’t have great data, it’s difficult to do predictive analytics,” said David Park, vice president of marketing at PDF Solutions. “It’s not just this industry. With batteries, you may see spikes before you have failures. You may have 50 cycles after that where there aren’t failures, which may tell you that you now have one month. That’s great for a fleet of vehicles because they’re never out of service. But if you don’t have good data, you do need good learning models. We saw one big semiconductor company where the root cause analysis seemed to be a random error. If you can do a multivariate analysis, though, you can find things a human will never find and prevent failures. What this allows you to do is find a commonality you would never find otherwise because there is too much data. And if you search for this, you may find random failures 8 out of 15 times.”
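The point about multivariate analysis is that a die can be a joint outlier even when every individual parameter looks acceptable. A toy sketch of one common screen, a diagonal-covariance Mahalanobis distance (sum of squared z-scores), with made-up test data:

```python
import statistics

# Toy multivariate screen: each row is one die's test parameters. A die can
# be an outlier jointly even when no single parameter is extreme. This uses
# a diagonal-covariance Mahalanobis distance (sum of squared z-scores);
# a production system would also model correlations between parameters.
def joint_outliers(rows, threshold=9.0):
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stdevs = [statistics.stdev(c) for c in cols]
    flagged = []
    for i, row in enumerate(rows):
        d2 = sum(((x - m) / s) ** 2
                 for x, m, s in zip(row, means, stdevs))
        if d2 > threshold:
            flagged.append(i)
    return flagged

rows = [
    (1.00, 2.00), (1.01, 1.99), (0.99, 2.01), (1.02, 1.98), (0.98, 2.02),
    (1.00, 2.01), (1.01, 2.00), (0.99, 1.99),
    (1.10, 2.10),   # mildly off in *both* parameters: a joint outlier
]
suspects = joint_outliers(rows)
```

The last die sits about 2.5 sigma out on each axis, below a typical single-parameter cutoff, yet its combined distance flags it, which is exactly the "commonality you would never find otherwise."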

This is where AI is starting to be leveraged, because it can quickly identify patterns from data that can be trained for specific purposes.

“We’re building AI into our equipment to better handle the billions of bits of data,” said Sanjay Natarajan, corporate vice president at Applied Materials. “Tools have sensors and actuators collecting data and reacting to data, and in most systems that happens algorithmically. Emerging AI capabilities will eventually enable an inferential approach where that data can be used to train and infer, so the tool will adjust accordingly based on the field of data detected.”

That provides visibility into patterns that have emerged before.

“You also can see things coming better than with an algorithmic approach,” said Natarajan. “The sheer volume of training data we can get is incredible. The tools generate this data day in and day out, collecting inputs and outputs that matter, and there’s really no need to understand the dynamics of going from input to output because the data is added into a training engine. Then, this can potentially lead to that information being used in the feed-forward mode to control thickness or the phase of a material, or when to stop the tool. This allows controlling the tool as well as the quality and variation on the wafer.”
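A minimal sketch of the train-then-feed-forward idea Natarajan describes: learn the mapping from a tool input to an output from historical runs, then invert it to pick the setpoint for a target. Here the input is deposition time and the output film thickness, and all numbers are illustrative, not real process data:

```python
# Minimal feed-forward sketch: fit deposition time (s) -> film thickness (nm)
# from historical runs by least squares, then invert the fitted model to
# choose the setpoint for a target thickness. Numbers are illustrative.
history = [(10.0, 52.0), (20.0, 101.0), (30.0, 152.0), (40.0, 199.0)]

n = len(history)
mean_t = sum(t for t, _ in history) / n
mean_h = sum(h for _, h in history) / n
slope = (sum((t - mean_t) * (h - mean_h) for t, h in history)
         / sum((t - mean_t) ** 2 for t, _ in history))
intercept = mean_h - slope * mean_t

def time_for_thickness(target_nm: float) -> float:
    """Feed-forward: the deposition time predicted to hit the target."""
    return (target_nm - intercept) / slope
```

The learned-model version replaces this single linear fit with a network trained on the full field of sensor data, but the control loop, predict the output, then choose the input, is the same shape.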

That kind of approach also can be used to customize processes for smaller batches and more targeted applications, said Ira Leventhal, vice president of Advantest’s new concept product initiative. “Standard mathematical optimizations result in sub-optimal scheduling,” he said, adding that deep reinforcement learning-based scheduling will be required in the future to utilize this data.

Fig. 1: Deep reinforcement learning. Source: Advantest
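The reinforcement-learning-for-scheduling idea in Fig. 1 can be shrunk to a toy: tabular Q-learning on a three-job dispatch problem, where deep RL would replace the table with a network and a far richer state. The objective is total completion time, for which the learned policy should recover shortest-job-first (provably optimal here). Job durations and hyperparameters are made up:

```python
import random

# Toy tabular Q-learning scheduler: state = set of jobs still waiting,
# action = which job to run next, reward = minus that job's completion time.
# Minimizing total completion time means the policy should learn to
# dispatch the shortest job first.
random.seed(0)
durations = [3.0, 1.0, 2.0]            # hypothetical per-job process times
ALL = frozenset(range(len(durations)))
Q = {}                                  # Q[(remaining, job)] -> value

def qval(s, a):
    return Q.get((s, a), 0.0)

alpha, epsilon = 0.3, 0.5
for _ in range(2000):
    remaining = ALL
    while remaining:
        elapsed = sum(durations[j] for j in ALL - remaining)
        acts = sorted(remaining)
        a = (random.choice(acts) if random.random() < epsilon
             else max(acts, key=lambda j: qval(remaining, j)))
        reward = -(elapsed + durations[a])    # this job's completion time
        nxt = remaining - {a}
        best_next = max((qval(nxt, j) for j in nxt), default=0.0)
        Q[(remaining, a)] = qval(remaining, a) + alpha * (
            reward + best_next - qval(remaining, a))
        remaining = nxt

# Greedy rollout of the learned policy: should be shortest-job-first.
order, remaining = [], ALL
while remaining:
    a = max(sorted(remaining), key=lambda j: qval(remaining, j))
    order.append(a)
    remaining = remaining - {a}
```

The appeal of the learned approach in a fab is that the real state (queue mix, tool health, maintenance windows) is far too large for a table or a closed-form optimum, which is where Leventhal's argument for deep reinforcement learning comes in.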

More granular, more integrated
The key here is understanding how data applies to a specific slice of manufacturing, which may involve one step in a multi-step operation, and how that impacts other steps. That kind of cross-tool, cross-process data is more of a statement of direction than reality today for a variety of reasons. For one thing, it needs to draw data from new devices that are just being tested in the field. So for the AI system in a driverless car, there is insufficient real-world data that goes beyond today’s simulations and various types of one-off testing approaches.

Going forward, the challenge will be understanding data from devices in operation, particularly with various types of on-chip monitoring of signals and data under a range of use cases, and then looping that data back throughout the supply chain. And all of that data needs to be pared down and understood in the context of how it is being used, which adds yet another level of complexity to all of this.

“In the chip context, there is vastly too much data,” said Rupert Baines, CEO of UltraSoC. “The key is having low-cost smarts and low-cost filtering in order to dramatically reduce the data volumes. If you were to just do dumb sampling of signals with a 2GHz clock and a 64-bit bus, you’re up at 100 gigabits per second for one single trace. So you’re talking terabytes or petabytes very quickly. It’s absolutely essential to have intelligent, local filtering in order to turn screeds of data into high-value, intelligent signals.”
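Baines’ arithmetic checks out, and it is worth seeing alongside the effect of even naive local filtering. The figures below are order-of-magnitude illustrations, and the threshold filter is a crude stand-in for real on-chip smarts:

```python
# Sanity check of the trace data-rate arithmetic quoted above: a 64-bit bus
# sampled at a 2GHz clock is 128 Gb/s for one trace, in the same ballpark
# as the "100 gigabits per second" figure.
clock_hz = 2e9
bus_bits = 64
raw_bps = clock_hz * bus_bits          # bits/second for one full trace
raw_bytes_per_sec = raw_bps / 8        # 16 GB/s

def event_filter(samples, lo, hi):
    """Keep only samples outside the expected band: a crude stand-in for
    the intelligent, local filtering described above."""
    return [(i, v) for i, v in enumerate(samples) if not lo <= v <= hi]

samples = [0.5, 0.51, 0.49, 0.93, 0.5, 0.52, 0.07, 0.5]
anomalies = event_filter(samples, 0.4, 0.6)   # only the excursions survive
```

Even this dumb band filter turns eight samples into two events; real monitors apply protocol-aware triggering and compression on top, which is how "screeds of data" become tractable signals.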

And this is as true on the manufacturing side as it is on the chip level.

“We can harvest every bit that comes off every tool, but then you also want to integrate that with every piece of inline metrology, inline defect inspection classification, inline electrical test, all the way down to full functional test,” said Coventor’s Fried. “So you have all these different data sources. The first class of problem is a big data problem—getting it all into a structure and format that’s usable because we are dealing with a massive amount of data from a massive number of sources with different formats. Solving the format problem doesn’t sound that difficult, but think about a temperature sensor in a deposition tool versus a slurry pH monitor in a slurry tank feeding a CMP tool. It’s a different type of data sampled in a different way using a different set of units. Just putting that into a format where you can operate on the data set is a massive big data problem.”
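A small sketch of the format-unification step Fried describes: two sensors with different units and sample rates mapped into one flat record shape, so a single analysis can operate on both. The channel names, units and values are made up:

```python
# Two heterogeneous sensor streams (a chamber temperature sampled every
# second, a slurry pH sampled every ten seconds) normalized into one flat
# record shape with explicit units, so one query can span both tools.
def normalize(tool_id, channel, unit, timestamps, values):
    """Flatten one sensor stream into per-sample records."""
    return [{"tool": tool_id, "channel": channel, "unit": unit,
             "t": t, "value": v} for t, v in zip(timestamps, values)]

records = (
    normalize("dep-03", "chamber_temp", "degC",
              [0, 1, 2], [400.1, 400.3, 400.2])
    + normalize("cmp-01", "slurry_ph", "pH",
                [0, 10], [10.2, 10.1])
)

# One query now spans both tools despite their different native formats.
ph_readings = [r["value"] for r in records if r["channel"] == "slurry_ph"]
```

Carrying the unit in every record looks redundant at this scale, but it is what keeps a fab-wide data lake from silently mixing incompatible measurements.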

And that problem needs to be solved in the context of other large amounts of data. So while everyone wants better data—and better insights into that better data—getting there isn’t trivial.

—Susan Rambo contributed to this report.


CH says:

What do you think is the challenge here to supply a tool for predictive maintenance in semi? Is it the data analytics capability, the industry knowledge (as you put it, the data needs to be analyzed in context), or something else? There are so many big data analytics and sensing companies these days. But I would think semi cap companies should also be cultivating these types of capabilities too.

Jnanadarshan Nayak says:

The real challenge is to integrate sensor data across equipment as a wafer passes through, and plot the journey. Then these use cases can be tested.
