How neural networks perform defect classification and many other tasks.
We depend, or hope to depend, on machines, especially computers, to do many things, from organizing our photos to parking our cars. Machines are becoming less and less “mechanical” and more and more “intelligent.” Machine learning has become a familiar phrase to many people in advanced manufacturing. The next natural question people may ask is: How do machines learn?
Recognizing diverse objects is a clear indicator of intelligence. Specific to semiconductors, recognizing various types of defects and categorizing them is an important task that initially was carried out solely by humans. Gradually, this classification process was automated by using computer programs running ever-evolving algorithms. Today, most defects are detected and classified by such systems in advanced facilities.
Before machine learning was widely used, there was a period when system set-up was done purely by humans. After learning about situations for a task through observation and experiments, engineers made rules and implemented them as programs for computers to run. In this implementation scenario, the machine does not learn, it just keeps repeating the process programmed, making decisions based on the embedded rules. This is a very labor-intensive approach—to extract the rules from human classifiers, create the programmatic logic to implement these rules, and to verify the result. Sometimes it’s very difficult, or impossible, to translate a decision-making process that humans do, often subconsciously, into computer language.
After machines started to really “learn,” two types of classification algorithms emerged: unsupervised and supervised. Unsupervised methods group samples purely by similarity in the data, while supervised methods learn from examples that humans have already labeled.
Class definition is typically a very subjective process. It’s hard to expect a natural pattern in the data where each cluster, extracted by an unsupervised method, represents a class a human has in mind. As a result, between these two types of methods, supervised classification is the one that predominates today. Among the plethora of supervised methods, algorithms based on neural networks have indisputably become the superstar, brightening a wide spectrum of application areas.
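To make the supervised idea concrete, here is a minimal sketch using a nearest-centroid classifier, one of the simplest supervised methods. The defect class names and two-dimensional feature vectors below are illustrative assumptions, not data from any real inspection system: the classifier learns one centroid per human-defined class from labeled samples, then assigns a new sample to the class whose centroid is nearest.

```python
import math
from collections import defaultdict

def fit_centroids(samples, labels):
    """Compute one centroid per human-defined class from labeled data."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), label in zip(samples, labels):
        sums[label][0] += x
        sums[label][1] += y
        counts[label] += 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}

def classify(centroids, point):
    """Assign the point to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: math.dist(centroids[c], point))

# Hypothetical feature vectors (e.g. defect size, brightness), each
# paired with a label a human classifier has already assigned.
samples = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (4.8, 5.2)]
labels = ["particle", "particle", "scratch", "scratch"]

centroids = fit_centroids(samples, labels)
prediction = classify(centroids, (1.1, 1.0))  # lands near the "particle" group
```

The key contrast with an unsupervised method is the `labels` list: the classes exist because a human defined them, not because the data naturally clustered that way.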
Most neural networks for classification have a multi-layer feed-forward structure that is trained by back-propagating the error observed at the output. In other words, training is an iterative, trial-and-error process, with adjustments made between iterations until the desired results are achieved. A signal is fed to the input and pushed through the network to produce an output. Because this is a supervised process, the expected output, or “label,” of the input is available and compared with the actual output. At the beginning of training, the output and the label are usually very different. This difference, or error, is “propagated” backward from the output layer of the network to the input layer. Along the way, adjustments are made to the parameters of the network, mostly in the form of “weights.” Ideally, after these adjustments, the same signal goes through the network again and the difference at the output is smaller. After several iterations in which the same set of labeled samples is used to train the network, the errors at the output are hopefully minimized or eliminated. When this happens, the network will make few or no errors in assigning an incoming sample to its correct class, at least for samples found in the training pool. Assuming the training samples are representative of the general image population, a properly trained machine learning model will produce the right label, or classification, on samples from outside the training pool as well.
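The compare-adjust-repeat loop described above can be sketched in its simplest possible form. The example below deliberately uses a single sigmoid neuron rather than a full multi-layer network, trained by gradient descent on a toy labeled dataset; the data and learning rate are illustrative assumptions, but the loop structure (forward pass, compare with label, adjust weights against the error, repeat) is the same one a real network follows.

```python
import math

def sigmoid(z):
    """Squash a raw signal into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.5, epochs=5000):
    """Train a single sigmoid neuron by iterated error correction.

    Each pass: push the input through, compare the output with the
    label, and nudge the weights so the error shrinks next time.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(samples, labels):
            out = sigmoid(w[0] * x1 + w[1] * x2 + b)  # forward pass
            err = out - y                 # difference from the label
            w[0] -= lr * err * x1         # adjust weights against the error
            w[1] -= lr * err * x2
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Classify a sample: output above 0.5 means class 1."""
    return 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5 else 0

# Toy labeled data: only the (1, 1) sample belongs to class 1.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
w, b = train(samples, labels)
predictions = [predict(w, b, x) for x in samples]  # matches labels after training
```

A real multi-layer network adds hidden layers and propagates the error back through each of them in turn, but the principle per iteration is unchanged.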
As a summary, machines learn by being trained with high-quality labeled data that embody a good representation of the target classes. Subsequently, what determines the accessibility of a machine learning system is how to make it easy to organize data (defect images, for example), to pre-process them, to label them, to present them in a good format to any network from a diverse gamut, and to have the network trained efficiently and used effectively. Like a string that connects all the jewels into a necklace, automatic defect classification (ADC) has been designed to facilitate all of these steps so that a user can go through the whole length of the defect classification process with ease.
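The chain of steps named above (organize, pre-process, label, train) can be sketched as a pipeline of stages. Every function name below is a hypothetical placeholder standing in for a real ADC workflow stage, not an actual ADC API; the training stage is a trivial stand-in so the flow of data from raw images to a trained artifact is visible end to end.

```python
def organize(images):
    """Placeholder for gathering raw defect images (identity here)."""
    return list(images)

def preprocess(images):
    """Placeholder pre-processing: normalize pixel values to [0, 1]."""
    return [[px / 255.0 for px in img] for img in images]

def label(images, annotations):
    """Pair each image with its human-assigned class label."""
    return list(zip(images, annotations))

def train(dataset):
    """Stand-in for network training: just tallies samples per class."""
    counts = {}
    for _, cls in dataset:
        counts[cls] = counts.get(cls, 0) + 1
    return counts

# Two tiny fake "images" flowing through the whole pipeline.
raw = [[0, 128, 255], [255, 255, 0]]
model = train(label(preprocess(organize(raw)), ["particle", "scratch"]))
```

The value of an ADC system lies in making each of these hand-offs seamless, so the user can walk the full length of the necklace without restringing the jewels.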