Big Payback For Combining Different Types Of Fab Data

But technical, physical and business barriers remain for fully leveraging this data.


Collecting and combining diverse data types from different manufacturing processes can play a significant role in improving semiconductor yield, quality, and reliability. But making that happen requires integrating deep domain expertise from different process steps and sifting through huge volumes of data scattered across a global supply chain.

The semiconductor manufacturing IC data pipeline continues to inundate data management systems, and not all data is being used effectively. The problem is that without forethought and sufficient planning, it’s nearly impossible to integrate terabytes of diverse data types into a cohesive and useful data management system. One size does not fit all, and to be truly useful these systems require domain knowledge as well as good underlying technology.

This explains why engineering teams are moving away from traditional relational databases, replacing them instead with databases that capture the context for collected data. The data analytic platforms reside on top of the data storage and database solutions.

But getting all of this to work properly is harder than it looks. These systems need to be fast enough, and provide sufficient capacity, to span the physical distances between where the data is generated and where it is accessed. They also need to slice through the business boundaries that exist in today’s semiconductor supply chain, where data access is often restricted and siloed.

The goal is to allow engineers to combine diverse data sources, including design collaterals, equipment sensors, inspection images, electrical test data, and equipment genealogy. With a more integrated data system, engineers can more effectively combine different data types to solve their everyday problems. Ten years ago, tying tool history to test data, or customer returns to any upstream data, was an arduous process for engineering teams. Storing meta-data alongside measurements provides context, which helps engineers explore relationships and drill down into data as needed. Also, with a well-integrated data system and embedded domain expertise, data analytics platforms now can monitor and troubleshoot common manufacturing issues and alert engineering teams with recommended actions.

“Having all data in place allows faster responses to yield excursions,” said Dieter Rathei, CEO of DR Yield. “The time needed to collect, align, and correlate data from different data sources can be significant, and having everything readily available can be a big cost saver when time is critical.”

Connecting the data from the beginning to the end has tremendous value in building a holistic picture, which is critical with increasingly complex designs.

“Without tying data together, a full story of what happens could not be deduced,” said Melvin Lee Wei Heng, applications engineering manager at Onto Innovation. “Just like at a crime scene, you have detectives investigating the overall crime, and the forensic agents looking at the trace evidence left at the crime scene and providing that piece of information to the detectives to help with the investigation. Similar to semiconductor data sources, every data set holds some ‘evidence’ if something has gone wrong. The goal is to piece all data together to figure out the root cause.”

Data landscape, end-to-end
When a chip design enters the manufacturing pipeline, several layers of data are collected at each processing step. This data falls into two broad categories, measurement data and meta-data. Measurement data can be specific to the wafer or unit being processed at that specific stop in the pipeline. It also can include data from process and test equipment, which generate sensor-based measurements such as pressure, temperature and humidity. Meta-data, meanwhile, provides context for measurements. That can include timestamps, maintenance records, operator names, sources of consumables, tool identifiers, and loadboard/probe card identifiers.
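The split between measurement data and meta-data can be sketched as a simple record structure. This is only an illustration of storing a value together with its context. The class and field names (`Measurement`, `Context`, `tool_id`, `probe_card_id`, and so on) are hypothetical and are not taken from any particular MES or test database schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Context:
    """Meta-data that gives a measurement its context (illustrative fields)."""
    timestamp: datetime
    tool_id: str
    operator: str
    probe_card_id: Optional[str] = None  # only meaningful at test steps


@dataclass(frozen=True)
class Measurement:
    """A single measured value plus the context in which it was taken."""
    wafer_id: str
    step: str
    name: str
    value: float
    unit: str
    context: Context


def by_tool(measurements):
    """Group measurements by the tool recorded in their meta-data."""
    groups = {}
    for m in measurements:
        groups.setdefault(m.context.tool_id, []).append(m)
    return groups


# Made-up sensor readings from a hypothetical etch step.
readings = [
    Measurement("W01", "etch", "chamber_pressure", 12.5, "mTorr",
                Context(datetime(2021, 3, 1, 8, 0), "ETCH-A", "op1")),
    Measurement("W02", "etch", "chamber_pressure", 12.9, "mTorr",
                Context(datetime(2021, 3, 1, 9, 0), "ETCH-B", "op2")),
    Measurement("W03", "etch", "chamber_pressure", 12.4, "mTorr",
                Context(datetime(2021, 3, 1, 10, 0), "ETCH-A", "op1")),
]

grouped = by_tool(readings)
```

Because each value carries its own tool identifier and timestamp, grouping or filtering by context does not require a lookup in a second system.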

Fig. 1: Manufacturing data stack, and semiconductor manufacturing and test flow. Source: Anne Meixner/Semiconductor Engineering

Alongside manufacturing data are a product’s design collaterals. These can include reticles and their associated netlists, Library Exchange Format (LEF) and Design Exchange Format (DEF) files, as well as the package design with bump/pad connectivity between the package and the semiconductor device. Each of these provides valuable information that can be used by yield, quality and failure analysis engineers to drill down into why a die escaped a test step or why there’s been a drop in yield for that week’s production.

Both yield and quality objectives motivate collecting, storing, and connecting data.

“Yield management reflects a fab’s successfulness. Is excursion avoidance part of quality management or yield management?” said Lee Wei Heng. “With proper yield management infrastructure to analyze FDC (fault detection and classification), defect data, and product test results, proper controls can be implemented to manage excursions. There is a saying, ‘You can control only what you can monitor.’”

Where data resides, how long it is held, and what engineers do with it are essential ingredients in a synergistic relationship involving storage and actions. Some data stays local to a process step, some drives a decision at that process step, and some drives an action further down the pipeline. Typically, some data is tucked away in a local database, some is stored in the manufacturing execution system (MES), and some is stored in a test results or inspection image database.

Data local to fab equipment is used for yield improvement in the form of statistical process control (SPC) monitors. Even if all SPC monitors are working properly, there still can be yield excursions. To understand the root cause, the SPC monitors for each tool can be correlated across multiple process steps. Test data also can be connected back to specific wafer fab equipment to provide additional insight. While this type of analysis always has been possible, creating that feedback loop has been an engineering challenge due to siloed data and the alignment and completeness issues that come with siloed data.
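Connecting test data back to wafer fab equipment can be sketched as a join between process history and test results. The example below is a minimal illustration under stated assumptions: the wafer IDs, chamber names, and die counts are made up, and `yield_by_chamber` is a hypothetical helper, not part of any vendor platform. It also reports tested wafers that have no chamber record, since alignment and completeness gaps are part of the problem described above.

```python
from collections import defaultdict

# Hypothetical process history: which chamber processed each wafer.
chamber_history = {
    "W01": "CH1", "W02": "CH2", "W03": "CH1", "W04": "CH2", "W05": "CH1",
}

# Hypothetical wafer test results: (good die, total die) per wafer.
wafer_test = {
    "W01": (940, 1000),
    "W02": (810, 1000),
    "W03": (950, 1000),
    "W04": (790, 1000),
    "W06": (930, 1000),  # tested, but no chamber record exists
}


def yield_by_chamber(history, test_results):
    """Join test results to chamber history and return yield per chamber.

    Tested wafers with no chamber record are returned separately so the
    gap in the data is visible instead of being silently dropped.
    """
    good = defaultdict(int)
    total = defaultdict(int)
    unmatched = []
    for wafer_id, (good_die, total_die) in test_results.items():
        chamber = history.get(wafer_id)
        if chamber is None:
            unmatched.append(wafer_id)
            continue
        good[chamber] += good_die
        total[chamber] += total_die
    yields = {chamber: good[chamber] / total[chamber] for chamber in total}
    return yields, unmatched


yields, unmatched = yield_by_chamber(chamber_history, wafer_test)
```

With this made-up data, one chamber shows a clearly lower yield than the other, which is the kind of signal that would prompt a closer look at that chamber's SPC monitors.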

More recently, engineers have started to explore feed-forwarding of data collected during wafer production to influence test processes in order to optimize test manufacturing costs and to improve quality. For example, defect inspection data can predict which die will fail in wafer or final test, and it can be used to determine which die should go through a burn-in step.

Such feed-forward operations are only possible with 100% inspection at critical wafer processing steps. One can foresee the eventual 100% inspection of critical assembly process steps, as well, permitting engineers to predict failures and to guide subsequent test steps.
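The feed-forward idea can be sketched as a rule that routes die to burn-in based on inline inspection results. A production flow would more likely use a trained model and richer defect attributes. The fixed threshold, the die identifiers, and the `select_for_burn_in` helper here are assumptions chosen only to show the data flow.

```python
def select_for_burn_in(defect_counts, threshold):
    """Return die IDs whose inline defect count meets or exceeds the threshold.

    defect_counts maps a die ID to the number of defects found at inline
    inspection. The threshold is a stand-in for whatever risk model a real
    flow would apply.
    """
    return sorted(
        die_id for die_id, count in defect_counts.items() if count >= threshold
    )


# Hypothetical per-die defect counts from 100% inspection of one wafer.
inline_defects = {
    "W01-D001": 0,
    "W01-D002": 3,
    "W01-D003": 1,
    "W01-D004": 5,
}

burn_in_list = select_for_burn_in(inline_defects, threshold=3)
```

The rule only works if every die has an inspection record, which is why 100% inspection is a prerequisite for this kind of routing.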

Domain expertise needed
In the manufacturing of any semiconductor device, all engineers benefit from connecting data from different parts of the manufacturing process and associated product design data. The phrase “end-to-end analytics” conjures a vision of one database system holding all the data, and one model that magically puts all the data together. For most manufacturing engineering needs, this is simply not required.

In day-to-day practice, engineers combine data across the creation boundaries, search for commonalities, patterns and/or correlations, and then dive down further into more localized data as needed. Thus, engineers require these three fundamental capabilities from either internal or third-party data analytic platforms and the associated tools and visualizations.

Job function drives which data to combine, and an engineer’s job function can be more yield focused or quality focused.

“If you work in a wafer manufacturing facility, yield is paramount, and then that is your highest priority,” said Ken Butler, strategic business creation manager at Advantest America. “If you are a product engineer responsible for product delivery and quality for one or more product lines, then you are more likely focused on quality-related applications.”

The table below helps to illustrate where crossed data interactions provide the greatest value to customers in a manufacturing environment.

Fig. 2: Semiconductor manufacturing data combinations ranked by ROI benefit. Source: Onto Innovation

“Identification of the highest value interaction point is going to be based on your job and background,” said Mike McIntyre, director of software product management at Onto Innovation. “For example, the single highest interaction points in terms of value to manufacturing (process engineers, fab management, etc.) occur when users cross material process history (aka WIP or MES data) and material-based metrology, whether that is inline or electrical test. Product designers and device integration engineers would likely state that the combination of electrical performance at device test and the inline metrology results would provide the greatest value. Still other user communities would have their own preference of data combinations that work for them.”

A key engineering challenge in combining diverse data types, such as data from inspection and design masks, is developing enough engineering expertise to know what matters. Chip manufacturing includes some of the most complex processes in the world, requiring expertise at each of the sub-stops in the manufacturing process pipeline. To connect data from each of these stops, you need some expertise to know what to keep and what to connect.

“One challenge that many companies underestimate is the deep domain knowledge required to make this all work,” said Greg Prewitt, director of Exensio solutions at PDF Solutions. “It is not just a data science project. Without deep semiconductor knowledge, you are only scratching the surface in terms of really leveraging the value of all of this combined data.”

For instance, consider the amount of inspection data to be saved and the way to present it to engineers. “Raw defect data from a fab is voluminous and noisy. One of the key challenges to combining data sources and being successful with machine learning is cleaning and aligning the data,” said Jay Rathert, senior director of strategic collaborations at KLA. “We are focused on extracting every bit of useful information from each defect our tools detect, but presenting that data outwardly in a way that is useful, actionable, and which preserves IP for everyone concerned. It requires deep domain expertise to understand which data matters.”

Others agree on the essential requirement of domain expertise when considering which data to combine to gain useful insights.

“Simply stated, the key challenge when trying to combine disparate data is to have the necessary content knowledge and foundational understanding of how these two data types interact with each other,” said Onto’s McIntyre. “Lacking this will create chaos. Combining data types is very much like placing pieces together in a complex puzzle. Certain pieces can’t go directly together and need intermediate data to join them, while others neatly fit together next to each other. Without seeing the big picture, all you have are the pieces.”

Latency, longitude, and latitude challenges
Storing data in context (data plus its meta-data) provides the connective glue for diverse data. Yet prior to connecting data, time span and physical distance challenges must be managed.

First, consider responding to a yield learning from wafer test. Depending on the semiconductor technology, the raw time from wafer start to wafer test can be anywhere from 60 to 120 days. Has the data been kept long enough for engineers to determine a root cause? Not always.

“I can give a counter-example of the costs associated with not having everything integrated,” said DR Yield’s Rathei. “I remember a case where the wafer test data clearly showed a chamber dependency from one particular tool. However, the records of which wafer was processed in which chamber was only stored locally at the tool. To make matters worse, they had been deleted by the time the wafers had arrived at test. So in order to investigate the issue, we had to wait until new wafers — of which the chamber logs were now being recorded — made their way from that process tool to test. A lot of time was lost because data was not comprehensively stored.”

Lost time when diverse data is not appropriately connected is a common theme.

“We see how miscorrelations across test stations and platforms cause product engineers wasted time and resources,” said Nir Server, senior director of product marketing at proteanTecs. “These can stem from issues in design, manufacturing, or the test program itself. Correlating the data across tests enhances the visibility for better debugging, defect detection and tuning, and allows for much finer root causing. Another important factor is the need to outsource and multisource manufacturing capacity, especially in light of recent shortages in the semiconductor industry, and activities across different sites and geographies.”

Physical distance between major stops along the pipeline creates challenges. Wafer, assembly, and test factories can have significantly different longitude and latitude coordinates. Regardless of whether it involves an IDM, foundry, OSAT, or a fabless chipmaker, bridging data across different sites remains a reality that engineering teams need to manage. But the ability to connect data between different factories and associated test equipment has become easier with data management systems and data analytic platforms created internally at an IDM or provided externally by a third party.

Quality issues
Compared with yield, the timeline becomes much longer with quality and reliability. That creates additional challenges for engineering teams to connect diverse data types.

“Quality is a more difficult calculation to make because it is typically based on defective parts per million (DPPM), and even after extensive testing you may not know what your final DPPM rate is until months or years later. So the cost analysis is a much longer ‘loop’ than for yield,” said PDF’s Prewitt.

The first stop on the long loop is final test, which takes months. The last stop is an end customer system, which is measured in years.
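As a reminder of the arithmetic behind the metric, DPPM is the number of defective units per million units shipped. The short sketch below uses made-up counts, and the `dppm` function name is only illustrative.

```python
def dppm(defective_units, shipped_units):
    """Defective parts per million: defective / shipped, scaled by 1,000,000."""
    if shipped_units <= 0:
        raise ValueError("shipped_units must be positive")
    return defective_units * 1_000_000 / shipped_units


# Hypothetical example: 12 confirmed returns out of 4 million shipped units.
rate = dppm(defective_units=12, shipped_units=4_000_000)  # 3.0 DPPM
```

Because customer returns can keep arriving for months or years, the numerator keeps changing long after the units have shipped, which is why the final figure is known so late.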

In the case of test escapes at final test or system test and customer returns, combining different data sources supports engineering teams in root-causing issues, and it guides the subsequent corrective actions as needed. Actions can range from identifying problematic process equipment and associated die to taping out a new mask set to address a design vulnerability.

“This is critical, especially if there is an issue seen at final test (FT) or system-level test (SLT) after chips are placed into expensive packages,” said Guy Cortez, product marketing manager, digital design group at Synopsys. “It is critical to be able to quickly do root cause and see if there is a parameter earlier in the manufacturing process that can predict the issue at FT or SLT. However, tracing back root cause of a design-related issue also requires gaining access to LEF/DEF information to pinpoint where the failure mechanisms within the silicon are located so failure analysis can be done to confirm what really happened.”

Others point to similar concerns. “Data collection, storage, and management are increasingly important to our customers in yield and operations — especially when running volume scan diagnosis to help improve yield,” said Matt Knowles, director of operations product management at Siemens EDA. “Removing any and all barriers to improving systematic yield issues is a challenge that all semiconductor providers face on a daily basis. The software tools we provide (Tessent Diagnosis and YieldInsight) convert manufacturing test data into actionable yield improvements. The tools don’t work without a large supply of volume test data for analysis. The evolution of scan diagnosis from failure analysis guide to root cause monitoring requires a stable, dedicated data pipe from manufacturing test to the fabless customer. After scan diagnosis data results are produced, this time-sensitive data needs to be consumed by analytics systems for quick correlation and decision making.”

Yield, quality, reliability, product, design, and failure analysis engineers all benefit from connecting vastly different data types from the manufacturing process and product design collaterals.

While combining different data types is not new, advanced data management, storage, integration and analytic platforms significantly facilitate the connections between all of this data. Combining it into one overall model and system remains a holy grail. But in everyday practice, armed with their domain knowledge, engineers can judiciously combine the data they need to solve the problems at hand. These problems influence an effective design of data systems and platforms, because engineers know which data to connect and which data to keep for months and sometimes years.



