Too Much Fab And Test Data, Low Utilization

For now, the growth of data collected has outstripped engineers’ ability to analyze it all.


Can there be such a thing as too much data in the semiconductor and electronics manufacturing process? The answer is, it depends.

An estimated 80% or more of the data collected across the semiconductor supply chain is never looked at, from design to manufacturing and out into the field. While this may be surprising, there are some good reasons:

  • Engineers only look at data necessary to solve a specific problem, rather than all the data being collected. This is especially relevant with “excursions,” because engineers want to quickly pinpoint the cause of a problem and fix it.
  • Once semiconductor processes are stable, or products are mature, there is no reason to review all of the data.
  • Contractual obligations, driven by military and automotive companies, require archiving of data for 10 to 15 years, most of which is never touched again.

There also are some not-so-good reasons why the bulk of data is ignored. Some of it is dirty, some lacks traceability and context, and there may be questions about a data source’s provenance. On top of that, there is simply too much data to digest. The chip world is drowning in it, and the volume continues to balloon.

“Some 90% of the available data is less than 2 years old, meaning there is suddenly a lot of data, and it showed up suddenly — one might say overnight,” said Keith Schaub, vice president of technology and strategy for Advantest America. “Data trafficking, storage, and access are all having to change to deal with the continually expanding massive data tsunami.”

Semiconductor manufacturing IT professionals are witnessing a sudden explosion of data across all product sectors. The industry is grappling with the management of that data, while at the same time pushing to collect more because it could be useful. Consider that inspection and metrology steps that used to be sample-based only have shifted to 100% sampling.

“If you’re only sampling, you can only really operate off the first or second bar of your Pareto chart, and you’re not going to understand your failure rates,” said Anna-Katrina Shedletsky, CEO of Instrumental. “I’m a big believer that if you aren’t measuring it, you can’t move it. So if you’re not measuring it, that means you don’t care about that particular item.”
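
Shedletsky’s point about Pareto charts can be illustrated with a small simulation. The sketch below is purely hypothetical — the failure modes and their rates are invented — but it shows how a sub-sampled population typically reveals only the top one or two Pareto bars, while rarer failure modes disappear from view.

```python
# Hypothetical illustration: sub-sampling hides the tail of the failure Pareto.
# The failure modes and rates below are invented, not real manufacturing data.
import random
from collections import Counter

random.seed(42)

FAILURE_RATES = {             # per-unit probability of each failure mode
    "solder_bridge": 0.020,
    "missing_component": 0.008,
    "cracked_die": 0.002,
    "delamination": 0.0005,
}

def build_unit():
    """Return the first failure mode a unit hits, or None if it passes."""
    for mode, rate in FAILURE_RATES.items():
        if random.random() < rate:
            return mode
    return None

population = [build_unit() for _ in range(100_000)]   # full production run
sample = random.sample(population, 1_000)             # a 1% inspection sample

def pareto(units):
    return Counter(u for u in units if u is not None).most_common()

print("Pareto from 100% measurement:", pareto(population))
print("Pareto from a 1% sample:     ", pareto(sample))
# The sampled Pareto usually shows only the top one or two bars; the rare but
# real modes (cracked_die, delamination) often vanish from the sample entirely.
```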

Having lots of data — both breadth and depth — does enable engineers to perform more sophisticated statistical analytics, which in turn can be used to unravel unusual test signals and to balance quality, yield, and cost. In his 2019 keynote at the International Test Conference, Mike Campbell, senior vice president of engineering at Qualcomm, noted that applying advanced data analytic methods reduced the company’s virtual metrology error, resulting in more than a 10X improvement.

But collecting, storing and connecting that data becomes non-trivial with petabytes of data to analyze.

“Today, having the automation/intelligence to sift through these terabytes of data for signals is both a blessing and a curse,” said Mike McIntyre, director of software product management at Onto Innovation. “Consider that 10 to 15 years ago engineers had to deal with a dozen or so possible signals when problem-solving, most of which they had a part in generating. Today, there are multiple thousands of signals that must be evaluated for significance and contribution, which had not required human involvement in generating and which ultimately should be evaluated by the engineer.”

This is made more difficult by the fact that the semiconductor industry’s data ecosystem is fraught with fragmented data due to business silos, woeful data governance, and just not knowing what’s valuable.

“In some cases, we don’t really understand what data is valuable. So we collect everything, and then we barely use it,” said Stacy Ajouri, senior member of the technical staff at Texas Instruments and co-chair of the RITdb Taskforce. “That’s a real challenge, because there’s just so much of it. Part of the problem with being able to understand what we should keep is the fact that we just don’t have traceability of the data.”

Traceability is critical when it comes to understanding the value of data. “Having traceability available is important so you know exactly where your failing parts are coming from,” said Andre van de Geijn, business development manager at yieldHUB. “You want to be able to map them into the wafer. Then you can correlate between equipment on a particular test value, which can provide insights on equipment and final product behavior.”

Still, there is a lot of data to manage, and there is a cost to moving, cleaning, structuring, and storing it. With dirty or unconnected data, there is only so much engineers can do.

Depth and breadth
So where did this explosion of data come from?

The answer is complicated. It starts with the fact that there are more chips everywhere, and each has its own complicated history. ICs are the engines in many of the products we use in our everyday lives. They drive the Internet of Things and edge networks that power our factories and cities, and they control the energy usage in our buildings and in our smart phones.

But that alone does not explain the explosion in manufacturing and test data. Large digital dies naturally generate more test data, but test compression techniques evolved more than 20 years ago to deal with that. So die size accounts for only a fraction of the increase. There are other sources, as well.

Starting at the turn of the millennium, chipmakers began asking manufacturing equipment companies to provide more data. Two decades later, there has been a significant increase in the number of sensors included in that equipment, and sensors are being added just about everywhere. Those sensors generate huge volumes of data, adding to the flood and to the subsequent storage and retrieval problems.

Consider fab equipment sensor trace data, for example. “It is sampled at up to 50 kHz for a litho tool,” said Jon Holt, senior director of fab applications at PDF Solutions. “Assuming a typical 10Hz sample rate, the data adds up quickly when you consider 24 x 7 operation, with an average 500 tools for 100K wafers a quarter. It goes into the petabyte range, and it becomes very expensive to store and manage.”
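
Holt’s figures translate into a quick back-of-envelope calculation, sketched below. Only the 10Hz sample rate, the 500-tool fleet, and 24 x 7 operation come from his quote; the per-tool sensor-channel count and bytes-per-sample are assumptions added for illustration.

```python
# Back-of-envelope estimate of fab sensor trace data volume.
# Only the 10 Hz rate, 500-tool fleet, and 24x7 operation come from the quote;
# the channel count and bytes-per-sample are assumed values for illustration.
SAMPLE_RATE_HZ    = 10        # typical trace sample rate (from the quote)
CHANNELS_PER_TOOL = 500       # assumed sensor channels per tool
BYTES_PER_SAMPLE  = 20        # assumed value + timestamp + context ID
TOOLS             = 500       # average fleet size (from the quote)
SECONDS_PER_DAY   = 24 * 3600

per_tool_per_day  = SAMPLE_RATE_HZ * SECONDS_PER_DAY * CHANNELS_PER_TOOL * BYTES_PER_SAMPLE
fleet_per_day     = per_tool_per_day * TOOLS
fleet_per_quarter = fleet_per_day * 90

print(f"Per tool per day:  {per_tool_per_day / 1e9:6.1f} GB")
print(f"Fleet per day:     {fleet_per_day / 1e12:6.1f} TB")
print(f"Fleet per quarter: {fleet_per_quarter / 1e15:6.2f} PB")
# Roughly 8.6 GB per tool per day, 4.3 TB per day fleet-wide, and about 0.4 PB
# per quarter under these assumptions -- and well into the petabyte range over
# a year, or with richer payloads and higher-rate tools (e.g., 50 kHz traces).
```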

Fig. 1: Where data is generated in a 300mm fab over the course of a minute. Source: PDF Solutions

Wafer inspection contributes a significant amount of data to the pool. Striving for zero escapes, fabs have begun adding 100% wafer scan inspection for several high-risk, reliability-impacting layers, with the goal of identifying outlier wafers and dies that could affect customer systems.

“If you’re sub-sampling the wafers, there’s no way to trace a field reliability issue back to a manufacturing step, because chances are that die never went through any sort of inspection tool,” said Doug Sutherland, principal scientist in technical marketing at KLA. “It’s only if you have these screening steps in place, where you inspect 100% of the wafers and 100% of the die, that you’re able to prevent escapes. You need to be able to trace failures back to an actual problem that existed in the fab.”

The move to 100% inspection, enabled by faster scanning technology, also results in more images being stored. These images can be analyzed for issues and cross-correlated with other types of data, such as electrical test results.

On the test side, Qualcomm’s Campbell said that roughly 2TB of data were generated per day in 2019. That number has increased since then. The volume of test data jumped in the move from 10nm to 7nm, and it will increase again at each new node. This is partly due to more chips per wafer, but it also is due to more transistors per chip, increased design complexity, more process corners, as well as new processes, tests, diagnostics, and more complex assembly.

Fig. 2: Cumulative data volume over time. Source: Qualcomm/ITC 2019 presentation

Even the floating-point portion of test data — the parametric measurements — is generating more data.

“The volume and complexity of data has seen exponential growth,” said Wes Smith, CEO of Galaxy Semiconductor. “We recently created a simple chart showing the data volume for our highest-data-volume customers, and we see a fairly consistent exponential growth line in which data volume grows by an order of magnitude approximately every 18 to 24 months.”
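
Smith’s growth rate is easy to project forward. The sketch below is a hypothetical extrapolation that assumes a 1 TB/day starting point; only the “order of magnitude every 18 to 24 months” figure comes from his quote.

```python
# Hypothetical projection of test data volume, assuming a 1 TB/day baseline
# and the reported growth of 10x every 18 to 24 months.
START_TB_PER_DAY = 1.0            # assumed baseline, not a reported figure
TENFOLD_MONTHS = (18, 24)         # months per 10x growth (from the quote)

for years in (1, 2, 3, 5):
    months = years * 12
    fast = START_TB_PER_DAY * 10 ** (months / TENFOLD_MONTHS[0])
    slow = START_TB_PER_DAY * 10 ** (months / TENFOLD_MONTHS[1])
    print(f"After {years} yr: {slow:9.1f} to {fast:9.1f} TB/day")
# Even at the slower rate this is ~10x in two years and ~300x in five, which is
# why storage and retrieval strategies have to change rather than simply scale.
```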

Others report similar jumps in data. “When I started in 2012, per-wafer test result files were hundreds of megabytes in size,” said David Partyka, manager of software applications at GlobalFoundries. “Now, for certain customers, the file size is 15 to 16 gigabytes per wafer, which makes it nearly a quarter of a terabyte of data per wafer lot (25 wafers). Data volumes are massive, especially for brand new products.”

Part of this rapid growth in test data volume stems from the need to combine engineering data with production data in order to improve yield. GlobalFoundries, for example, moved from collecting deep, high-volume characterization data only on the engineering floor to integrating that data collection into the production flow for initial wafers and bring-up. This has happened in stages over time.

“I don’t know the date,” said John Carulli, DMTS director for Fab 8-Test at GlobalFoundries. “All I can tell you is why — because it was that important to do so.”

It’s important to have data for yield learning, and the amount of data required for that jumped dramatically between 22nm pre-finFET designs and 14nm finFET designs. According to one industry insider, this necessitated moving engineering characterization test programs into production test environments. Stopping on first fail, as production test typically does, limits an engineering team’s ability to learn for yield improvement. In contrast, running an engineering characterization program on the production floor permits collecting data on bad die, revealing which parts of the die pass and which fail.
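
The difference between a production stop-on-first-fail flow and an engineering characterization flow can be sketched in a few lines. The example below is a simplified illustration using invented test names and die results, not GlobalFoundries’ actual test program.

```python
# Simplified sketch: stop-on-first-fail (production) vs. continue-on-fail
# (engineering characterization). Test names and results are invented.
def run_tests(die_results, stop_on_first_fail=True):
    """Run an ordered test list against one die and return what was learned."""
    log = {}
    for test_name, passed in die_results:
        log[test_name] = passed
        if not passed and stop_on_first_fail:
            break                        # production flow: bin out immediately
    return log

# One hypothetical bad die: it fails an early memory test, but its logic and
# analog blocks would also have told us something about the process.
die = [("sram_bist", False), ("logic_scan", True), ("analog_trim", False)]

print("Production flow:      ", run_tests(die, stop_on_first_fail=True))
print("Characterization flow:", run_tests(die, stop_on_first_fail=False))
# Production sees only the first failure; characterization keeps testing the
# bad die, showing which blocks pass and which fail -- the yield-learning data
# described above as moving onto the production floor.
```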

Challenges in using all of this data
But with petabytes of data accumulating, engineers cannot possibly sift through all of it, even with more automated analysis.

“Companies are still not analyzing all the data they are collecting,” said Greg Prewitt, director of Exensio solutions at PDF Solutions. “Up to 85% of data remains ‘dark data,’ which is data that is never even used in analytics, but is saved anyway.”

The hope is that tools will be able to sift through more of that data in the future to find useful nuggets or trends. “As part of the journey, we started pushing a lot of analytics data, and the data needed for analytics, into the cloud,” said Sunil Narayanan, senior director of applied intelligence solutions at GlobalFoundries. “We noticed that we are pushing around 3-plus terabytes of data per day. Out of the 3 terabytes per day, it looks like we use only 20% or less of that data for analysis (still in early stages of the project). In the next one to two years, when we have more tools, it might grow.”

To make sense of large, complex manufacturing processes, the data no longer can be analyzed without context and traceability. Analytical tools can process data structures that capture the manufacturing metadata — the conditions under which the data was collected, as well as its genealogy or manufacturing equipment fingerprint.

This is easier said than done, of course. Longitudinal data analysis requires diligence to ensure the salient metadata is stored and correct. Today, that takes a concerted effort, which helps explain why only about 20% of the data is ever looked at.

“What used to happen is people would just parse the data and then load it into a database as raw data,” said Paul Simon, director of analytics for silicon lifecycle management at Synopsys. “This is no longer sufficient because operational changes (i.e., to test equipment) during manufacturing impact your understanding. When you do analytics, you want to trace and understand those changes, and have these changes linked to the collected data. For example, when you do correlations from final test to wafer sort over time, those operational changes happen. It’s a very difficult thing to have stored correctly.”
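
One way to make that concrete is to store the operational context alongside every measurement rather than loading raw values alone. The sketch below shows one possible record layout; the field names are illustrative, not an industry-standard schema such as STDF or RITdb.

```python
# Illustrative record layout that keeps traceability context with each result.
# Field names are hypothetical, not a standard schema such as STDF or RITdb.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class TestRecord:
    lot_id: str
    wafer_id: str
    die_xy: tuple           # (x, y) die coordinates, enabling wafer mapping
    test_name: str
    value: float
    # Operational context needed to interpret the value later:
    tester_id: str
    test_program_rev: str   # correlations break silently when this changes
    probe_card_id: str
    timestamp: datetime = field(default_factory=datetime.utcnow)

rec = TestRecord(
    lot_id="LOT1234", wafer_id="W07", die_xy=(12, 34),
    test_name="vdd_leakage", value=1.8e-6,
    tester_id="T93K-05", test_program_rev="r4.2", probe_card_id="PC-118",
)
print(rec)
# With the equipment and program revision stored next to the value, a later
# wafer-sort-to-final-test correlation can split the data by those operational
# changes instead of mistaking an equipment shift for a process excursion.
```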

In general, engineers want more data, but it does them no good if it’s missing or incorrect due to data interruption or corruption.

TI’s Ajouri expanded on these challenges. “Just the quality of the information is an issue — the ability to ensure that it’s accurate and complete. The test data files are getting big with these old formats. In addition, these old formats aren’t really handling everything we need anymore.”

Addressing the quality of data also requires aligning the data. Standards for fab, assembly and test equipment can help with the alignment. “Today, we have incompatible data systems using proprietary or ad-hoc data formats,” said Advantest’s Schaub. “There’s a lot of integration work underway to better align these systems, which will dramatically increase the value of the data, as well.”

Melvin Lee Wei Heng, field application and customer support manager for software at Onto Innovation, agreed. “Standard data formats for back-end tools are still a challenge at this time, and many new tool vendors do not have structured data exports even today. Many times, custom data loaders must be written. In some situations the tool even needs a software update for the export of the right content for yield analytics,” he said.
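
Writing those custom loaders usually amounts to mapping each tool’s ad-hoc export into one common schema before any analytics can run. The sketch below is a generic example; the vendor column names and the file layout are invented.

```python
# Generic sketch of a custom data loader: re-key one tool's ad-hoc CSV export
# into a common schema. The vendor column names below are invented.
import csv
import io

COLUMN_MAP = {               # vendor-specific header -> common schema field
    "LotID": "lot_id",
    "Slot": "wafer_id",
    "DefCount": "defect_count",
    "InspTime": "timestamp",
}

def load_tool_export(text):
    """Yield rows of the vendor export re-keyed to the common schema."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {common: row[vendor] for vendor, common in COLUMN_MAP.items()}

# Hypothetical export from a back-end inspection tool:
raw = """LotID,Slot,DefCount,InspTime
LOT1234,05,17,2021-01-12T08:31:00
LOT1234,06,3,2021-01-12T08:40:00
"""

for record in load_tool_export(raw):
    print(record)
```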

Connecting the dots
Despite the data management challenges, semiconductor companies throughout the supply chain see opportunities in connecting data from multiple sources to make manufacturing decisions.

Some of this requires forethought in targeting the right data, and on-chip monitoring circuitry holds promise here.

“With this explosion of data, we need to think about the use case,” said Shai Eisen, vice president of product marketing at proteanTecs. “The challenge is finding the right data points that are relevant to that use case, and finding the correlation between those data points and other unrelated data types. When you design the data to address the specific use cases, then the data is designed to generate the information you need for that specific use case.”

Such considerations need to be dealt with up front when designing a complex IC device, and domain knowledge is often essential. Engineers need to stay in the loop as analysis moves to all the data generated in manufacturing, which is particularly important for reaping the full promise of smart manufacturing/Industry 4.0. Statistics-based analytic models require large amounts of data from multiple sources to make predictions refined enough to balance the competing demands of yield and quality in production test. In addition, there is value in longitudinal historical data, both for decisions made at the edge and for decisions made locally in a factory.

Connecting the dots between data from multiple sources more easily also requires data-sharing standards and/or a platform for the analytics.

“A lot of equipment companies want to provide intelligence at the tool level,” said John Kibarian, president and CEO of PDF Solutions. “They need a platform for all of that data. The problem is that the industry doesn’t need everyone using their own platform with their own data collection techniques.”

But collecting, sorting, and structuring data in a consistent and meaningful way has been somewhat disorganized, in part because no one considered how large-scale statistical analysis or machine learning (ML) would use this data when sensors first started being added.

Conclusion
If only 10% to 20% of collected data is being looked at today, the industry needs to get a serious handle on why. There are legitimate reasons not to look at all of the collected data, including historical data going back 15 years or more.

Nevertheless, the chip industry stands at the threshold of machine learning’s ability to recognize subtle pattern shifts in petabyte-sized datasets, and just having more data isn’t necessarily a sign of progress. “There’s a lot of data, but just because there’s a lot of data doesn’t mean it’s valuable data or at the right resolution,” said Instrumental’s Shedletsky. “Frankly, we believe there is such a thing as too much data.”



7 comments

Peter Andrews says:

Thanks Anne for the very insightful article. My customers often require equipment with faster test throughput, in order to ‘keep up’ with the increasing data collection due to new or widening test conditions… It seems like companies doing ML trend analysis, processing all that extra data, need to be growing in parallel with the testing companies.

Anne Meixner says:

Peter, glad the article rang true to your experiences.
I’m curious about your observation that data analytics companies’ processing needs to grow in parallel with testing. Could you expand on that a bit more?

Manan Dedhia says:

Very good article. I can particularly attest to the growing demands, especially from automotive customers, for more data mining and quicker turnarounds on data requests. This does serve to highlight the fragmented nature of the databases that need to be looked at – fab, probe, assembly, final test, module test, etc. But within each sub-sector, it is no longer an option not to have a data analytics tool and engineers specifically trained to use it.

Anne Meixner says:

Manan,

Appreciate you sharing observations from your own experience. I like your comment about having engineers specifically trained to use data analytics tools. This aligns well with what Mike McIntyre stated.

Jnanadarshan Nayak says:

No wonder, with the explosion in data volumes, many chipmakers are looking toward the cloud to store and analyze the data. It’s high time semiconductor manufacturers take the cloud seriously, like other industries.

Richard Collins says:

Anne, I am preparing for a seminar today (22 Feb 2024) titled, “Standardizing the Semiconductor Manufacturing Backend”. When I was reading the background on the presenters, this article of yours, “Too much fab and test data, low utilization”, was a good resource. I try to deal with all data on the Internet for all issues and new opportunities. I had a boss one time who used to say, “We’ve got to get our ducks in a row first”. But the usual course, in ALL industries and organizations on the Internet, is “Do it, grab as much easy money as we can, then figure out what we are doing later”.

But global needs for 8000 Million humans now (and related species) means one-shot answers are not sufficient (I am trying to encourage the AI groups to use open methods so they can be trusted with life-critical systems).

Thanks for your hard work and insights.

Richard Collins, The Internet Foundation

Anne Meixner says:

Richard,

I am thrilled that an article written in Jan 2021 had relevance for you as you prepped for understanding semiconductor manufacturing and the data explosion.

As an engineer, I’ve become wary of collecting all data just because we may need it. It’s okay in the beginning. But if no one is doing anything with it, or no one has postulated situations in which it’s worth keeping, then you need to be deliberate about how long you keep data and in what form of storage.
