Finding And Applying Domain Expertise In IC Analytics

It takes a team of experts to set up and effectively use analytics.


PowerPoint slides depicting the data inputs and outputs of a data analytics platform belie the complexity, effort, and expertise that improve fab yield.

With the tsunami of data collected for semiconductor devices, fabs need engineers with domain expertise to effectively manage the data and to correctly learn from the data. Naively analyzing a data set can lead to an uninteresting answer, and it can send engineers scrambling to solve a non-existent problem, which is a waste of time and resources.

While data storage costs remain low, porting all the raw data into a data analytics computation is neither effective nor desirable. For even the simplest of analyses, the analytics platform’s processes must be reliable and the data needs to be aligned. Only a team of experts has the collective and nuanced knowledge to properly encode and store the data to facilitate complex analytic use cases.

Increased device traceability offers the prospect of learning across the supply chain to create models that predict yield excursions and identify potential quality mishaps. To do so with confidence and swiftness requires engineers with content knowledge regarding manufacturing equipment, test measurement, on-die monitors and system data. This also requires engineers who understand the data pipeline flow and use cases to build the underlying data structures needed to fuel the data analytics engine.

Misapplications of data analytics caused by a lack of domain knowledge exist all along the semiconductor supply chain. Two examples:

  • An IC device experiences a yield excursion at wafer probe. The test data indicates a problem in a specific IP/circuit block, and engineers find a correlation between the failing IP and a transistor parameter. Weeks of effort go into changing the transistor parameter’s number, yet the devices still fail at test. Eventually a call is made to the team that qualified the product, and the team learns the IP circuit block doesn’t contain that transistor type. The engineers were pursuing a spurious correlation.
  • A consumer electronics company provides a general data analytics company with all the manufacturing data from equipment monitors to factory employee ID numbers in a move to increase its SMT yield. The data is crunched with an array of data analytic techniques, and the analytics firm shares a yield dependency — solder capability is affected by reflow oven temperature control. Approximately $4 million was spent to determine a simple dependence that any engineer or technician on the factory floor would know about.

So how can data be successfully leveraged? Consider a fab that operates with all SPC indicators in control, yet has impactful yield excursions. The data analysis team looking at all the data identifies more than 5,000 parameters that show a correlation to yield excursions, and it isn’t obvious at first glance which of them matter. Reducing these to the top 10 requires expertise on the semiconductor process side. A data scientist simply doesn’t know which ones to ignore. For that you need semiconductor process and design experts.
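To illustrate why a long list of correlated parameters needs expert pruning, here is a minimal sketch that screens purely random "parameters" against a random "yield" signal. All values are synthetic, and the lot count, parameter count, and correlation threshold are assumptions chosen for illustration, not figures from any fab. Even though nothing here is causally related, a few hundred parameters typically clear the threshold by chance alone.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def count_chance_correlations(n_lots=30, n_params=5000, r_threshold=0.361, seed=0):
    """Count random parameters whose |r| against random yield exceeds the threshold.

    0.361 is roughly the two-sided 5% critical value of r for 30 samples,
    so about 5% of unrelated parameters are expected to pass.
    """
    rng = random.Random(seed)
    yields = [rng.gauss(0.0, 1.0) for _ in range(n_lots)]
    hits = 0
    for _ in range(n_params):
        # Each synthetic parameter is generated independently of yield.
        param = [rng.gauss(0.0, 1.0) for _ in range(n_lots)]
        if abs(pearson(param, yields)) > r_threshold:
            hits += 1
    return hits


chance_hits = count_chance_correlations()
print(f"{chance_hits} of 5000 unrelated parameters look correlated with yield")
```

A statistical screen cannot tell these chance hits from physically meaningful ones. That ranking step is where process and design knowledge comes in.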

Types of domain expertise needed
Naturally it is not possible for a single engineer to know all the design and manufacturing data from fab to in-field application.

“We would be doing a disservice saying that there is an expert available who understands all of these factors. It’s not rare that you would find an expert in any one area,” said Mike McIntyre, director of software product management at Onto Innovation. “But to find an expert who covers all these areas is probably non-existent. At best, you may find people who are experts in one area, two, with a passing familiarity and understanding of the other spaces.”

But that passing familiarity comes in handy. “Part of being a domain expert is also the awareness of where you’re not a domain expert, and understanding where the fidelity of your knowledge starts to drop off,” said Rob Knoth, product management group director in the Digital & Signoff Group at Cadence. “There’s a value in having domain expertise, but you have to be really aware of when you need other domain experts to partner with you. And then that helps create a better solution than if you just tried to go alone.”

Simply put, a team of domain experts is needed to properly set up the data analytics platform for success. Joe Pong, senior advise analyst from Amkor’s IT team, describes the major areas needed to bring about this success. “In our experience, for a successful big data project, three roles/experts are crucial,” he said. “First, end users (normally engineers), which are the consumers of the project can give use cases, value (benefits) to a company, and guidance. Second, data scientists can help with devising and exploring the insight of data through sophisticated statistical graph. And last but not the least, IT experts will design and implement the data pipeline and computing architecture to ensure performance of the project is acceptable, and the retention of data is sufficient.”

Fig. 1: Fundamental domain expertise is the fuel behind an efficient data analytics operation. Source: A. Meixner/Semiconductor Engineering

Understanding the various use cases provides additional insights into the breadth and depth of expertise that engineers bring to supporting a successful implementation.

“You need multiple people from each domain,” said Paul Simon, group director of silicon lifecycle analytics at Synopsys. “In our case, this includes manufacturing, design, test, assembly, and in-field for the on-chip data. So all that data has very different structures. It has different ways by which it arrives into our systems, different ways it gets collected, and different content. For example, you want to have on-chip observability through sensors. You need to define how that data ends up in your analytics system so you can correlate to manufacturing or diagnostics data. This requires people who understand the on-chip sensors and the associated data. Next, you need people who understand how you efficiently perform data ingestion before you store the data. Then there is data model or database expertise, which is how to store that data, in what tables, what particular database type. There is never just one database. Naturally, for different technologies we have different databases to handle different kinds of use cases or different requirements that we have for the data.”
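As a rough sketch of what "correlating on-chip data to manufacturing data" involves at the record level, the example below joins two data sources on a shared die identifier. The field names (`lot`, `wafer`, `x`, `y`, `bin`, `ring_osc_mhz`) and the sample rows are hypothetical. Real sources arrive in different formats and usually need normalization before a key like this exists.

```python
def die_key(row):
    """Build a traceability key from lot, wafer, and die coordinates."""
    return (row["lot"], row["wafer"], row["x"], row["y"])


def join_by_die(probe_rows, sensor_rows):
    """Join wafer-probe rows with on-chip sensor rows that share a die key.

    Returns the joined rows plus the keys of probe rows with no sensor match,
    so missing data is reported instead of silently dropped.
    """
    sensor_index = {die_key(r): r for r in sensor_rows}
    joined, unmatched = [], []
    for row in probe_rows:
        key = die_key(row)
        match = sensor_index.get(key)
        if match is None:
            unmatched.append(key)
        else:
            joined.append({**row, **match})
    return joined, unmatched


# Hypothetical sample data.
probe = [
    {"lot": "A1", "wafer": 3, "x": 10, "y": 4, "bin": 1},
    {"lot": "A1", "wafer": 3, "x": 11, "y": 4, "bin": 7},
]
sensors = [
    {"lot": "A1", "wafer": 3, "x": 10, "y": 4, "ring_osc_mhz": 812.5},
]

joined, unmatched = join_by_die(probe, sensors)
print(joined)
print(unmatched)
```

Returning the unmatched keys is a deliberate choice. Gaps in traceability are themselves something the domain experts need to see.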

That kind of specific data is critical. “We look at certain kinds of problems, and we’re designing the algorithms to fit those problems,” said Marc Hutner, senior director of product marketing at proteanTecs. “Essentially, we co-design between the circuit IP and the algorithm. To do this, we have a group that consists of machine learning and database experts. Then, we also have the design IP generation experts. Our value is in bringing the experts of the IP design with the ML together with the database experts to create analytic solutions specific to these problems.”

Every piece of equipment requires deep knowledge, as well. “When you buy test equipment from an ATE company, you’re also buying the expertise on how to use it and how to apply it,” said Eli Roth, smart manufacturing product manager at Teradyne. “We can partner with our customer and highlight what’s relevant in a test, what’s going to help them create connections and insights, and what’s unique in the test equipment that can create a useful signal. In addition, as the ATE expert, we know how to get things out of the system in real-time, or near real-time. If they want data to drive control in our testers, they don’t know all the control knobs. Our knowledge assists them in this area.”

Fig. 2: Design and manufacturing expertise at every step is tied together in a well-designed data analytics platform. Source: A. Meixner/Semiconductor Engineering

Connecting the data dots
While the individual pieces are critical, successfully using and interpreting analytics from manufacturing to in-field data sources requires engineers/users who possess a holistic picture. In other words, you need a generalist.

“The term ‘domain expert’ is somewhat abused,” said Gajinder Panesar, fellow at Siemens EDA. “In my opinion, it means being able to understand the design spectrum from architecture to implementation, from verification to validation and in-field usage. You need to be in touch with all those things. Having a domain expert who’s just a verification engineer, or someone who just knows how to code RTL and nothing more, is not enough. You need to have people who understand where all that data comes from and how it can be used.”

Others agree. “Typically, piecing data together is still highly dependent on individuals who understand the content and how these different data domains come together,” said Onto’s McIntyre. “In the marketplace going forward, there is value for the jack-of-all-trades from a connectivity aspect. They don’t necessarily have to be an expert in any one area. But having a functional knowledge across the wider space allows them to ask the right question or to sniff out the wrong signals.”

But it’s not just the content expertise about the design or manufacturing. Engineers who understand the data operations and IT systems are key, as well.

“Advantest has amassed significant experience and expertise regarding how data across different test insertions affect models and data analytics. We can cite several use cases to highlight this area,” said Keith Schaub, vice president of technology and strategy at Advantest America. “We are building models both internally and in conjunction with several customers and external partners focused in this area. Advantest has specialized knowledge in the IT, automation flow since we work across the entire wafer-sort and OSAT ecosystem, which has customized IT requirements.”

Designing and manufacturing semiconductors has only become more complex, and the interaction between values necessitates domain expertise. When connecting measurements from multiple sources, engineers must understand what and how to model, while knowing which event data should be stored from those measurements.

“It’s easy to track univariant effects,” said McIntyre. “It’s a lot harder to track multivariant effects. Consider this scenario: I could have a narrow metal 1, but it’s still in spec so my product should be good. But that material happened also to get a narrow metal 2, also within my spec, and unfortunately this product also got a narrow metal 3. Again, all three individual layers from a linewidth perspective are in spec and in control. But the cumulative effect of having those three layers all thin created a failure. The failure could have been timing-related, could have been power density, could be electromigration-related. But the point is, it’s the cumulative effect of three things that were all in control individually. Had the metal 2 been normal, or had the metal 2 been fat, I wouldn’t have seen this effect.”
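The scenario McIntyre describes can be sketched in a few lines. The spec limits, linewidth values, and the cumulative limit of -2.0 below are invented for illustration. A real check would be derived from timing, power, or reliability models rather than a simple sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    name: str
    target_nm: float
    lower_nm: float
    upper_nm: float


def in_spec(spec, width_nm):
    """Univariate check: is this one layer inside its own limits?"""
    return spec.lower_nm <= width_nm <= spec.upper_nm


def normalized_deviation(spec, width_nm):
    """Deviation from target, scaled so the lower limit is -1 and the upper is +1."""
    if width_nm >= spec.target_nm:
        return (width_nm - spec.target_nm) / (spec.upper_nm - spec.target_nm)
    return (width_nm - spec.target_nm) / (spec.target_nm - spec.lower_nm)


def is_cumulatively_narrow(specs, widths_nm, limit=-2.0):
    """Multivariate check: flag material whose summed deviation is too negative."""
    total = sum(normalized_deviation(s, w) for s, w in zip(specs, widths_nm))
    return total <= limit


# Hypothetical specs for three metal layers.
specs = [
    LayerSpec("metal 1", 40.0, 36.0, 44.0),
    LayerSpec("metal 2", 44.0, 40.0, 48.0),
    LayerSpec("metal 3", 50.0, 45.0, 55.0),
]

all_narrow = [37.0, 41.0, 46.0]     # every layer narrow, every layer in spec
metal2_normal = [37.0, 44.0, 46.0]  # same material, but metal 2 on target

print(all(in_spec(s, w) for s, w in zip(specs, all_narrow)))
print(is_cumulatively_narrow(specs, all_narrow))
print(is_cumulatively_narrow(specs, metal2_normal))
```

Each layer passes its own check in both cases, yet only the combined view separates the material that is narrow on all three layers from the material with a normal metal 2.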

Understanding at the physics level guides the selection of data that might matter.

“Looking at this from a physics perspective, you pick the right response to look at. Then you rely on observing this response through different stages of the process: as early as E-test, then wafer probe, final test, burn-in, eventually in the field,” said Andrzej Strojwas, CTO at PDF Solutions. “You now have the technology to use on-die sensors. This is where understanding the physics underneath guides picking the right parameter and designing a sensor to observe it.”

Along with any measurement, engineers find value in saving the context in which that measurement is taken.

“It gets really complicated, especially on the test side,” said proteanTecs’ Hutner. “You want to record all these other conditions and information about why you made the decision the way that you did. In the past, that hasn’t been well captured — if at all. So it’s usually a single test value with no context. And if you don’t connect those other pieces of data, metadata, at the time of measurement, it’s very challenging to figure out which pieces to include after the fact.”
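One way to picture "a test value with its context" is a record that stores the conditions alongside the measurement. The sketch below is only an assumption about what such a record might hold. The field names and values are hypothetical and not taken from any test data standard.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Measurement:
    device_id: str
    test_name: str
    value: float
    # Context captured at the time of measurement (hypothetical fields).
    insertion: str
    temperature_c: float
    tester_id: str
    program_rev: str


def mean_by_context(records, context_field):
    """Average the measured value per distinct value of one context field."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, context_field)].append(record.value)
    return {key: mean(values) for key, values in groups.items()}


records = [
    Measurement("D1", "idd_ma", 1.00, "final_test", 25.0, "T07", "r12"),
    Measurement("D2", "idd_ma", 1.02, "final_test", 25.0, "T07", "r12"),
    Measurement("D1", "idd_ma", 1.30, "final_test", 85.0, "T09", "r12"),
    Measurement("D2", "idd_ma", 1.32, "final_test", 85.0, "T09", "r12"),
]

by_temperature = mean_by_context(records, "temperature_c")
print(by_temperature)
```

Without the temperature field, the four values look like one noisy distribution. With it, they separate into two tight groups. In this sample the tester also changes with temperature, so the data alone cannot say which one drives the shift.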

What data insight matters
Understanding the use case matters because it determines the data that’s needed, the relationship between data sources, and the important insights that can be gained from it. Otherwise, data relationships are created and data is stored in a manner that has no value to the questions you want to answer.

“We now brand those products as tests and embedded analytics. We have a series of monitors that get embedded in the chip, and basically they can gather data throughout the whole lifetime of the chip,” said Aileen Ryan, senior director, portfolio strategy for Tessent silicon lifecycle solutions at Siemens EDA. “At different stages of their lifecycle, different types of data are going to be useful and interesting to you. One of the real challenges that we have working with customers is gaining a mutual understanding of which data is necessary. You can gather lots and lots of data, but ultimately, what does it mean? And is it really useful? And honestly, you have no business if you’re just sucking up all the data and hoping that it’s going to be useful at some time in the future.”

Consider the design of a 5G system, for example. “Somebody designing the baseband design doesn’t know how the interconnects work and how outstanding transactions in flight can affect the overall end system,” said Panesar. “Collecting raw functional data means nothing to the person doing the baseband. They want those numbers translated into something meaningful to them.”

Content and database experts together can best decide how to store the data so that it is efficient.

“You test a wafer, and that results in a data file of 14 megabytes, which contains parametric data, functional data, binning data, etc. The data structure of test data is completely different from fab equipment sensor data,” said Synopsys’ Simon. “The data scientists and the data modeling engineers will say, ‘Well, we’re not going to store every millisecond. We’re going to store only a summary of it.’ But then you need to have the domain experts involved to make sure this summary actually covers the use case you want the analytics to support.”

Strojwas concurs, noting that sometimes it’s not the absolute value but the shape of the curve that is important. “I would never want to store the entire time series out of an equipment’s sensors because that would be cost-prohibitive. But we had some examples that were really unexpected, like in a rapid thermal anneal (RTA) process. When we looked at the temperature ramp, it turned out that big differences in terms of the actual temperature did not necessarily matter. Sometimes the slope or the shape of the waveform matters.”
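The trade-off between storing a full sensor trace and storing a summary can be sketched as follows. The two temperature traces are made up, and the choice of summary features (peak value and steepest ramp) is an assumption for illustration. Which features are worth keeping is exactly the question the domain experts have to answer.

```python
def summarize_trace(times_s, temps_c):
    """Reduce a sensor time series to a few stored features.

    Keeps the peak temperature and the steepest ramp rate between
    consecutive samples, instead of every sample.
    """
    ramps = [
        (temp1 - temp0) / (time1 - time0)
        for time0, time1, temp0, temp1 in zip(times_s, times_s[1:], temps_c, temps_c[1:])
    ]
    return {
        "peak_c": max(temps_c),
        "max_ramp_c_per_s": max(ramps),
        "n_samples": len(temps_c),
    }


# Two hypothetical anneal traces that reach the same peak temperature.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
fast_ramp = [400.0, 700.0, 1000.0, 1000.0, 1000.0]
slow_ramp = [400.0, 550.0, 700.0, 850.0, 1000.0]

fast_summary = summarize_trace(times, fast_ramp)
slow_summary = summarize_trace(times, slow_ramp)
print(fast_summary)
print(slow_summary)
```

A summary that stored only the peak would make these two runs indistinguishable. Adding a ramp feature preserves the difference in shape at a fraction of the storage cost.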

For new use cases, data analytic tools need the flexibility to enable domain experts to connect their data in novel ways.

“Of course, when we deliver the tool, we have some preset rules, which allows a customer to immediately take advantage of the tool. But everyone has different needs. Sometimes it’s way different than what we imagine. The user wants to launch some very innovative way of manipulating information that they have,” said Philippe LeJeune, CTO of Galaxy Semiconductor. “For that you need an open tool where you can plug your own IP, where you can connect your own plumbing. Yet you have a lot of bricks you can readily leverage. Sometimes we are really impressed by what our customers are doing. But that’s great, because they do even more than what we could think of, and that’s kind of exciting.”

Often end users seek new insights when combining data sources for the same material or product.

“Consider that my design uses three agent types, each for a particular kind of insight, e.g., workload, temperature, timing,” said Hutner. “If I wanted to create a new kind of insight, I could throw these three agents into the data analytics solution and, in effect, collect metadata. This takes a very different approach to thinking about the kinds of data our customers can use with these agents. So that represents a scalability aspect about how we can combine our data. But then we’re also thinking, ‘How do we bring data from other sources together to drive something new out of the platform?’”

In the end, it comes down to storing the correct data to facilitate the use cases that provide insights engineers and engineering managers can act upon. To do so, you need a team of experts.

“I’m building teams to do this, and what I see is that you need all these levels of expertise working together,” said Simon. “Otherwise, it’s not going to work. Somebody who understands machine learning tools is not able to build a model because she doesn’t understand the domain. So you need to have next to her somebody who understands the domain and somebody who understands the analytics. This holds for machine learning and for other statistical models, or more classic types of analysis. If you want to design a chart, a correlation chart, or a trend that is useful for an end user to draw certain conclusions, you need to have the domain knowledge. And if your data model doesn’t support that, then you have difficulties.”
