Using Smart Data To Boost Semiconductor Reliability

Tools and algorithms are improving faster than a willingness to share data, but at some point economics and talent shortages will force big changes.


The chip industry is looking to AI and data analytics to improve yield and operational efficiency, and to reduce the overall cost of designing and manufacturing complex devices. In fact, SEMI estimates its members could capture more than $60B in revenue through smart data use and AI.

Getting there, however, requires overcoming a number of persistent obstacles. Smart data utilization is currently in its infancy largely due to the challenges of ensuring data security, the substantial upfront investment in data processing and analytics, and many other issues. And while the industry is responding with proactive steps, taking both pre-competitive and collaborative approaches to implementing AI/ML and data analytics solutions, none of this is happening as fast as proponents would like.

“One challenge in this area is that data is being cherished and hoarded,” said Eli Roth, product manager for smart manufacturing at Teradyne and a member of the SEMI AI Industrial Advisory Council. “People are concerned that if they make their data available, somebody else is going to monetize it. A second challenge is that those data models are definitely expected to bend the cost curve, so their development is seen as a competitive advantage.”

The end goal is to enable smarter and less costly yield management, metrology, and inspection approaches, test insertions, and long-term lifecycle monitoring of semiconductors in use. But chipmakers have serious concerns about outright data/IP theft, data leakage, and disclosure of the ML/AI models they use to achieve competitive advantage.

“They asked us to encrypt their data on our testers,” Roth said. “But in a bigger sense, data analytics seeks to correlate the results of a front-end etch step to the impact on a known good die. So today we are focused on test productivity, like moving from predictive to preventative maintenance, reducing consumable costs, and energy use. Tomorrow it’s about information flow, feedback, and feedforward to provide device insights.”

In general, the industry is moving from a strong reliance on module-level optimization (litho/etch, for instance) and rules-based algorithms to machine learning-based algorithms that predict process behavior and yield changes across the semiconductor ecosystem.

“In the past, you would calibrate the equipment within the fab and make sure it is running within spec, and through a lot of trial and error and legacy knowledge you would reach yield goals and continuously improve it,” explained Vivek Jain, senior product manager at Synopsys. “But with best-in-class modeling capability of fab tools we are trying to effectively change the paradigm to achieve faster process ramp and high-volume manufacturing with better yield.”

Jain said one key to this implementation is the simultaneous modeling of defect inspection data, equipment data, and metrology data, rather than optimizing each unit operation individually. “If there is a defect excursion happening, you can automatically drill down and determine the process root cause of the yield loss.”

Over the last couple of years, ATE manufacturers have invested heavily in real-time compute-intensive processing, which can rapidly deploy as-needed follow-up tests and reduce the cost of test through outlier detection, adaptive testing, predictive modeling, and massive computational capability (see figure 1). “We are using the real-time data infrastructure that can be likened to the digital super-highway, on top of which we’ve built analytics,” said Michael Chang, vice president and general manager of Advantest Cloud Solutions (ACS). “But because it is an open architecture, customers also can incorporate third-party analytics. This approach unifies the server that is sitting on a test floor with built-in data security and application deployment.”

Fig. 1: Advantest ACS use cases target quality and yield improvement, prevention of field failures, and reductions in the cost of test (COT). Source: Advantest

Shrinking design and process margins is not new to the semiconductor industry. Yield management systems continually evolve with advances in data analytics and learning models, enabling more efficient characterization of defects, better correlation of process tool sensor data to advanced process control, and the combination of on-chip monitors and tester analytics. What’s new is a fundamental shift to data sharing among different parties, which is driving the need to secure and encrypt data, as well as what appears to be redundant capability between analytical approaches among yield management, ATE, EDA, and chipmakers.

YMS and metrology
One of the key functions of a fab’s yield management system (YMS) is to detect warning signs that a parameter or process is drifting and send alerts that corrective action is needed.

“The more data we can ingest into our system (meaning metrology data, FDC data from the tools, PCM test data, defect data, etc.), the better,” said Dieter Rathei, CEO of DR Yield. “I’m often asked, ‘How do we select the kind of parameters we should be monitoring?’ We always recommend a kind of brute force attack. Monitor all of them. You may have, let’s say, 100 key parameters you want to monitor. You’re monitoring these parameters closely and have feedback loops, but because they are so tightly controlled, nothing typically happens to them. So the next level of parameters become the critical ones, because now these are the uncontrolled parameters. The only way out of this kind of conundrum is to monitor everything.”
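
As a rough illustration of what "monitor everything" can mean in practice, the sketch below applies one drift check to every parameter rather than to a hand-picked subset. It is not DR Yield's method. It swaps in a textbook EWMA (exponentially weighted moving average) control chart, and the parameter names, the smoothing weight `lam`, the band width `k`, and the assumption that the first 20 readings are in control are all hypothetical choices made for the example.

```python
from statistics import mean, stdev


def ewma_drift_alerts(values, baseline_n=20, lam=0.2, k=3.0):
    """Return indices where the EWMA of `values` leaves its control band.

    Assumption: the first `baseline_n` points are in control and define the
    target mean and sigma for this parameter.
    """
    baseline = values[:baseline_n]
    mu, sigma = mean(baseline), stdev(baseline)
    # Steady-state half-width of the EWMA control band.
    half_width = k * sigma * (lam / (2.0 - lam)) ** 0.5
    ewma, alerts = mu, []
    for i in range(baseline_n, len(values)):
        ewma = lam * values[i] + (1.0 - lam) * ewma
        if abs(ewma - mu) > half_width:
            alerts.append(i)
    return alerts


def monitor_all(parameters, **kwargs):
    """Run the same check on every parameter, not only the key ones."""
    return {name: ewma_drift_alerts(series, **kwargs)
            for name, series in parameters.items()}


# Hypothetical data: one stable parameter, one drifting after the baseline.
baseline = [10.1, 9.9] * 10
parameters = {
    "stable_param": baseline + [10.1, 9.9] * 10,
    "drifting_param": baseline + [10.0 + 0.05 * (j + 1) for j in range(20)],
}
alerts = monitor_all(parameters)
```

Because the check is identical for every series, adding a "second-tier" parameter costs one more dictionary entry, which is the practical appeal of the brute-force approach Rathei describes.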

Engineers often refer to the advantage of having a digital thread when it comes to correlating data sources. “The availability of powerful algorithms for yield management means the YMS can track exactly how many device failures are screened by a particular algorithm,” said Rathei. “That provides a means of tracking the number of devices that have been lost by every algorithm, and what’s the yield loss associated with the different screening mechanisms. This is being closely watched, because there’s always this tradeoff between quality and cost efficiency.”
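
The bookkeeping Rathei describes can be pictured with a small tally. The sketch below is only an illustration under stated assumptions: each device record carries the name of the one screen that rejected it (or `None` if it passed), and the screen names and device IDs are made up for the example rather than taken from any real YMS.

```python
from collections import Counter


def yield_loss_by_screen(devices):
    """Tally rejects and yield loss per screening algorithm.

    `devices` is an iterable of (device_id, screen_name_or_None).
    Assumption: a rejected device is attributed to exactly one screen.
    Returns {screen_name: (reject_count, fraction_of_all_devices)}.
    """
    total = 0
    rejected = Counter()
    for _device_id, screen in devices:
        total += 1
        if screen is not None:
            rejected[screen] += 1
    if total == 0:
        return {}
    return {screen: (count, count / total) for screen, count in rejected.items()}


# Hypothetical lot of eight devices and two screening mechanisms.
lot = [
    ("d1", None), ("d2", "dpat_outlier"), ("d3", None), ("d4", "static_limits"),
    ("d5", None), ("d6", "dpat_outlier"), ("d7", None), ("d8", None),
]
loss = yield_loss_by_screen(lot)
```

Reporting both the count and the fraction makes the quality-versus-cost tradeoff visible per screen, since a screen that removes many devices but prevents few field failures is the first candidate for retuning.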

Chipmakers are embracing powerful algorithms and database approaches to shorten the design debug process, which currently takes between 12 and 18 months for a new process node. In some cases, up to 50% of the failed product cannot be traced to its root cause in manufacturing, which is giving rise to industry partnerships on both the hardware and software sides to enable greater efficiency.

For example, proteanTecs and PDF Solutions collaborate to help customers reveal design marginality or design-process sensitivities in order to improve yield, quality, reliability, and time-to-market by combining on-die data sources and analytics modeling with PDF’s Exensio analytics platform.

The platforms use different data sets and collaboratively analyze real-time data from testers with on-device monitors to analyze yield, bin distribution, and other parametric statistics using visualizations. This approach identifies inferred process parameters (IPP), which is essentially a die-level scoring of parameters to trace the cause of yield loss.

Latent defects, FA, RMAs
With continued scaling and the introduction of 3D device structures, the industry has a growing problem with latent defects. These are defects that do not cause failures at wafer probe, packaged device testing, or system-level testing, but later surface as field failures (RMAs). That, in turn, is driving alternative metrology and testing approaches.

More data, and the right kind of data, is needed to characterize the true source of latent defects. “We build metrology tools that rely on the physical standards that come off of TEMs and SEMs, and we’re data starved there when it comes to our analytics and finding the root cause,” said Mike McIntyre, director of software product management at Onto Innovation.

McIntyre described cases of ultrathin metal interconnects that induce electromigration and result in eventual device failures because of a process marginality. “The key is being able to connect the right data to draw your conclusions,” he said. Still, this is particularly challenging when it comes to systematic defects, which require extensive feedback from failure analysis of parts.

Lifecycle management approaches, meanwhile, seek to connect formerly disparate data sources. “With Fab.da, we are creating an end-to-end data continuum, all the way from mask design to manufacturing and monitoring device performance in the field,” said Synopsys’ Jain.

The basic idea is to improve fab-wide decision making and control costs. Cost control is becoming particularly difficult for mission- and safety-critical designs, where demand for reliability requires higher metrology sampling rates and more test insertions.

“You need higher sampling rates, especially when moving towards higher quality requirements, for instance driving towards parts-per-billion defect rates in automotive applications,” said Frank Chen, director of applications and product management at Bruker. “One key way to achieve that is to sample often, and sample earlier in the process. Cost modeling shows that in some applications, it makes sense to use 100% inspection, while in others, even a 30% sampling rate will deliver most of the gains, so there are tradeoffs. X-ray metrology can be inserted right after die-attach to capture what’s happening on the die with respect to bond quality.”
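
The tradeoff Chen describes can be made concrete with a toy cost model. This is not Bruker's cost model; it is a minimal sketch built on loudly simplified assumptions: a process excursion makes every subsequent unit defective until one defective unit is inspected, each unit is inspected independently with probability `s`, inspection always catches the defect, and uninspected defective units ship as escapes. Under those assumptions the expected number of escapes per excursion is (1 - s) / s, and the expected cost per excursion cycle is

    cost(s) = inspect_cost * units_per_excursion * s + escape_cost * (1 - s) / s

which is minimized at s* = sqrt(escape_cost / (inspect_cost * units_per_excursion)), capped at 100%. All parameter names and numbers below are hypothetical.

```python
import math


def expected_cost(s, inspect_cost, escape_cost, units_per_excursion):
    """Expected cost per excursion cycle at sampling rate `s` (toy model)."""
    if not 0.0 < s <= 1.0:
        raise ValueError("sampling rate must be in (0, 1]")
    inspection = inspect_cost * units_per_excursion * s
    # Expected uninspected defective units before the first inspected one.
    escapes = escape_cost * (1.0 - s) / s
    return inspection + escapes


def best_sampling_rate(inspect_cost, escape_cost, units_per_excursion):
    """Sampling rate that minimizes expected_cost, capped at 100%."""
    s_star = math.sqrt(escape_cost / (inspect_cost * units_per_excursion))
    return min(1.0, s_star)


# Hypothetical numbers: cheap escapes favor partial sampling,
# expensive escapes (e.g. safety-critical parts) push toward 100%.
consumer_rate = best_sampling_rate(1.0, 900.0, 10_000)
critical_rate = best_sampling_rate(1.0, 50_000.0, 10_000)
```

In this toy model the escape term falls off as 1/s, so most of the benefit arrives at modest sampling rates unless an escape is very expensive relative to inspection, in which case the optimum saturates at full inspection.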

Chen noted that random sampling alone, for example of wafer bumps, often fails to capture features of interest such as localized warpage.

An ideal yield management strategy
So what does an ideal yield management strategy really look like?

“Today an ideal yield management strategy is integrated into a fully automated factory that is getting all the data from all the tools, all the test data, tool history, combines all the data in a structured data warehouse, and it runs all our anomaly detection algorithms on the data and lets people watch the developments that are detected by the software and act on them,” said Rathei.

However, with the necessity of hot lots, split lots, monitor wafers in a lot, and other logistics, semiconductor manufacturers can underestimate both the complexity of their own data and how much work goes into implementing a YMS in working fabs and OSAT operations. “We have all the necessary components, like automatic recipe upload, automated scheduling of lots, automated data collection, etc. But attention must be paid to the details. For instance, if you look at the wafer coordinate systems, the analysis must capture the exact location of a defect on a bin map, which correlates with the defect inspection coordinate map locations because even if there’s an offset of a few micrometers, defects could be associated with the wrong chip, thereby leading to incorrect killer defect flagging, so a lot of work goes into validating these overlays,” added Rathei.
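
The overlay problem Rathei mentions is easy to reproduce in miniature. The sketch below assumes a simple rectangular die grid with a known pitch and origin, which is a simplification of real wafer maps. The function names, the 5,000µm pitch, the 3µm tool offset, and the bin labels are hypothetical values chosen to show how a defect near a die edge lands on the wrong die when the offset is left uncorrected.

```python
import math


def die_index(x_um, y_um, pitch_x_um, pitch_y_um,
              origin_x_um=0.0, origin_y_um=0.0):
    """Map a wafer coordinate in micrometers to a (col, row) die index."""
    col = math.floor((x_um - origin_x_um) / pitch_x_um)
    row = math.floor((y_um - origin_y_um) / pitch_y_um)
    return col, row


def failing_dies_with_defects(defects_um, bin_map, pitch_um,
                              offset_um=(0.0, 0.0)):
    """Return failing dies that contain at least one reported defect.

    `offset_um` is the known inspection-tool offset, subtracted from each
    reported defect coordinate before the die lookup.
    """
    hits = set()
    for x, y in defects_um:
        die = die_index(x - offset_um[0], y - offset_um[1], *pitch_um)
        if bin_map.get(die) == "fail":
            hits.add(die)
    return hits


# Hypothetical case: the defect truly sits at x = 9,998 um on failing die
# (1, 0), but the inspection tool reports it 3 um to the right.
bin_map = {(1, 0): "fail", (2, 0): "pass"}
reported = [(10_001.0, 2_500.0)]
pitch = (5_000.0, 5_000.0)

uncorrected = failing_dies_with_defects(reported, bin_map, pitch)
corrected = failing_dies_with_defects(reported, bin_map, pitch,
                                      offset_um=(3.0, 0.0))
```

Without the correction the defect is assigned to the passing neighbor and the failing die shows no defect at all, which is exactly the kind of mis-association that overlay validation is meant to catch.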

Economics is always an issue, however. “The reason you don’t see 100% inspection in a lot of the semi areas is that cost of ownership just does not get there, the tools aren’t fast enough and the floor space is too expensive,” said Brad Perkins, product line director at Nordson Test & Inspection. “Really, you want to know the data out of every part because then you can catch process drift faster. You can have your spec limits tighter. And the closer you can get under spec limits to where your control limits are, the better your product is and the less failures you’re going to have in the field long-term, because electrical parameters are going to be closer to an ideal mark.”

Consider a power device that has been encapsulated and underfilled, for instance. “Failures often relate to voids that are caused by stress concentrators that are going to cause either thermal or mechanical failure, which shows up in reliability testing,” said Perkins. “Ultrasonic acoustic imaging is well-suited to detecting voids in underfill, either before or after curing, because high-density materials like solder cast a dark shadow, and air pockets or pores cast a light shadow when imaged.”

That data, combined with other data, can help pinpoint problems earlier in the fab and packaging houses. “It’s moved from a point of just collecting the data every now and then, to looking at the data collected and saying everything’s under control,” Perkins noted. “It’s almost feeding on itself and improving the process yields. That’s why we’re seeing the sampling rates and inspection requirements going up across the board. The more you inspect, the more data you get, with the caveat being the data we’re getting out today must be utilized. It’s not data for data’s sake, which might have been the case decades ago, where we collected it but either didn’t have time to analyze it or the relationships were too complex to tie together Machine A, Material B, Machine C, Material D, as well as process parameters and humidity. Now we’re able to identify interactions we never would have found in running a standard DoE on a couple of pieces of equipment.”

The industry increasingly is looking to ML data analytics to address shrinking design and process margins and latent defects, and to ramp yield more rapidly on new process nodes. For now, collaborations are beginning to show what is possible when data from EDA, yield management, metrology, inspection, and test is linked together, and massive computing power is brought to bear on fab operations.

The chip industry is poised to reach $1 trillion over the next several years as semiconductor demand continues to grow. But there will never be enough engineers and technicians. Some form of AI will be required to fill in the gaps, making engineers more productive while improving the reliability of products.

“There’s a dearth of subject matter experts, so advanced modeling capabilities and the power of AI and ML will definitely boost the ability of engineers to get to yield entitlement faster, and enable accurate root cause analysis,” said Synopsys’ Jain.

Related Reading
Customizing IC Test To Improve Yield And Reliability
Identifying chip performance specs earlier can shorten the time it takes for processes to mature and lower overall test costs in manufacturing.
Streamlining Failure Analysis Of Chips
Identifying nm-sized defects in a substrate, mixing FA with metrology, and the role of ML in production.
