Testing In Context Gaining Ground

Why simple pass/fail testing no longer works.


Testing in context is beginning to gain wider appeal as chip complexity increases, and as ICs are deployed in more safety-critical and mission-critical applications.

While design in context has been the norm for SoCs for some time, adoption of a similar approach in test has been slow. Cell-aware testing technology was first described a decade ago, and its adoption since then has been modest. But with rising heterogeneity, multiple modes of operation, yield concerns at the most advanced nodes, and growing use of advanced packaging, that kind of testing approach is gaining traction and starting to spread to other areas of test.

At the recent International Test Conference in Washington, D.C., new papers and presentations focused on device-aware testing and variation-aware testing. There was even a tutorial on power-aware testing for IoT devices.

Zhan Gao, a graduate student at Eindhoven University of Technology, presented a paper at ITC on “Application of Cell-Aware Test on an Advanced 3nm Technology Library.” According to the abstract, cell-aware test (CAT) targets realistic internal manufacturing defects in cells. Using a CAT flow from Cadence that focuses on defect location identification and characterization, along with cell-aware ATPG [automatic test pattern generation], researchers reported a 73.5% reduction in defects for an experimental standard-cell library in imec’s 3nm CMOS node.

Other papers reported successful implementations of context-aware testing, as well. Moritz Fieback, a graduate student at Delft University of Technology, presented a paper on “Device-Aware Testing: A New Test Approach Towards DPPB” (defective parts per billion). The Delft researchers examined the impact of physical defects on a device’s electrical parameters prior to circuit and fault simulation.

Stefan Holst, assistant professor at Kyushu Institute of Technology, presented a paper on “Variation-Aware Small Delay Fault Diagnosis on Compacted Failure Data.” Researchers there targeted small delay faults in advanced-node chips, which are difficult to detect due to limited available failure data and unknown delay variations. Their work utilizes a new algorithm to diagnose timing issues.

A paper on generating timing-slack-based cell-aware tests also was submitted at the conference.

To put this in perspective, researchers are returning to context-aware testing as a way of adding granularity into the test process for complex designs.

“When you’re dealing with testing and automating testing, there’s always going to be an imbalance between the convenience of the fault models,” said Geir Eide, product marketing director for Tessent Design-for-Test at Mentor, a Siemens Business. “It’s kind of an oversimplification of the problem of the actual defects and how accurately those respond to reality. For 30 years, people have used ‘stuck-at’ in decisions. Whenever there is a new process, or just for the sake of its being new, there’s always a lot of interest in your fault models. And whenever there is kind of a fundamental shift, like finFETs, there is usually interest in adoption in fault models. In addition, when there is a shift into new markets that have different quality requirements — like, right now, with automotive — there are a lot of companies that are suddenly looking into ways of improving quality.”
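To make the stuck-at model Eide mentions concrete, here is a minimal sketch of stuck-at fault simulation. The three-input circuit, the net names, and the helper functions are all hypothetical and chosen only for illustration. Commercial ATPG tools work on full netlists and much richer fault models.

```python
# Toy circuit (an assumption for illustration): y = (a AND b) OR c
NETS = ["a", "b", "c", "n1", "y"]


def simulate(a, b, c, fault=None):
    """Evaluate the circuit; `fault` is an optional (net, stuck_value) pair."""
    values = {"a": a, "b": b, "c": c}

    def force(net):
        # Override a net with its stuck value if the fault sits on that net.
        if fault is not None and fault[0] == net:
            values[net] = fault[1]

    for net in ("a", "b", "c"):
        force(net)
    values["n1"] = values["a"] & values["b"]
    force("n1")
    values["y"] = values["n1"] | values["c"]
    force("y")
    return values["y"]


def detected_faults(patterns):
    """Return the stuck-at faults for which some pattern changes the output."""
    faults = [(net, v) for net in NETS for v in (0, 1)]
    return {
        f
        for f in faults
        for p in patterns
        if simulate(*p) != simulate(*p, fault=f)
    }


def coverage(patterns):
    """Fraction of the 2 * len(NETS) stuck-at faults that the patterns detect."""
    return len(detected_faults(patterns)) / (2 * len(NETS))
```

In this toy circuit the single pattern (1, 1, 0) detects 4 of the 10 stuck-at faults, while the four patterns (1, 1, 0), (0, 1, 0), (1, 0, 0) and (0, 0, 1) detect all of them. Cell-aware approaches replace this simple list of stuck nets with defects modeled inside each cell.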

One of the new twists involves more data being collected during various process steps, which can have a big impact on test. The challenge is that not all of that data is relevant, and not all of it is clean.

“It’s important to test in context,” said David Park, vice president of marketing at PDF Solutions. “The problem is that in some cases there is too little data, and in other cases there is too much. The amount of data being generated today is increasing faster than Moore’s Law can handle, and what you’re really looking at here is a single device traceability problem. You want to know which piece of silicon is faulty, and to do that you need to know the exact lot and the date of that production. From there you want to go back and check where that lot came from.”

To be effective, domain expertise needs to be layered across that, along with a combination of real-time and near-real-time analysis of that data. But that, in turn, creates other issues.

“The key is allowing test engineers to have an app on their testers that is upgradeable over the Internet, and which displays the contents of the standard test data formats generated on the tester,” said John O’Donnell, CEO of yieldHUB. “You also need to analyze and chart them, so you want to use the cloud to ingest the output of the edge extremely quickly and allow volume analysis, as well as access to volume analysis on the tester, for a comparison with what the current lot is doing. Comments can be added via the edge, which builds up the knowledge base in the cloud. So the edge and the cloud must work in tandem and across all operating systems.”

Increasing awareness
At least part of the challenge is that what is tested for one application may differ significantly from what is tested for another. This is particularly true in the 5G space, where there is no precedent for mass testing of millimeter wave antenna arrays. As with other aware-type testing, much of the testing is based upon statistics and data distributions.

“Most of it is measurement parameter-type testing,” said David Hall, chief marketer at National Instruments. “In the specification document of a device like an RF front-end module or a beamformer, that device will have a guaranteed specification around certain performance characteristics. It will have output power and modulation quality, and it will have metrics and performance specifications that are specific to the standards body-type definition. So for example, the 3GPP (3rd Generation Partnership Project) will specify for a 5G device that the modulation quality must be better than X percent, or the emissions into an adjacent channel must be better than X dB. And if you’re building a device like a power amplifier, you might specify that the emissions in a harmonic channel may only be X number of dB. You can add on some of the measurements around DC, like current consumption and efficiency. All of these kinds of metric level measurements are what we would classify as the parametric measurements, or parametric testing that’s done to verify a device.”
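The kind of parametric check Hall describes can be sketched as a comparison of measured values against specification limits. The parameter names and every number below are invented placeholders, not values taken from 3GPP documents or from any datasheet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    """One specification limit; either bound may be left open."""
    name: str
    unit: str
    low: Optional[float] = None
    high: Optional[float] = None

    def passes(self, value: float) -> bool:
        if self.low is not None and value < self.low:
            return False
        if self.high is not None and value > self.high:
            return False
        return True


# Hypothetical limits for an RF front-end style device.
LIMITS = [
    Limit("output_power", "dBm", low=23.0),
    Limit("evm", "%", high=3.5),              # modulation quality
    Limit("aclr", "dBc", high=-30.0),         # adjacent-channel emissions
    Limit("supply_current", "mA", high=450.0),
]


def failing_parameters(measurements: dict) -> list:
    """Return the names of parameters outside their limits, in LIMITS order.

    `measurements` must contain one value for every name in LIMITS.
    """
    return [lim.name for lim in LIMITS if not lim.passes(measurements[lim.name])]
```

A device passes the parametric screen when `failing_parameters` returns an empty list. Keeping the measured values, rather than only the pass/fail verdict, is what allows the statistical analysis discussed in this section.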

One caveat is that tests based upon data also need to be aware of other tests based on other data. So even though a test shows a particular device is operating within an acceptable range, multiple chips in a package or multiple devices working together can affect each other. In some cases, variations can be additive or multiplicative.
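A short worked example shows how parts that pass individually can still miss a system-level budget. The 5% part limit, the 10% system budget, and the three deviations are assumed numbers used only to illustrate additive and multiplicative stacking.

```python
import math

PART_LIMIT = 0.05     # hypothetical per-part tolerance: +/-5% of nominal
SYSTEM_BUDGET = 0.10  # hypothetical system-level budget: +/-10% of nominal


def additive_stack(deviations):
    """Combined deviation when the individual deviations simply add."""
    return sum(deviations)


def multiplicative_stack(deviations):
    """Combined deviation when each part scales the result, as in a cascade."""
    return math.prod(1.0 + d for d in deviations) - 1.0


# Three parts, each 4% above nominal, so each one passes its own limit.
deviations = [0.04, 0.04, 0.04]
each_part_passes = all(abs(d) <= PART_LIMIT for d in deviations)
system_passes_additive = abs(additive_stack(deviations)) <= SYSTEM_BUDGET
system_passes_multiplicative = abs(multiplicative_stack(deviations)) <= SYSTEM_BUDGET
```

Each part is 4% off nominal and passes its own 5% limit, yet the combined deviation is about 12% under either model, which exceeds the 10% budget.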

“There is a lot of talk about adaptive test in manufacturing,” said Michael Schuldenfrei, corporate technology fellow at OptimalPlus. “This is important if one of your key inputs is changing. So instead of just taking information from inside of the device being tested and manufactured, you look at all of the data holistically. But that also requires real-time access to data from previous tests and current tests. Then you can figure out what to do to improve performance of the validation process. So you may not need to test every device if you understand exactly what is happening in the manufacturing flow.”
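One possible reading of the adaptive approach Schuldenfrei describes is a sampler that skips a test on most devices once recent results and upstream data both look stable. The class name, the window size, and the sampling rule below are assumptions made for illustration, not a description of any vendor's method.

```python
from collections import deque


class AdaptiveSampler:
    """Decide per device whether an optional test still needs to run."""

    def __init__(self, window=200, sample_every=10):
        self.recent_failures = deque(maxlen=window)  # True means the test failed
        self.sample_every = sample_every
        self.seen = 0

    def should_test(self, upstream_ok):
        """Return True if the current device should be tested.

        `upstream_ok` stands in for data from earlier process or test steps.
        """
        self.seen += 1
        window_full = len(self.recent_failures) == self.recent_failures.maxlen
        stable = window_full and not any(self.recent_failures)
        if not (stable and upstream_ok):
            return True  # not enough evidence yet, or something has changed
        # Stable flow: keep testing every Nth device to notice drift.
        return self.seen % self.sample_every == 0

    def record(self, failed):
        """Store the outcome of a test that was actually run."""
        self.recent_failures.append(failed)
```

Any recorded failure, or any upstream flag, sends the sampler back to testing every device until the window fills with passing results again.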

Evolution in test
All of these changes are a recognition that test needs to keep up with all of the other changes underway in the semiconductor industry.

“Both cell-aware, as well as the device-aware concept that was introduced [at ITC], look at the actual models of the cells and see how they behave,” said Mentor’s Eide. “Cell-aware is the only one that’s really reached a level of maturity, where it’s available in commercial tools and also being adopted by the industry. It was introduced about 10 years ago. It has a good rate of adoption. That being said, concepts like cell-aware are not very well-defined terms. How we use it in context of our tools certainly evolved over the years, so cell-aware today means something slightly different than what it did 10 years ago. The basic idea is that you have to do something to ensure that all of the defects inside of standard cells are detected. That’s what everything cell-aware has in common. It’s getting pretty significant adoption in market segments like automotive and for finFET-type processes.”

He noted that device-aware and variation-aware testing both are positioned as new concepts, sponsored by research institutions, but so far they don’t have widespread adoption.

“These new types of nonvolatile memories, such as RRAM and MRAM, are envisioned as the next replacement for flash memory,” he said of the device-aware work. “This particular test method is not really limited to testing those memories, but it was presented in context of these more advanced memory types. Neither those memory types, nor this particular approach, is being adopted by the industry yet because this is a new approach. The fundamental thinking behind this approach is very similar to that of cell-aware, in that you take more of the physical behavior of the cells into consideration.”

Variation-aware testing, as presented at ITC, was more focused on diagnostics than on test, Eide noted. “Defects that are causing a tiny delay for a signal on the chip is something that is demonstrated to happen much more in finFET processes than planar simply because that is the type of defect that will happen if one fin is broken. It won’t cause the transistor to no longer work. It will just make it work slightly slower. This variation concept is a way to make diagnosis of that kind of defect better. The one kind of test is screening what you can ship and what you can’t. Ramping up process-improving yield, like understanding why devices fail, is also important. We are seeing more of these small-delay types of defects. Having better ways of diagnosing them is going to be important.”
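Why small delay defects are hard to catch can be shown with simple timing arithmetic: a defect is only observed at speed when the extra delay exceeds the slack of the path used to test it. The path names and all delays below are hypothetical, and times are in picoseconds.

```python
CLOCK_PERIOD_PS = 1000  # assumed at-speed test clock period


def slack(path_delay_ps, clock_period_ps=CLOCK_PERIOD_PS):
    """Timing margin left on a fault-free path."""
    return clock_period_ps - path_delay_ps


def detects(path_delay_ps, defect_delay_ps, clock_period_ps=CLOCK_PERIOD_PS):
    """True if the defect pushes the path past the clock period."""
    return path_delay_ps + defect_delay_ps > clock_period_ps


def best_path(paths):
    """Pick the longest path through the fault site, i.e. the smallest slack."""
    return max(paths, key=paths.get)


# Two hypothetical paths through the same defect site.
paths = {"short_path": 400, "long_path": 900}
defect_delay_ps = 150  # e.g. a slower transistor rather than a dead one
```

With these numbers the 150 ps defect escapes on the short path, which has 600 ps of slack, and is caught on the long path, which has only 100 ps. Process variation shifts those path delays from die to die, which is part of what makes diagnosing such defects difficult.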

Testing in the context of cells, devices, various types of variation — and even in the context of systems and systems of systems — is getting a lot more attention as design complexity increases and as devices are used for longer periods of time in mission-critical and safety-critical applications.

“We’re taking layout into consideration, we’re taking power into consideration with each incremental improvement,” said Eide. “What all of these things have in common is that we’re basically saying that these simplifications that we did in the past, they don’t quite work anymore. We’re trying to get closer to reality. The cost associated with that makes it more complicated and more compute-intensive.”

—Ed Sperling and Susan Rambo contributed to this report.
