The Future Of Fault Coverage In Chips

System-level test offers speed and lower cost, but there are limits to what it can do.


Heterogeneous integration and sophisticated packaging are making chips more difficult to test, necessitating more versatile and efficient testing methods to minimize the time and cost it takes for each test insertion.

In the past, test costs typically were limited to about 2% of the total cost of a chip. That cost has been rising in recent years, and with chiplets, advanced packaging, and more domain-specific designs, the percentage has grown even further. The challenge now is to bring it back under control, and one of the key approaches is more system-level test (SLT). Traditionally viewed as a complement to automated test equipment (ATE), SLT has moved from a supporting role to a central strategy in testing high-end, mission-critical devices.

“SLT has been around for a long time,” says Davette Berry, director for customer programs and business development for ATS at Advantest. “But what is challenging is to do SLT in a high-volume production environment on every single chip that’s being manufactured. The chips that are being tested usually are on the cutting edge of technology, so they’re the latest version of every interface, and the lifecycles of these processors and the ramp to market is forcing that SLT to remain as a test insertion in the manufacturing flow.”

While SLT can adapt to complex configurations, it does have limitations. When a test escape is discovered in SLT, it’s very difficult to isolate the exact location on the chip, which is essential to fixing the problem in the next revision. So while SLT excels in operational validation, it cannot achieve the high fault coverage ATE offers.

“It’s very broad coverage,” adds Berry. “The desire is to be able to diagnose how to fix an escape when it shows up in SLT so it doesn’t propagate to the next design. It’s very difficult in SLT to narrow down which transistor needs to be fixed.”

Put simply, no single test approach is sufficient anymore. “SLT is not meant to replace automated test equipment (ATE) but to complement it, particularly for high-end, mission-critical devices,” says Sri Ganta, principal product manager in Synopsys’ EDA Group. “Instead of just the structural test, SLT is more of an overall system test where you have analog components talking to digital components, and they’re controlled by the software, but the actual fault coverage is not high and the defects are not easily diagnosable on SLT. That’s why you still need structural test for manufacturing defects.”

Fig. 1: Testing now requires more test content from wafer probe through SLT. Source: Advantest

For many in the industry, especially those accustomed to the traditional ATE-focused approaches, embracing SLT as a permanent insertion involves a steep learning curve. This is compounded by the intricate nature of leading-edge ICs, where the interplay of various components and functions must be meticulously tested.

“SLT is still relatively new for most customers, and there is a significant learning curve involved,” says Daniel Oglebay, product marketing manager for the Integrated Systems Test Group at Teradyne. “Customers often need guidance on what to test and how to utilize SLT effectively, which can differ significantly from traditional ATE methods.”

The primary distinction between SLT and ATE lies in their testing scope and application. ATE is typically used for pinpoint accuracy in testing specific parameters or functions of a component, focusing heavily on detecting any physical defects or failures. It involves applying electrical signals to the pins of a device to measure its outputs against expected results, which is ideal for identifying manufacturing defects at a component level.

SLT uses a higher level of abstraction. While it doesn’t achieve ATE’s high fault coverage, its effectiveness stems from its ability to emulate the real-world operational environment of the chip. It can analyze the complex interactions and interdependencies between different sub-systems that might not be apparent in component-specific tests. For instance, SLT can reveal how power management algorithms affect the performance of a processor under various load conditions, or how changes in digital signal processing impact the analog output in a mixed-signal system. This integration allows manufacturers to fine-tune the systems to meet their customers’ requirements.

“SLT is used when you want to look at more complex IP block level interactions, like different frequency domains and power domains, and the actual operation of high-speed ICs and how they interact with each other,” says Oglebay. “If you want to run tool software and test how different IP blocks interact under different stress levels as software is running, that’s where SLT is much more effective.”

The transition to more comprehensive, ongoing SLT necessitates a shift not only in technology, but also in mindset. SLT demands a holistic view of the device under test, considering not just the individual components but also how they function collectively within a full system. This includes basic functionality tests as well as system-level stress testing and scenario-based validation, and it requires a deeper understanding of the entire architecture of the system.

“It becomes a cost containment challenge,” says Berry. “If you’re adding a test insertion, and you’re doing it for minutes, or tens of minutes, that is very costly. And once you’re doing SLT, and you have decided to retain that insertion for the entire life of that chip, how do you redistribute your test coverage to ensure you’re making it as cost effective and cost optimal as possible, and not duplicating tests from one of the other insertions? You need a test engineer who understands the coverage of all those different tests.”
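The bookkeeping Berry describes, knowing which insertion covers what so that SLT time is not spent repeating earlier tests, can be sketched as a simple set comparison. This is a minimal illustration only: the insertion names, the coverage items, and both helper functions are hypothetical, and real coverage data would come from fault simulation and test-program documentation rather than a hand-written table.

```python
# Hypothetical coverage map: insertion name -> set of items it exercises.
# All names and entries here are illustrative assumptions, not data from the article.
coverage = {
    "wafer_probe": {"stuck_at", "iddq", "memory_bist"},
    "final_test": {"stuck_at", "transition_delay", "memory_bist", "io_loopback"},
    "slt": {"io_loopback", "power_domain_interaction", "os_boot"},
}


def duplicated_coverage(coverage):
    """Map each item covered by more than one insertion to the sorted list of insertions."""
    owners = {}
    for insertion, items in coverage.items():
        for item in items:
            owners.setdefault(item, []).append(insertion)
    return {item: sorted(names) for item, names in owners.items() if len(names) > 1}


def unique_to(coverage, insertion):
    """Return the items that only the given insertion covers."""
    others = set().union(
        *(items for name, items in coverage.items() if name != insertion)
    )
    return coverage[insertion] - others


if __name__ == "__main__":
    for item, names in sorted(duplicated_coverage(coverage).items()):
        print(f"{item}: covered at {', '.join(names)}")
    print("only in SLT:", sorted(unique_to(coverage, "slt")))
```

Items that show up under more than one insertion are candidates for removal from the costliest one, while items unique to SLT are the reason the insertion exists at all.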

Customizing SLT
The customization of SLT setups is largely driven by the end-use application of the device being tested. For example, a chip intended for use in automotive safety systems, which must operate reliably under a range of environmental conditions, requires different SLT parameters compared to a chip designed for mobile devices focused on power efficiency and high-speed data processing.

“There is a substantial educational component when implementing SLT. Many new customers approach SLT with an ATE mindset, expecting it to be some kind of protocol tester,” says Berry. “However, commercially available SLT equipment typically has to last for many years, so rarely has the latest versions of the cutting edge interface protocols. The customer’s application-specific test board typically has the relevant interface circuitry, and our equipment provides general power, control and communication.”

Developing an SLT strategy involves close collaboration between the test engineers and the device designers to ensure the test environment accurately reflects the conditions the device will encounter in actual use. This might include varying power supply conditions, simulating mobile network changes, or even emulating temperature fluctuations. Each scenario helps to ensure the device meets its specifications on paper and performs reliably in its intended environment.

“The process of setting up system-level tests is driven primarily by the end customer who defines the test process,” says Oglebay. “This often involves adapting our systems to meet specific device interfaces, power supply regimes, and thermal capabilities required by the customer. Each customer has unique requirements, and there is no one-size-fits-all solution.”

For instance, in automotive applications, SLT setups must simulate the wide range of temperatures and vibrations that automotive chips might encounter. This includes tests like thermal shock and stress testing, which mimic the rapid changes in temperatures that occur in automotive environments, from freezing cold to engine heat. Similarly, vibration testing ensures a chip can withstand the rigors of road use without failure.

For devices intended for data centers, SLT focuses on endurance testing and error checking at high data rates. These setups often include prolonged stress testing under high load conditions to ensure stability and reliability over extended operational periods. High-speed I/O interfaces, such as PCIe, are tested extensively to verify their performance across various lane configurations and generations, ensuring that data throughput and integrity meet the stringent standards required for server operations.

Such capabilities allow SLT to deeply integrate and manipulate the internal components, significantly enhancing the scope and depth of testing. This not only accelerates the testing process but also ensures that complex multi-component systems are thoroughly evaluated.

“A PCIe interface is a very large chunk of data, and the only way to really confirm that it’s working is to run it through its full state diagram and make sure that it can do all the sequencing whether in 1 lane, 4 lanes, or 16 lanes,” says Dave Armstrong, principal test strategist at Advantest. “Whatever the sequence is, unless you go through all of that you don’t know that the interface is fully functional. Once confirmed, you can use that for doing higher-speed testing and run scan patterns over the high-speed I/O interface in a much faster clip. Using SLT for that advantage is really reshaping how test is done.”
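As a rough sketch of what "going through the sequencing" at each lane width might look like in a test script, the snippet below builds a matrix of link configurations and checks that a recorded link-training trace reaches the operational state. The transition table is a deliberately simplified subset of the PCIe link-training states, not the full specification, and the function names, the default generation list, and the trace format are assumptions made for illustration.

```python
from itertools import product

# Simplified subset of PCIe link-training transitions (illustrative only;
# the real state machine has more states, substates, and transitions).
ALLOWED = {
    "Detect": {"Polling"},
    "Polling": {"Configuration", "Detect"},
    "Configuration": {"L0", "Detect"},
    "L0": {"Recovery"},
    "Recovery": {"L0", "Configuration", "Detect"},
}


def trace_reaches_l0(trace):
    """True if the trace starts in Detect, uses only allowed transitions, and ends in L0."""
    if not trace or trace[0] != "Detect" or trace[-1] != "L0":
        return False
    return all(nxt in ALLOWED.get(cur, set()) for cur, nxt in zip(trace, trace[1:]))


def link_test_matrix(lane_widths=(1, 4, 16), generations=(3, 4, 5)):
    """Enumerate every (lane width, generation) pair to exercise.

    The lane widths follow the quote above; the generation list is a hypothetical default.
    """
    return list(product(lane_widths, generations))


if __name__ == "__main__":
    # A hypothetical trace, as it might be captured from the device for one configuration.
    observed = ["Detect", "Polling", "Configuration", "L0"]
    for lanes, gen in link_test_matrix():
        status = "pass" if trace_reaches_l0(observed) else "FAIL"
        print(f"x{lanes} Gen{gen}: {status}")
```

In a real flow each configuration would produce its own captured trace. Only after every entry in the matrix passes would the interface be trusted to carry scan patterns at speed.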

The future of SLT
As semiconductor technology advances toward increasingly sophisticated multi-chip systems and chiplets, testing approaches will need to become more adaptable. They will need to deal with more and different kinds of data that are specific to different domains and real workloads, which can vary greatly even within the same product line. And they will need to do so more efficiently and more quickly, which plays to the strengths of SLT.

“SLT excels at taking a test scenario from a customer and setting up a test case to create a quick duplication of that customer’s issue,” says Advantest’s Berry. “And SLT, if designed right, is very inexpensive on a per-site basis. If you look generically at what the capital expenditure is on SLT versus ATE for final test and probe, it’s 10X to 100X cheaper. So even if you have minutes of test time, being on something that’s much cheaper makes the extra time irrelevant.”
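The arithmetic behind that claim can be sketched by treating capital cost per device as the share of a test site's useful life that one device occupies. The dollar figures, test times, and five-year amortization period below are hypothetical placeholders. The model also ignores handlers, floor space, power, and operator costs, so it shows only the shape of the trade-off, not a real cost model.

```python
def capex_per_device(site_cost, test_seconds, lifetime_seconds, utilization=1.0):
    """Capital cost attributed to one device: its share of the site's productive lifetime."""
    return site_cost * test_seconds / (lifetime_seconds * utilization)


def break_even_seconds(ate_test_seconds, capex_ratio):
    """SLT test time at which per-device capital cost equals ATE's.

    capex_ratio is ATE site cost divided by SLT site cost (10 to 100 in the quote above).
    Assumes equal lifetime and utilization for both kinds of site.
    """
    return ate_test_seconds * capex_ratio


if __name__ == "__main__":
    lifetime = 5 * 365 * 24 * 3600  # hypothetical five-year amortization, in seconds

    # Hypothetical figures: a 500,000 ATE site testing for 10 s versus
    # a 5,000 SLT site (100X cheaper) testing for 10 minutes.
    ate = capex_per_device(500_000, 10, lifetime)
    slt = capex_per_device(5_000, 600, lifetime)
    print(f"ATE capex per device: {ate:.4f}")
    print(f"SLT capex per device: {slt:.4f}")

    for ratio in (10, 100):
        print(f"{ratio}X cheaper site: break-even SLT time = {break_even_seconds(10, ratio)} s")
```

Under these assumptions, a 100X cheaper site can run a device for 100 times as long as the ATE insertion before its per-device capital cost catches up, which is the sense in which minutes of SLT time can be affordable.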

Another increasingly critical application of SLT lies in the investigation of power and thermal dynamics of integrated circuits (ICs). Thermal testing is essential for ensuring that devices operate reliably under the diverse temperature ranges they might encounter in real-world applications. Advanced SLT setups can simulate thermal conditions that ICs face, including the heat generated by high-speed operation and environmental temperature fluctuations. This type of testing is particularly important for devices used in automotive and aerospace applications, where temperature extremes are common, and in AI applications where chips are designed to run at maximum speeds to process enormous volumes of data more quickly. And increasingly, SLT itself will require AI to sort through various scenarios and connections that are too complex and numerous for engineers.

“A lot of these high-end and mission-critical devices are now adopting scan tests over functional high-speed I/O interfaces and required IP is being integrated into the device for System-Level Testing (SLT), a trend that’s only growing,” says Synopsys’ Ganta. “Also, to achieve optimized test patterns to address exponentially growing test times, you need AI, which can fine-tune various settings in ways that are either impossible or too time-consuming for humans. It’s about leveraging computational AI and machine learning to handle the combination of complexities of design and the parameters used to generate the test patterns.”

The need for standardization in SLT practices also is becoming apparent as the technology matures. Discussions within the Universal Chiplet Interconnect Express (UCIe) standards group have begun to address the lack of commonality in test designs. The integration of SLT in high-volume manufacturing is still relatively novel, and there is a considerable gap in shared data that could be used to standardize testing protocols. Establishing these standards will be crucial for enhancing the reproducibility and reliability of test results across the industry.

“As industries mature, the role of SLT is evolving to optimize its position within the test and validation chain,” adds Teradyne’s Oglebay. “This involves a strategic shift of test capabilities between ATE and SLT to achieve the most cost-effective outcomes. We also can expect more standardization in test cases as industries progress. That helps everybody by streamlining the testing process and reducing customization costs.”

As complexities in semiconductor technologies expand, SLT is solidifying its role as a crucial complement to traditional ATE-based test. SLT’s comprehensive approach, which simulates real-world conditions, is invaluable for evaluating mission-critical ICs. However, it does not achieve the high fault coverage and granularity that ATE provides, underlining SLT’s role as a supplement rather than a replacement in the testing regime.

Educating customers about SLT’s strengths, capabilities, and limitations is becoming increasingly important. Stakeholders must understand that while SLT offers a broader operational perspective, it may not detect every nanometer-scale manufacturing defect that ATE can identify. Enhancing customer understanding will help in optimally integrating SLT into the testing strategy.

Looking to the future, further standardization and strategic integration of SLT in the testing chain are expected to enhance the semiconductor industry’s capabilities in driving reliability and fostering innovation. As SLT becomes more standardized, and its applications more refined, it will continue to evolve as a pivotal component in the semiconductor testing landscape, contributing significantly to the industry’s efforts to meet the increasing demands for quality and performance in an era of complex device architectures.

— Laura Peters contributed to this article

