Time For Massively Parallel Testing

Increasing demand for system-level testing brings changes.


Time is money in electronics, as in other industries, and the more time invested in testing chips, the more cost is added to the product in question. To speed up testing of memory devices and other semiconductors, test equipment vendors have turned to parallel testing technology, which tests multiple chips simultaneously.

The industry also is turning to system-level test, which is viewed as crucial for a new generation of electronic gadgets such as mobile devices and wearable electronics. If a chip passes a regular in-socket test, and then fails when it is in the end-use product, that’s more time that has to be spent on finding out why the failure occurred. And that means more money.

The cost of test has always been a concern in the chip business. The advent of automated test equipment addressed the issue to some extent. As chips grew more complicated, and different types of devices emerged, ATE had to adapt to the changes, and that led to more work on test handlers, which automatically feed chips to the tester.

Memory chips tend to be the simplest chips to test, and the use of parallel testing sites first went to the memories. The number of sites grew from 8 to 16 to 32 to 64, and now they’re extending to 256 and 512 sites at a time.

ATE and design-for-test techniques are estimated to address 99.5% of test coverage for devices under test. That may sound like a lot, and it represents a good step toward the goal of semiconductor quality control. Still, that 0.5% portion represents a lot of failures among millions of chips produced.
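To put that gap in numbers, here is a minimal back-of-the-envelope sketch in Python. It takes the framing above at face value and assumes the 0.5% gap applies uniformly to every unit produced, which likely overstates real escapes because most chips carry no defect in the uncovered area. The function name and the production volumes are illustrative assumptions, not figures from any vendor.

```python
def units_in_coverage_gap(units_produced: int, coverage: float) -> float:
    """Estimate how many units fall into the test-coverage gap.

    Simplifying assumption (not from the article): the uncovered fraction
    applies uniformly per unit, so the result is a rough upper bound.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return units_produced * (1.0 - coverage)


# Hypothetical production volumes, chosen only for illustration.
for volume in (1_000_000, 10_000_000, 100_000_000):
    at_risk = units_in_coverage_gap(volume, 0.995)
    print(f"{volume:>11,} units at 99.5% coverage: ~{at_risk:,.0f} in the gap")
```

Even under this crude model, the gap scales linearly with volume, which is why a fraction that sounds small matters at production scale.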

Fig. 1: Testing gap widens at each new node. Source: Astronics

“In my mind, there are two different tracks going forward in the industry,” says Anil Bhalla, senior manager at Astronics Test Systems. “In ATE, parallel testing went from 2 to 4 to 16, and in time you can start doing 32 and maybe 64 [sites]. A lot is depending on your instrumentation. You have a fixed test-head size, and ATE vendors continue to add more digital pins and sites. It’s evolved in that fashion. On the system-level test area, the industry has done the same thing, structuring to 16 or 32 sites in parallel.”

Astronics uses a different architecture, one that is essentially slot-based and can go up to 396 sites in parallel.

“It’s a dramatically different type of system that enables people who want to do system-level test in high volume to get some pretty exciting cost-of-test reductions,” Bhalla says. “Some of the other approaches up until now have been limited in an ATE system by the number of slots you have and how big the test head is. They are doing different kinds of instantiation, so if you get into RF, then RF gets expensive. It’s just hard to get essentially parallel. What we’re doing is a very different kind of approach where we have slots and chambers, and it’s fairly easy to get to a much higher level of parallelism.”

The real cost of test
Calculating the true cost of test involves both capital expenditures and operational spending. A massively parallel tester potentially can replace dozens of other system-on-a-chip test systems, which means less test-floor space and fewer operators. But there is still a perception that system-level test is too expensive, even if there is a growing recognition that it is becoming increasingly necessary.
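One way to combine the two kinds of spending is to spread the capital cost over the equipment's working life, add the hourly operating cost, and divide by throughput. The Python sketch below does that under stated assumptions: the `CellCost` class, the straight-line depreciation, the 8,000 operating hours per year, and every dollar figure are hypothetical and are not drawn from any vendor quoted here.

```python
from dataclasses import dataclass


@dataclass
class CellCost:
    """Hypothetical cost model for one test cell (all values are assumptions)."""
    capital_cost: float             # purchase price in dollars
    depreciation_years: float       # straight-line depreciation period
    operating_cost_per_hour: float  # floor space, operators, power
    units_per_hour: float           # tested units per hour

    def cost_per_unit(self, hours_per_year: float = 8000.0) -> float:
        # Capital expenditure expressed as an hourly charge.
        capex_per_hour = self.capital_cost / (
            self.depreciation_years * hours_per_year
        )
        return (capex_per_hour + self.operating_cost_per_hour) / self.units_per_hour


# Illustrative comparison: a conventional cell vs. a larger parallel cell.
conventional = CellCost(1_000_000, 5, 20, 400)
massively_parallel = CellCost(3_000_000, 5, 30, 4000)

print(f"conventional:       ${conventional.cost_per_unit():.4f} per unit")
print(f"massively parallel: ${massively_parallel.cost_per_unit():.4f} per unit")
```

With these made-up inputs, the more expensive system still comes out cheaper per unit because its throughput grows faster than its capital and operating costs. Different inputs can reverse that result.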

“The devices that people are trying to test are becoming much more complicated,” Bhalla says. “If you take a look at a processor for a phone, for example, the number of functions that that device has to do and simulate, from taking a phone call to sending text messages to doing stuff on your phone on the Web—the number of complex occasions that can happen on a single device at any one time is becoming very complex. It’s becoming very difficult to test on an ATE pattern. What we’re seeing in system-level test is being able to test the device in its native environment. You can test the high-speed channels. You can usually write patterns that can test multiple functions at the same time. And you start finding that the value of good testing earlier is going to have a large impact, in a positive sense, on your bottom line.”

Quality control is especially important for chips used for safety-critical applications, such as in the automotive and medical markets. System-level testing allows for thermal testing of auto ICs, for example, to see how they perform in hot and cold environments.

Parallel testing is well understood in the analog market, according to Joey Tun, principal market development manager for Automated Test at National Instruments. NI is focused on analog and mixed-signal chip testing; RF ICs and MEMS sensors are the company’s specialty.

“In RF, what we do is not necessarily parallel,” says Tun. “It’s difficult to do parallel test with RF.” Still, up to eight parallel test sites are typically accommodated with NI’s analog and mixed-signal test system.

Fig. 2: Testing RF ICs. Source: NI

“At the opposite end is MEMS,” Tun continues, with strip handling involved for parallel test. MEMS devices can be tested in parallel in greater numbers than RF ICs, from a few dozen to a few hundred in years past, to about the mid-200s today.

Parallel testing isn’t just a numbers game for NI, though. The company is more concerned with parallel test efficiency—how chip handlers and testers are combined to provide the greatest efficiency. In fact, 9 out of 10 NI customers interested in chip testing will choose NI’s test system, while others build their own using NI’s software and instrumentation.

“Software remains a major challenge for parallel testing,” Tun says. “It can be daunting for parallel testing.”

Scott West, Advantest’s product marketing manager for system-level test, talks about his company’s capabilities in testing solid-state drives, the data-storage devices based upon NAND flash memory chips, as an example of system-level test. After years of slow adoption by consumers and enterprises, the demand for SSDs has skyrocketed, causing supply shortages of NAND flash memories and SSDs for enterprise data centers over the past few months.

“In terms of SSD, there’s some interesting things going on there now,” West says. “SSDs have been around for a while, but in very low volume. It’s been kind of a little bit of a Wild West. If you look at it 10 years ago or so, everyone was trying to figure out what SSDs are going to be. And the test equipment has come like it always does at first. It’s sort of bench equipment that gets adapted, and then for system-level test it tends to take the end use for the device and use that to test it, so motherboard-based testing. You set up a motherboard, run special homegrown software to just make sure the device works when you plug it into a computer. That gets trimmed down to being the same structure aimed at test, but now instead of being a full PC, it becomes just a motherboard and special software. But that doesn’t scale well as the market really grows.”

The SSD market is expanding rapidly. “Bauds are increasing dramatically, so the practicality of having a motherboard testing each device, or even several devices connected to a single motherboard, becomes a real leap,” he says. “The software behind it doesn’t work. It doesn’t scale well. That’s where the automated test equipment industry comes in and plays a role. We’re able to make more customized software. It’s no longer the motherboard PC. It’s just really aimed at test, providing all the resources, the capability to test, that’s needed, in a compact form. It’s just what’s needed for test.”

West echoes the sentiment about reducing the size of the test floor. “Your floor space, of course, has a cost to it.” There is also the “tester per DUT” cost to calculate, with the aim of implementing an architecture that can even out the costs of high-volume testing with a mixture of chips, he adds.

Design-for-test technology in SSDs is “not quite there yet,” the Advantest executive observes. “So what does this device need? It’s not that there is an overall ecosystem. I do expect that over the next decade or so for the system-level test. There will be more integration. It will be the equivalent of design for test, working with the system-level test as opposed to the chip level, which is already in there. At the chip level, nobody would propose a test that the ATE doesn’t already do, because those conversations are already happening all the way through the device design. And that’s not really there yet at the SSD level.”

While there is a level of technical cooperation between ATE vendors and their customers, there are limits to how far this relationship goes. “The barrier right now is everybody has their own secret sauce,” West adds. “That’s the thing that will have to come down. Something that’s unique to SSDs versus chip test again. Data security is a big part of this. You can’t risk your IP leaking out there.”

Advantest has a stable base of customers outside of system-level test. The company has been talking to the same base for about two decades. SSD is a whole different story, though. “Our oldest SSD relationships are less than five years,” West says, adding that the SSD manufacturing divisions may be within Intel or Samsung and are often composed of different people than the chip-testing staff.

Michael Frazier, Xcerra’s senior director for global business development, once worked as a test engineer at Advanced Micro Devices. Single-site testing was the order of the day back when AMD was making and testing its own chips.

Multi-site testing has since overtaken single site, of course. While multi-site parallel testing can turn out more units per hour, such test cells are more expensive to equip and operate, with the addition of contactors and load boards, according to Frazier.

(Xcerra this week agreed to be acquired for about $580 million in cash by Unic Capital Management, an affiliate of Sino IC Capital. The transaction is expected to close by the end of the year and will be subject to a review by the Committee on Foreign Investment in the United States due to Sino IC Capital’s ties to banks and the government in the People’s Republic of China.)

Counting up the minutes
Test time is another significant factor. “Indexing 32 sites can take a long time,” Frazier says. And then turret handlers are necessary for testing very small devices, such as diodes and buffer chips, he adds.

Massively parallel architectures are called for in system-level testing. That’s another cost issue.

“System-level test, in general, is viewed in the industry as very, very expensive,” he notes. “Here are the issues with system-level test. You’re typically dealing with a more rack-and-stack, non-automated solution, so the throughput in the system-level test tends to be slower. Sometimes you’re dealing with modules or larger assemblies, so the automated handling of that becomes more challenging. People are looking to change their strategy on system-level test and move it more to an automated solution, closer to the front end of the process. The obvious issue with system-level test is finding rejects there. The later in the process you detect a reject, the more money you’ve spent.”

With consumer products, such as smartphones, “that’s the last place you want to detect a reject,” he says. Manufacturers really do not like system-level test, because it is slow, expensive, and not very automated.

Their answer is to find chip defects earlier in the production process.

“The handler industry was always a bit ahead of the parallel test capabilities,” says Andreas Nagy, Xcerra’s senior director of marketing, HG and TCI Operations. “For regular ICs, we were on the handler side, always increasing parallelism from single-site to dual, quad, multiple sites, 16 to 32. We recognize that the market is still behind from some less physical applications. The majority was still way below parallel.”

ATE is finally catching up with test handling technology, Nagy says. “Some customers want to use equipment for system-level test. But there is no firm standard definition for system-level test. If you talk to 10 customers, you’ll get 10 different answers.”

Gabriela Born, director of InMEMS and IoT Products at Xcerra, says cost of test, throughput and other factors are elements of parallel test. The tester company looks to overall equipment efficiency, including downtime for maintenance or repair, as the key metric for parallel testing.

Since Xcerra acquired Multitest in 2013, the company has emphasized “Test Cell Innovation” to its customers, Nagy says. The goals include reducing the number of chips that must be scrapped owing to defects. “Testing higher parallel is, in many cases, improving your cost of test. If your total test time is also not increasing in parallel, at a similar level than what you were before you increased, you get a good benefit out of it. But parallelism by itself is not the answer. The other factors need to increase, as well.”
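Nagy's point about test time can be illustrated with a short Python sketch. It assumes a simple throughput model (sites tested per cycle, where a cycle is test time plus handler index time) and a multi-site efficiency definition often used in test engineering, which compares the extra test time per added site with the single-site test time. The function names and all timing values are hypothetical and are not Xcerra data.

```python
def units_per_hour(sites: int, test_time_s: float, index_time_s: float) -> float:
    """Throughput when `sites` devices are tested together in each cycle."""
    return sites * 3600.0 / (test_time_s + index_time_s)


def multisite_efficiency(sites: int, t_single_s: float, t_multi_s: float) -> float:
    """Multi-site efficiency: 1.0 means adding sites adds no test time.

    Assumed definition: 1 - (T_N - T_1) / ((N - 1) * T_1).
    """
    if sites < 2:
        raise ValueError("efficiency is only defined for two or more sites")
    return 1.0 - (t_multi_s - t_single_s) / ((sites - 1) * t_single_s)


# Hypothetical timings: 10 s single-site test, 2 s single-site index time.
single = units_per_hour(1, 10.0, 2.0)

# Case A: 16 sites, test time grows to 25 s and indexing to 5 s.
good = units_per_hour(16, 25.0, 5.0)

# Case B: 16 sites, but test time balloons to 150 s with the same indexing.
poor = units_per_hour(16, 150.0, 5.0)

print(f"single site: {single:.0f} units/hour")
print(f"16 sites, 25 s test:  {good:.0f} units/hour, "
      f"efficiency {multisite_efficiency(16, 10.0, 25.0):.2f}")
print(f"16 sites, 150 s test: {poor:.0f} units/hour, "
      f"efficiency {multisite_efficiency(16, 10.0, 150.0):.2f}")
```

In this toy comparison, 16 sites deliver a large gain only when test time grows slowly with site count. When test time grows almost in step with the number of sites, most of the benefit disappears, which matches the caution that parallelism by itself is not the answer.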
