Grading Chips For Longer Lifetimes

Different use cases, dependencies and testability are making direct comparisons much more difficult.


Figuring out how to grade chips is becoming much more difficult as they are deployed in applications where they are expected to last for decades rather than just a couple of years.

During manufacturing, semiconductors typically are run through a battery of tests involving performance and power, and then priced accordingly. But that is no longer a straightforward process for several reasons:

  • At advanced nodes, dielectrics are so thin and structures are so fragile that sticking probes into them or boosting the voltage can cause damage.
  • Chips in packages often are not accessible because there are no exposed leads, which makes them harder to test after the packaging is done — despite the fact that chips can be damaged during the packaging process.
  • It’s not always obvious whether minor defects will cause problems over time. Frequently that depends upon the end application, individual use cases and ambient conditions, all of which can stress a device in different ways.

While understanding how to test these devices needs to be thought out at the very front end of the design cycle, the problems begin in earnest during various testing steps throughout the manufacturing process.

“A lot of this has to do with access,” said Doug Elder, vice president and general manager of the semiconductor business unit at OptimalPlus. “Even where you have access to a device, you only can probe them once. And the cost of devices is getting so high that you need to find different ways to test them. For some devices, the packaging and size are limiting access. With wirebond, which is still the way the bulk of chips are packaged today, it’s hard to get good access, so you have to do a visual inspection. And if you thought 7nm is challenging, at 5nm everyone is shaking their heads. System-level and functional test are becoming more and more difficult.”

In some cases, such as AI chips, it’s not even clear how to compare chips. Algorithms change almost weekly, and AI systems will optimize around different use cases. As a result, performance and power in AI systems may be different from one chip to the next, and they may vary greatly by user and location.

“We’ve run for a long time on leading-edge nodes with a consumer-oriented mindset as far as failure rates in the field. That approach is a little scary for applications that require a higher level of reliability,” said Chet Lenox, senior director of industry and customer collaboration at KLA. “To better serve these applications we’re developing techniques that allow us to utilize data from the fab to help predict reliability in the field. Applying machine learning is, of course, a key part of that. In the past, most inspection and metrology data were used for pure process control in the fab. We could identify excursions, figure out defect paretos, drive down those defects, and keep CDs (critical dimensions) in control, in order to achieve acceptable parametrics at the end of line. The final arbitrator was sort/yield and final package test. What we’re now facing is the fact that final electrical test won’t catch latent reliability failures. We need to use all of the data we have available, including fab inspection and metrology results, in order to find potential reliability issues.”
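The idea of mining fab data for latent reliability risk can be illustrated with a minimal sketch. Everything here is invented for illustration — the feature names, weights, and thresholds are not KLA's actual model, just a toy scoring function that flags dies whose inspection and metrology history looks risky even though they would pass final electrical test:

```python
# Hypothetical sketch: scoring die-level reliability risk from fab data.
# Feature names, weights, and thresholds are illustrative only.

def reliability_risk(defect_count, cd_deviation_nm, metrology_flags,
                     w_defect=0.5, w_cd=0.3, w_flags=0.2):
    """Combine inspection and metrology signals into a 0-1 risk score."""
    # Normalize each signal into [0, 1] with simple saturating scales.
    defect_term = min(defect_count / 10.0, 1.0)     # saturate at 10 defects
    cd_term = min(abs(cd_deviation_nm) / 5.0, 1.0)  # saturate at 5 nm CD drift
    flag_term = min(metrology_flags / 3.0, 1.0)     # saturate at 3 flags
    return w_defect * defect_term + w_cd * cd_term + w_flags * flag_term

def flag_latent_risk(dies, threshold=0.6):
    """Return IDs of dies whose fab history suggests latent reliability risk,
    even if they would pass final electrical test."""
    return [die_id for die_id, features in dies.items()
            if reliability_risk(*features) >= threshold]

dies = {
    "d01": (2, 1.0, 0),   # clean die: low combined risk
    "d02": (12, 4.5, 2),  # many defects plus CD drift: high combined risk
}
print(flag_latent_risk(dies))
```

In practice the weights would come from a trained model correlating fab data with field returns, not hand-picked constants, but the flow — fab features in, per-die risk out — is the same.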

That makes it much tougher to grade chips based upon their value in the market, and even to predict in some cases how a chip will function over time. One chip may appear faster or lower power than another chip post-manufacturing, but over time the results could change significantly.

“You must have metrology to find the obvious problems,” said Dennis Ciplickas, vice president of advanced solutions at PDF Solutions. “You also test chip failure modes and you do dynamic test, and in manufacturing you loop those together and connect those two. If there are sensors in a die, you can measure these and how they differ from other die. It is possible to have more efficient overall test in-die. That’s more complex, and there is more electrical measurement. But you can use that to understand a chip’s behavior and how to test it.”

Understanding and doing are two different things, though. The testing process itself is getting more complex. “You’re measuring different manufacturing layers of chips, the shapes of the patterns and the current flow and how that is isolated,” said Ciplickas. “You also have active electrical sensors, with individual device structure behaviors. Those can change under different operating conditions. So if you’re looking at 2.5D, you need to look at temperature one way versus another and how that will change over time. This is where we’re seeing in-die sensors, which is a brand new field.”

Testing and inspection issues
Figuring out what to test and when to test it is becoming much more complex for a variety of reasons.

“We are still binning devices based upon test, but there might be processes where you have to retest something 10 times because you are focused on yield rather than quality,” said Patrick Zwegers, business development manager at yieldHUB. “Now you can analyze if it is a process failure or a product failure, and with today’s technology, it’s more and more possible to bin devices based on their behavior over time. We’ve been working with a customer to program a lot of information into devices. We can program information about what happened in testing in the past, too.”
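One way to act on that retest history is to bin devices by how they passed, not just whether they passed. The sketch below is a hypothetical illustration — the bin names and retest limit are invented — of treating a part that needed many retests as "marginal" rather than a full pass:

```python
# Illustrative sketch: binning devices by their pass/fail history.
# A part that fails nine times and passes on the tenth is flagged as
# marginal rather than shipped as a first-pass device.

def bin_by_history(test_history, max_retests=2):
    """test_history maps device ID -> ordered list of pass/fail results."""
    bins = {}
    for device, results in test_history.items():
        if not results[-1]:                   # never passed
            bins[device] = "fail"
        elif len(results) - 1 <= max_retests:  # passed within the retest budget
            bins[device] = "pass"
        else:                                  # passed only after many retests
            bins[device] = "marginal"
    return bins

history = {
    "u1": [True],                 # first-pass
    "u2": [False, False, True],   # two retests: still within budget
    "u3": [False] * 9 + [True],   # passed on the 10th attempt
    "u4": [False, False],         # hard fail
}
print(bin_by_history(history))
```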

That’s helpful because the tests themselves can cause damage to a chip or a wafer.

“Many manufacturers are keen to do non-contact optical inspection in between processes while the opportunity exists, and correlate that to functional testing, defects and final yields to improve overall yields and productivity,” said Subodh Kulkarni, president and CEO of CyberOptics. “Simulation and destructive testing continue to play their roles in advanced packaging, but with the value of advanced packages going up, destructive testing isn’t desired.”

Heterogeneous integration
Test also becomes much more complicated as more heterogeneous components are crammed onto a chip or into a package.

“Traditionally, wafer test takes place in the middle of the semiconductor process – after the front-end wafer fabrication finishes and before the back-end assembly and packaging,” said Amy Leong, chief marketing officer at FormFactor. “Now, because of the new heterogeneous integration, each wafer is tested more thoroughly. For example, first after wafer fabrication, then after wafer dicing and die stacking, and again on the heterogeneous integrated wafers of various die. To ensure good-enough-die, test content is going up, as well, as evidenced by requirements for at-speed device validation and higher test time.”

The test equipment needed to do these tests has changed, as well. “The probe card, the test interface that connects the ATE (automated test equipment) and the wafers, needs to probe on tiny microbumps of 25 micron diameters and below,” said Leong. “You need high contact precision MEMS probes – many thousands of them placed accurately at 45um distance to each other, with low force not to damage the packaging bumps. These MEMS probes have 1 or 2 grams contact force per probe, compared to the traditional probe force of 10 grams. The biggest challenge there is how to achieve excellent electrical measurements through a stable contact between the probe and various microbump materials such as Copper or SnAg.”

Fig. 1: Microbumps that form TSV interconnects. Source: FormFactor

Another big challenge involves test conditions, whether that’s a single chip or multiple chips in a package. The problem is that a chip needs to be tested in the context of how it is being used. That could be above 150°C for automotive ICs, and it can require mmWave test speeds for 5G communication ICs.

A related challenge involves the analog portion of a design, whether that’s a single chip or multiple chips in a package. As with automotive or 5G chips, anything analog needs to be tested in the context of how it is being used, and it needs to be recalibrated over time because analog circuitry tends to drift.

“There is not a standardized way for achieving test coverage in analog,” said Matthew Knowles, silicon learning product marketing manager at Mentor, a Siemens Business. “With analog, you’re generally looking at historical data and returns. Most times companies are over-testing due to risk aversion, and the digital and analog tools to do that testing are fragmented.”

Partitioning raises another set of challenges, both for the tests being performed and the systems being tested. All of those compute elements need to be synchronized to achieve good results. But it’s getting more difficult to predict imbalances in that partitioning because not all of the data is consistent and structured exactly the same way. That makes it much harder to compare one chip to the next, but it also makes it more difficult to grade them. For example, it’s more important for an automotive camera image sensor to match the other image sensor in a two-camera subsystem than to have one that has slightly faster performance.

The same holds true for advanced packaging solutions, where chips can be matched to each other based upon their individual performance and power rather than just how close they are to a particular specification. Just because one chip falls into an acceptable distribution of behavior doesn’t mean all of the chips in a package collectively will behave within acceptable parameters.
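This “grading by matching” idea can be sketched concretely. The example below is hypothetical — the sensor IDs, metric, and greedy pairing strategy are invented for illustration — but it shows the shift from shipping the fastest parts first to pairing parts so that the two devices in a subsystem perform alike:

```python
# Illustrative sketch of grading by matching: pair image sensors so the
# two cameras in a subsystem behave alike, rather than ranking parts on
# raw performance alone.

def pair_by_match(sensors):
    """sensors maps sensor ID -> a measured performance metric.
    Greedy approach: sort by the metric, then pair adjacent sensors,
    which keeps the mismatch within each pair small."""
    ordered = sorted(sensors.items(), key=lambda kv: kv[1])
    pairs = []
    for i in range(0, len(ordered) - 1, 2):
        (id_a, perf_a), (id_b, perf_b) = ordered[i], ordered[i + 1]
        pairs.append((id_a, id_b, abs(perf_b - perf_a)))  # per-pair mismatch
    return pairs

sensors = {"s1": 60.2, "s2": 59.8, "s3": 62.1, "s4": 61.9}
for a, b, delta in pair_by_match(sensors):
    print(a, b, round(delta, 2))
```

Note that under this scheme the “best” individual sensor is worthless without a close partner, which is exactly why per-chip grading breaks down.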

“There are a lot of different modules that go into automobiles,” said Rich Rice, senior vice president of business development at ASE. “If you were to take one of those modules and create an SiP that is completely overmolded, and the interconnect uses semiconductor interconnect inside of it, you’re going to get a little bit higher reliability. That’s inherently lower liability, providing the semiconductor interconnect is done properly. For automotive applications, SiP will offer a more reliable solution going forward.”

Solving a different problem
In conjunction with advanced packaging, more in-circuit and in-field sensors are being used to collect data and help chipmakers and systems vendors understand how devices age in the field.

The challenge is understanding those devices well enough to make those predictions. “You start with a system that is okay, where you have no failures that you know of,” said Olaf Enge-Rosenblatt, group manager for computational analytics at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “So you make initial measurements and over time you learn how it will behave and you detect trends and changes in patterns of signals. From there, you use trend analysis to identify anomalies. But the first step is to describe a good system, and that means you not only look at classical measurements like vibration, but you include all conditions and parameters in a particular operation. If a shaft is rotating at a certain speed, if that changes, you see that or interpret what is going on with additional vibration. You can’t always see that with a single vibration source, though. You need to look at a complete system and you look for trends.”
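The baseline-then-trend approach described above can be sketched in a few lines. This is a minimal illustration, not Fraunhofer’s method — the readings, the 3-sigma threshold, and the single vibration channel are all invented for the example:

```python
# Illustrative sketch: characterize a known-good system first, then flag
# live readings that drift from that baseline.

from statistics import mean, stdev

def learn_baseline(good_readings):
    """Describe the healthy system from an initial measurement window."""
    return mean(good_readings), stdev(good_readings)

def detect_anomalies(readings, baseline, n_sigma=3.0):
    """Return indices of readings more than n_sigma from the baseline."""
    mu, sigma = baseline
    return [i for i, x in enumerate(readings) if abs(x - mu) > n_sigma * sigma]

# Vibration amplitude of a known-good shaft, then live data with excursions.
good = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
live = [1.0, 1.02, 1.6, 0.98, 2.3]

baseline = learn_baseline(good)
print(detect_anomalies(live, baseline))
```

A real deployment would track many channels at once — speed, temperature, and operating mode alongside vibration, as the quote stresses — but the pattern of “describe good, then watch for drift” is the same.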

Understanding how all of the pieces go together isn’t always obvious. A possible defect in one chip may cause a total system failure under certain conditions, but go unnoticed throughout its lifetime in a different location or under different use scenarios. It can even vary by testing equipment, where a test that has been performed hundreds of thousands of times may cause problems as that equipment begins aging or if it is run continually in times of tight capacity.

“A lot of our yield wisdom is not applicable to heterogeneous integration,” said Leong. “The traditional defect pattern is from the center of a wafer to the edge, but with a heterogeneous device, this is not necessarily valid as you may have die from different parts of a wafer reassembled into a new wafer. You also may have unconventional wafer topography and thermal environment after four or eight die are stacked. All of these new manufacturing challenges require innovative test and measurement solutions. For example, we recently announced the Altius product that not only tests an IC, but also the silicon interposers that bridge dies in the advanced packages.”

How to determine which one is better than another, or more valuable than another, is far from clear. Complexity is spreading throughout the entire supply chain, and what used to be a rather straightforward comparison is far from straightforward today.
