Reliability requires different parts to work in sync, and much more time-consuming testing and simulation.
Demands by automakers for zero defects over 18 years are colliding with real-world limitations of testing complex circuitry and interactions, and they are exposing a fundamental disconnect between mechanical and electronic expectations that could be very expensive to fix.
This is especially apparent at leading-edge nodes, where much of the logic is being developed for AI systems and image sensing. While existing equipment for wafer, die and package inspection works well enough for most applications all the way down to 7nm, proving that chips will remain functional for 18 years under harsh road conditions is a time-consuming process. So while 99% sampling may be good enough for a smartphone, it is not good enough for safety-critical functions.
To make matters worse, automotive testing often requires synchronization between different components, both within and outside of a vehicle, and much more insight into where potential problems can arise. This is no longer just about using an automated test equipment (ATE) machine in a flow to sample a certain percentage of dies and wafers.
“If you want to get to 18 years defect-free, you have to do extensive testing at the packaging and the wafer level, but you have to develop everything at the system level,” said Keith Schaub, vice president of U.S. applied research and technology business development at Advantest. “Now we’re talking about system-level testing and end-level testing. The way companies do that is rudimentary now. It’s a system-level test, but it’s a system by itself. So let’s say it’s an application processor. The whole thing boots up, and there is a series of tests defined by the manufacturer, and it runs those tests. There is a bunch of stuff out on the board. So it’s talking to the LPDDR5 and maybe another processor and some other memories, and it’s using all of these interfaces, like HDMI.”
What’s needed to make sure everything is working and fully compatible is testing in relation to other components within a vehicle, and even other vehicles from different manufacturers.
“Now you have to have those systems communicate among themselves and make some pass-fail decisions based upon whatever test conditions get set up,” Schaub said. “And that would all have to be controlled and organized by some master planner, and there would have to be some master timing that would be critically important as part of the backbone infrastructure of the system. No system like that exists today.”
Use models are changing, as well. Car ownership is expected to be less important, particularly in urban areas, than the ability to call up a car anywhere and anytime. That means that rather than simulating and testing vehicles that are used less than 5% of the time, those simulations and tests need to involve multiple use models, including robo-taxis.
“We never had vehicles that are intended to drive 200,000 miles in a short period of time,” said Jamie Smith, director of global automotive strategy at National Instruments. “So while companies today are testing autonomous vehicles, ideally, what they’ll be doing is building out their neural network and training it with a wide range of annotated data, and then they will deploy that AI system to test the perceptions. And that will be done in a purely simulated space for a while to run it through a wide range of scenarios. Oftentimes, people talk about miles driven, but it’s really scenarios that we have to worry about. Once they’re deployed in these simulated environments, then there needs to be a transition to a laboratory environment, and then finally to the test track or controlled road experiments.”
This is the source of another disconnect. “Today, a lot of companies are trying to be first to market so they go right from simulation to the road and cut out the laboratory,” Smith said. “We’re testing the AI perception engine—the algorithms and the whole neural network—in the simulated space and the laboratory and on the road to make sure that element is working correctly. But what happens when there’s another free-thinking ego car out there, outside of the domain for a controlled scenario? Many of the traffic simulators are trying to show what happens when cars aren’t in the center of the lane or when they are driving erratically. That helps to increase the test coverage.”
Random failures, latent defects
Random failures and latent defects add their own challenges, and the problem is worse in automotive due to both the expected lifespan of systems and the harsh environmental conditions.
“You can’t test for the random failures,” said Gert Jørgensen, vice president of marketing at Delta Microelectronics. “You go through this screening test of devices and expose them, and if they pass the lot acceptance tests, which take 168 hours—that’s a week—you judge them as good devices. And if a failure happens out there in the field, of course, we do a failure analysis. I know the car manufacturers are registering all failures to see if each one is a periodic failure or a random failure. They have fast reporting systems, so that when we have found the failure they will detect if it has influence on the rest of the population. If they say okay, this is a random failure, we will store it and see if there’s more coming. If it’s a failure that can be cured, they usually do something about it.”
True random failures are rare. A stray alpha particle hitting a circuit and causing damage is known to happen, and the chances of that occurring increase with denser circuits and thinner insulation. So a single-event upset affecting a 7nm device with finFETs packed tightly together is more likely than at 28nm. The same is true for random contaminants, which may affect one part differently than another. But delineating which failures are truly random from those that are not is time-consuming, and that adds to the cost and slows down time to market.
“There are quality measurements on how to deal with random failures and procedures and which data should be stored during car manufacturing,” Jørgensen said. “Carmakers know exactly when a device fails out there, when it was produced, how it was produced, which person was involved, et cetera. So everything is logged and registered, even to the level of an airplane. They are logging each screw when they manufacture the car. It’s stored in a database of how the car was manufactured.”
But what’s done with that data can vary, depending upon the manufacturer, the criticality of the part, and the overall cost of either preventing potential problems or ignoring them.
“If you think about the bathtub curve, there are the latent defects, wear-out failures and random failures,” said Jay Rathert, senior director of strategic collaborations at KLA. “When you merge them together you get the bathtub curve. When we started in this, all of the focus was on the latent defects. Increasingly, people are focusing on the wear-out failures. But there also is a continuum of culture behind this. One German carmaker said that when there’s a failure, the driver sees their ‘badge’ on the hood of the car, not the name of the Tier 1 supplier. That makes them very hands on. They’ve hired semiconductor engineers with automotive background, and they do hands-on audits through the whole supply chain. Increasingly we’re seeing more onerous audits, growing supplier quality teams and increasing levels of expertise, because they recognize they have to lift everybody’s quality level. Just writing it into a requirement may not be enough.”
Latent defects remain a major problem because they’re not easy to diagnose.
“There is no magic to latent defects,” Rathert said. “These are the same kinds of failures that kill yield. They just happen to be the right size and the right location. If it’s too big, you’ll find that at test or probe, and that die will never make it into the supply chain. If it’s too small, it becomes a non-killer defect—it doesn’t matter now and it will never matter. But if you get one that’s half the design rule and breaks the line partially, over time it gets activated in the field by current surges, on-off cycles, electromigration, temperature, humidity. That will eventually create a full open.”
Fig. 1: Latent defects. Source: KLA
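Rathert’s sizing logic amounts to a simple triage. The sketch below illustrates it with hypothetical thresholds; the real cutoffs depend on the process, the design rules, and where the defect lands.

```python
# Hypothetical sketch of the defect triage described above: defects at or
# above the design rule break the line fully and are caught at test, defects
# well below it never matter, and those around half the design rule become
# latent field risks. The 0.25x "benign" cutoff is an assumption.
def classify_defect(size_nm, design_rule_nm):
    if size_nm >= design_rule_nm:
        return "killer"   # full break; found at test or probe
    if size_nm < 0.25 * design_rule_nm:
        return "benign"   # too small to ever matter
    return "latent"       # partial break; may open up in the field over time

print(classify_defect(20, 40))  # half the design rule -> "latent"
```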
Synchronization issues
Perhaps even more difficult to detect are problems that involve multiple systems, or multiple units within a package. This is a relatively isolated issue in most electronics. But with increasing autonomy in mobile devices, particularly where safety is an issue, interactions are critical. And in assisted and autonomous cars, all of this needs to be done at high speed in unknown conditions.
This is one of the reasons why there is redundancy in some of the critical technology, such as image sensing cameras, radar and LiDAR.
“There is huge liability behind a product,” said Uzi Baruch, vice president and general manager of the automotive business unit at Optimal Plus. “But if you look at the step before that, these guys are losing a lot of money building a product. They’re trying to shorten the new product introduction process. They’re getting bids from OEMs with different timelines, and those timelines are sometimes three or four times longer than the original duration. Six months out, you’re ending up with 20 slots. And they lose a lot of money in between. It’s also about the materials. They get the sensor from one vendor, the processor from someone else, the lenses from a third vendor, and then they become an integration point. Just trying to assemble the camera itself is becoming a huge cost issue for them.”
It gets much more complicated when those cameras are paired with other cameras in two- and three-camera modules.
“We’re assessing data coming from the lenses and how that impacts post glue-test,” said Baruch. “Once they glue the final assembly on the module, it’s done. But what are the chances of something going wrong in the process? They are shipping those systems today, but the price to pay to get those out the door is high. Imagine 40% scrap. So 60% still goes out, but at what cost and how fast? Sometimes there are dual cameras. You need to match two different camera modules that behave the same in terms of their alignment and their focus. We apply a scoring mechanism based on the actual measurements. In cameras you measure the spread of light over a CMOS sensor. If you know how the module behaves exactly, then you can start matching modules. If you take these two lenses, they might get 70% from a scoring standpoint. But you want the two that are the same, not a 90% and 50%.”
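The matching idea can be sketched in a few lines. The module IDs, scores, and greedy strategy here are illustrative assumptions, not Optimal Plus’s actual algorithm; sorting by score and pairing neighbors minimizes the total within-pair score gap, which captures the point that two well-matched modules beat a 90% paired with a 50%.

```python
# Illustrative sketch (hypothetical IDs and scores) of matching camera
# modules by a measured quality score, so that paired modules behave alike.
def pair_modules(scores):
    """Pair modules so each pair has similar scores.

    scores: dict of module_id -> score (0-100), e.g. derived from the
    spread of light measured over the CMOS sensor. Sorting by score and
    pairing adjacent modules minimizes the total within-pair gap.
    """
    ordered = sorted(scores, key=scores.get)  # ascending by score
    return [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]

modules = {"A": 90, "B": 50, "C": 70, "D": 71}
print(pair_modules(modules))  # avoids pairing the 90 with the 50
```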
That adds multiple steps in the testing and analysis, and it’s not something that automakers or chipmakers typically have dealt with in the past, or which they were willing to pay for. And it’s not just for image sensors. Power modules need to be synchronized as well, and in the future all of these systems may need to be compared to other vehicles on the road to ensure that systems developed by different manufacturers can work in sync.
“If you stack those lenses up and group their spread of light on the CMOS sensor, suddenly you can see all sorts of issues,” said Baruch. “You see things on the edges you may not have noticed, and then you go back and apply it to design, because the volume of the analysis gives you something you didn’t notice when you look at them one by one. What you can’t see without data analysis is what happens after the lenses are glued, or how they compare to the performance of other lenses. The manufacturers have one model that they align, so they cannot see the big picture.”
Similar types of issues are showing up in other parts of the supply chain. “There are two levels of problems,” said Ron Press, technology enablement director at Mentor, a Siemens Business. “One is that everything is getting more complex, so everything needs to be more precise at the lowest level to catch defects. The second involves in-test operation. All automotive companies have in-system test, which is their mission mode. That provides precise defect detection. So to ensure the reliability of a product, you have test tools that work with reliability monitors, and you have a continuous check of the product. One of the big challenges, though, is the number of duplicate blocks.”
This is particularly true for AI systems within a vehicle. “When you’re looking at automotive grade, there are subtle defects,” said Press. “At 7nm, you’ve got one type of defect test, but it’s difficult to see those defects when the cells are packed in next to each other. So you need to use simulation with that. And you also need to recognize that over a long lifetime, something will break down and need to be rerouted.”
Better data, bloated software, and the future
All of these issues are just scratching the surface of test issues that are starting to come to light in the automotive space. What’s clear is that data will play a much bigger role in automotive reliability, and that data will come from a variety of systems within the car as well as from the fab.
“Historically, the automotive industry would use older nodes for higher reliability,” said Tomasz Brozek, technical fellow at PDF Solutions. “All of this has changed. New nodes are now required, and the driver for that is low, low power for compute. You cannot operate at high voltage because you cannot deal with the heat. So now we need to use different approaches.”
Those approaches involve both in-circuit sensing and AI for a variety of analytics. “As an industry we already are using data for things like binning and risk assessment and test,” said Brozek. “But if you can put structures on the silicon you can monitor degradation rates and see whether that matches your degradation models.”
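As a sketch of that monitoring idea, on-die sensor readings can be checked against a degradation model to flag parts aging faster than expected. The log-time model, rate constant, and tolerance below are assumptions for illustration (a rough first-order aging approximation), not PDF Solutions’ actual models.

```python
# Hypothetical sketch: compare on-die monitor readings against a simple
# aging model, flagging parts whose degradation outruns the prediction.
import math

def expected_drift(hours, k=0.004):
    """Fractional parameter drift predicted by an assumed log-time aging model."""
    return k * math.log1p(hours)

def flag_outliers(readings, tolerance=0.5):
    """readings: list of (hours_in_field, measured_fractional_drift).

    Flags any sample whose measured drift exceeds the model prediction
    by more than `tolerance` (here 50%), i.e. a part degrading faster
    than the model says it should.
    """
    return [(h, d) for h, d in readings
            if d > expected_drift(h) * (1 + tolerance)]

samples = [(1000, 0.029), (5000, 0.035), (5000, 0.080)]
print(flag_outliers(samples))  # only the fast-aging 0.080 sample is flagged
```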
That’s an important piece, but the software itself is becoming more complicated as well, and with continual updates and patches it can bog down just like any computer.
“With any software system, the software tends to clutter,” said NI’s Smith. “This is one of the unintended consequences, as companies decide they want to do more multimedia and more direct marketing to the people in the cars. These systems today start with the sensor and the sensor’s ability to bring information from the physical domain to the cyberphysical interface. Our biggest challenge today is testing sensors. When you test a radar or any other sensor, not only are you testing the electrical characteristics. You’re also testing the software that’s running on that sensor, and all of that may vary from SKU (stock-keeping unit) to SKU, even if it’s operating within acceptable parameters. So this is going to vary from one vehicle to the next even when the car first rolls off the assembly line. These things need to be validated with a model of that sensor in that simulated environment. But you also need to do over-the-air testing of the cameras, the radar, the ultrasound and the LiDAR in the lab, and how the input from those sensors impacts the entire autonomous system.”
This is a far different testing world than automakers or chipmakers have experienced in the past, and they both are approaching it from different directions. As a result, this may take some time to sort out, and even longer to establish real-world proof points that what is being developed today will truly be working as expected 18 years from now.
Related Stories
Automotive, AI Drive Big Changes In Test
DFT strategies are becoming intertwined with design strategies at the beginning of the design process.
Chasing Reliability In Automotive Electronics
Supply chain changes, resistance to sharing data and technology unknowns add up to continued uncertainty.
Auto Chip Test Getting Harder
Each new level of assistance and autonomy adds new requirements and problems, some of which don’t have viable solutions today.
Reliability Becomes The Top Concern In Automotive
Extended lifetimes and advanced-node designs are driving new approaches, but not everything is going smoothly.
Auto Chip Design, Test Changes Ahead
Which tools and methodologies will work best to ensure electronics operate for extended periods of time under harsh conditions?
Shedding Pounds In Automotive Electronics
Weight is suddenly a major concern for carmakers, but slimming down has repercussions.