First of two parts: How the car industry can improve reliability.
As the amount of electronic content in a car increases, so do the questions about how to improve the reliability of those systems.
Unlike an IoT device, which is expected to last a couple of years, automotive electronics fall into the class of safety-critical devices. There are standards for verifying these devices and new test methodologies, and there is far more scrutiny over how all of this happens.
“We are moving from ADAS to autopilot, from autopilot to autonomous driving,” said Anush Mohandass, vice president of marketing and business development at NetSpeed Systems. “A device in the car has to be intelligent because the time it has to make these intelligent decisions is milliseconds to microseconds. It doesn’t have the luxury to communicate back to a data center, process it, and give it back. It probably does that at the end of the day to train its algorithms and to make better decisions, but when it’s interacting with the consumer it has to do it instantaneously.”
The key term here is fail-safe verification. “Even if it fails, things still have to function,” Mohandass said. “Verification has been about whether you have redundancies, and whether you understand the faults in your system, and how to analyze that. Verification must become an exercise in whether you understand all the different mechanisms in which faults occur, and how the system reacts to these faults. We aren’t there yet. The General Motors and Volkswagens of the world have understood this for years and they have systems that look at this. In the semiconductor space, we used to think a chip would be operational for three or five years — maximum seven years. If it kind of works for the first three years, great. And if it works in a limited way for five to seven years, we were okay with that, too. But even if a single part works that way, the entire system cannot work that way. That mentality is what is missing in verification in the automotive space.”
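Mohandass’ point about redundancy and fault reaction can be made concrete with a small thought experiment. The Python sketch below is purely illustrative — the function names and the triple-modular-redundancy (TMR) voting scheme are assumptions for demonstration, not any vendor’s methodology — but it captures the essence of a fault-injection campaign: corrupt one redundant copy of a result and check that the system still produces the right answer.

```python
import random

def compute(x):
    """The 'golden' function the safety-critical block is supposed to perform."""
    return x * 2 + 1

def inject_seu(value, bit):
    """Model a single-event upset: flip one bit of a computed result."""
    return value ^ (1 << bit)

def tmr_vote(a, b, c):
    """Triple-modular-redundancy majority vote: two matching copies win."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    raise RuntimeError("no majority -- enter fail-safe state")

# Fault-injection campaign: corrupt one of three redundant copies and
# check that the voter still masks the fault.
random.seed(0)
for _ in range(1000):
    x = random.randrange(2 ** 16)
    copies = [compute(x)] * 3
    copies[random.randrange(3)] = inject_seu(compute(x), random.randrange(16))
    assert tmr_vote(*copies) == compute(x)
print("1,000 single-fault injections, all masked by TMR")
```

The interesting verification question is not the passing case but the exhaustive one: understanding every mechanism by which the vote itself could fail.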
What’s inside the box?
Simon Davidmann, CEO of Imperas, said one problem is that the Tier 1 suppliers developing the electronic/software components that the car manufacturers integrate are all developing black boxes. “The integrators don’t really know what’s inside them, so they have to live with them. The problem is those black boxes last for years and years. With technology and end-user requirements advancing rapidly, this has become almost unacceptable to consumers because it takes so long for all of this to happen. New features and capabilities are just too slow to reach the market. A simple example is all the new ADAS technology, which is a combination of hardware and software, but systems being developed now probably won’t show up in cars for maybe five years.”
Those black boxes include such technologies as driver assist, backup cameras, blind-spot detection, active cruise control and automatic steering. “Whether it’s a very low power processor on the edge of the IoT, or whether it is an ECU in a braking system in a car, it’s pretty sophisticated processing,” Davidmann said. “It might have low power, it might have high performance needs, but basically, more and more of them are not just hardware. They are hardware with an awful lot of software in it.”
Verification to the rescue
To improve time to market, suppliers are looking to use fewer hardware prototypes and more simulation technologies. Prototypes are too slow and there are never enough of them. On top of that, with an almost constant barrage of software changes, it’s hard to keep those prototypes up to date.
The ISO 26262 standard is shaking things up with requirements such as showing — depending on the ASIL level of the end product, which could be braking, infotainment or seat control — that the software is tolerant of all sorts of potential problems, such as single-event upsets. SEUs have long been a concern in the mil/aero market, where stray alpha particles or neutrons can cause soft errors.
“The way that the industry is attacking that—and this is verification because you have to verify that your software is tolerant, whether it is because you’ve used redundancy in the hardware, or whether you’ve used clever design in the software or whatever—is with fault simulation coming back to help with this verification,” Davidmann said. “Fault simulators were great for netlists and gates, but moving to RTL and formal verification has removed the need for gate-level fault simulation.”
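For readers unfamiliar with fault simulation, the idea is straightforward: enumerate the stuck-at faults a circuit can suffer, then check whether a set of test vectors exposes each one. The toy Python model below is a minimal sketch — the three-gate netlist and its net names are invented for illustration — but it computes fault coverage the same way a gate-level fault simulator would.

```python
from itertools import product

# A toy gate-level netlist: n1 = a AND b, n2 = NOT c, y = n1 OR n2.
NETS = ["a", "b", "c", "n1", "n2", "y"]

def evaluate(a, b, c, stuck=None):
    """Evaluate the netlist; `stuck` optionally forces one net to 0 or 1."""
    def s(net, val):
        return stuck[1] if stuck and stuck[0] == net else val
    a, b, c = s("a", a), s("b", b), s("c", c)
    n1 = s("n1", a & b)
    n2 = s("n2", 1 - c)
    return s("y", n1 | n2)

vectors = list(product([0, 1], repeat=3))              # exhaustive for 3 inputs
faults = [(net, sv) for net in NETS for sv in (0, 1)]  # all stuck-at faults

# A fault is detected if any vector makes the faulty circuit's output
# differ from the fault-free circuit's output.
detected = [f for f in faults
            if any(evaluate(*v, stuck=f) != evaluate(*v) for v in vectors)]
print(f"fault coverage: {len(detected)}/{len(faults)} stuck-at faults")
```

At real design sizes the fault list and vector set cannot be exhaustive, which is exactly why the technique is being revived with smarter fault pruning at RTL.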
At the same time, automotive reliability requirements are making it more difficult for companies to develop products once and use them across multiple markets. A device that works fine in the IoT space is subject to completely different regulations in the automotive world.
“These components must adhere to the same regulations as other automotive devices,” said David Kelf, vice president of marketing at OneSpin Solutions. “Many of these sensors must be fully inclusive and communicate wirelessly, driving low-power verification requirements. They are often hard to replace, so high reliability becomes a factor. They often are delivered as low-cost modules in high volumes, which means standard-part verification practices become important.”
All of this is part of the push for safer vehicles, which in the case of cars is driven by liability and insurance, said John Brennan, product management director at Cadence. “If you don’t have a safe vehicle you are in trouble. If you look at the ISO 26262 standard, it goes to great lengths to explain what you need to do to ensure the integrity of the design process, the verification process, and even the tool suitability and selection process. It is very extensive. It’s going to change everything.”
Looking differently at verification and test
Most of the testing done in the IoT space focuses on what’s known as positive testing — defining what a design is supposed to do, then running tests to prove it meets those specifications. Automotive design takes the opposite approach, looking at what happens if something fails.
“What happens if the electronic system to the steering wheel fails, or that one bit gets out of place — it could be anything,” Brennan said. “Now we are responsible for ensuring the integrity of the design during things that are unexpected. And that is where negative testing comes in. Negative testing says if something happens you have a fail-safe. It means that you fail in a safe mode, you fail safely or you have an alternative mechanism to catch that failure. Sometimes it is redundant circuitry. Sometimes it is DRC error checks. And sometimes it’s simple parity checks. There are all these additional things that automotive suppliers are doing to make sure that if there is a failure, it is either absorbed and the user never sees it, or that you fail in a safe mode.”
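Brennan’s mention of parity checks suggests a minimal sketch of what negative testing looks like in practice. The frame format and names below are invented for illustration: the point is that the receiver must detect a corrupted frame and drop into a fail-safe mode rather than act on bad data.

```python
def parity(word):
    """Even parity over a 16-bit word."""
    return bin(word).count("1") & 1

def send(command):
    # Append a parity bit to the 16-bit command.
    return (command << 1) | parity(command)

def receive(frame):
    """Negative-testing target: must fail safely on a corrupted frame."""
    command, p = frame >> 1, frame & 1
    if parity(command) != p:
        return "FAIL_SAFE"          # e.g., hold the last known-good state
    return command

# Positive test: an intact frame is accepted.
assert receive(send(0x1234)) == 0x1234

# Negative tests: flip any single bit in flight -- the receiver must
# detect the corruption and fall back to the fail-safe mode.
for bit in range(17):
    assert receive(send(0x1234) ^ (1 << bit)) == "FAIL_SAFE"
```

The negative tests outnumber the positive one, which mirrors the shift in effort Brennan describes.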
This has a big impact on verification. “What this means from a verification perspective is that you basically take a design — and this is something that our community really hasn’t done in the past — and identify all of the possible failure modes that the design could go into,” he explained. “Then you emulate or simulate all of those failure modes. You can imagine if you have a giga-gate design, the number of failure modes could be quite large. So you have to be smart about how you go about exercising a design and simulating all those failure modes. A lot of work is being put into trying to figure out how to automate this. How do we make it a repeatable process? How do we know that we’ve identified all of the possible failure modes?”
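One way to automate this, sketched below with a toy lockstep design and invented names, is to classify each injected failure mode the way ISO 26262 does: “safe” if it is masked, “detected” if a safety mechanism flags it, and “residual” — the dangerous case — if it corrupts the output silently.

```python
def system(x, fault=None):
    """Toy lockstep pair: a duplicated computation plus a comparator.
    `fault` corrupts the input ('input', a common-cause fault), the
    main path ('main'), or the checker copy ('chk')."""
    if fault == "input":
        x ^= 1                        # common-cause fault hits both copies
    main = (x + 1) ^ (1 if fault == "main" else 0)
    chk = (x + 1) ^ (1 if fault == "chk" else 0)
    return main, main != chk          # (output, detection flag)

def classify(fault, x=6):
    """ISO 26262-style triage of one injected failure mode."""
    golden, _ = system(x)
    out, flagged = system(x, fault)
    if out == golden:
        return "safe"                 # masked: no effect at the output
    return "detected" if flagged else "residual"  # residual = dangerous

for fault in (None, "main", "chk", "input"):
    print(f"{str(fault):>6}: {classify(fault)}")
```

Note that the common-cause input fault defeats the duplicated logic entirely — a small demonstration of why enumerating failure modes, not just adding redundancy, is the hard part.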
Liability in the automotive market is a brand new worry for most chipmakers, but the lessons can be extended to other markets, such as mobile. Brennan suggested that negative testing can help. “As devices and process geometries get smaller, we are seeing mechanical failures of silicon. The transistors don’t last as long. Bigger transistors last longer, smaller transistors don’t, and there is less immunity to noise, EMI and RFI as the lines get closer together. The consumer companies on the bleeding edge of that, because they want the smallest possible design, have challenges not much different from those in automotive. You have to ensure the integrity of that design. In consumer markets you face huge recalls. In automotive the risks are just as high. What’s common between them is that the cost of failure is too high not to manage the negative side of what can go wrong.”
Safety-critical systems require an extra layer of verification, as well as an extra layer of sophistication that must be built into the design. Designs are created with a failure-rate budget. For a typical SoC used in an automotive application, there must be fewer than 10 failures in 1 billion hours of operation — a rate of 10 FIT. Put another way, a chip has to operate 100 million hours, on average, between failures.
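A quick back-of-the-envelope calculation shows what that budget means in practice. The fleet numbers below are illustrative assumptions, not figures from the standard:

```python
FIT_TARGET = 10                        # failures per 1e9 device-hours (10 FIT)

mtbf_hours = 1e9 / FIT_TARGET          # mean operating hours per failure
print(f"mean time between failures: {mtbf_hours:.0e} hours")   # 1e+08

# Hypothetical fleet (illustrative assumptions only):
fleet_size = 1_000_000                 # one million vehicles
hours_per_year = 400                   # rough annual operating hours per car
lifetime_years = 15

expected = FIT_TARGET * fleet_size * hours_per_year * lifetime_years / 1e9
print(f"expected failures across the fleet lifetime: {expected:.0f}")  # 60
```

Even at a compliant 10 FIT, a large fleet still accumulates failures over its lifetime, which is why fail-safe behavior matters as much as the raw failure rate.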