A design for safety methodology is essential to trim automotive design costs, but at this point it’s a work in progress.
Designing chips for the automotive market is adding significant overhead, particularly for chips with stringent safety requirements.
On the verification side it can add an extra 6 to 12 months of work. On the design side, a processor can take six more man-months to develop than the same processor targeted at the mobile market. And when it comes to complex electronic control units (ECUs) or SoCs, the difference is even more dramatic.
Automotive chips also must be accompanied by reports on failure modes, effects, and diagnostic analysis (FMEDA). That adds significant effort to the design team’s workload, and it slows down time to profitability.
“If you’re not thinking about it from the beginning, it’s even harder,” said Angela Raucher, product line manager for ARC EM processors at Synopsys. “If chipmakers with commercial or consumer-based SoCs want to move into automotive—and they’ve used processor IP that has no ECC (error correcting code)—they have to add a wrapper around that. Further, taking a standard microcontroller off the market and using it for automotive applications could take two to three years.”
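For teams that have never used it, the idea behind such an ECC wrapper is conceptually simple. The sketch below illustrates it with a textbook Hamming(7,4) code in Python; production processor IP would use wider single-error-correct/double-error-detect codes implemented in hardware, so this is purely illustrative.

```python
# Minimal sketch of the idea behind an ECC "wrapper": a Hamming(7,4)
# single-error-correcting code around a 4-bit data word. Real processor
# IP uses wider SECDED codes in hardware; this is only illustrative.

def hamming74_encode(data):               # data: list of 4 bits [d1, d2, d3, d4]
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4                     # covers codeword positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4                     # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4                     # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]   # codeword positions 1..7

def hamming74_decode(code):
    c = [None] + code                     # 1-indexed for readability
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s3 = c[4] ^ c[5] ^ c[6] ^ c[7]
    syndrome = s1 + 2 * s2 + 4 * s3       # 0 means no detected error
    if syndrome:
        c[syndrome] ^= 1                  # correct the single flipped bit
    return [c[3], c[5], c[6], c[7]], syndrome

# Usage: inject a single bit flip and show the wrapper recovering the data.
word = [1, 0, 1, 1]
cw = hamming74_encode(word)
cw[4] ^= 1                                # simulate a radiation-induced bit flip
recovered, syndrome = hamming74_decode(cw)
assert recovered == word and syndrome == 5
```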
This is where a design for safety (DFS) methodology fits in. While this has existed in safety-critical markets, it is relatively new in the automotive world because until the advent of driver-assisted and autonomous driving, advanced chips were generally confined to infotainment. That has changed significantly over the past couple of years, and DFS is now beginning to show up from the subsystem level all the way up to the system level.
“In RF or radar technologies, that becomes much more difficult because the physical interfaces are really challenging,” said Luke Schreier, director of automated test marketing at National Instruments. “Sometimes that requires over-the-air updates or complicated antennas or channel models, or things that are extremely processor-intensive themselves — a lot of digital signal processing. Those algorithms are being asked to do a ton of the heavy lifting, which speaks to the trend of putting as much of the RF into CMOS as possible—or the general trend of digitizing as much analog as possible. You expect to get imperfections in the silicon by choosing some of those other processes, and then you compensate or correct for it.”
The company has been involved in general automotive vehicle test and hardware-in-the-loop technology for some time, where some portion of the world is emulated in order to avoid expensive and time-consuming drive testing.
“When you have that kind of methodology—with algorithms that have to be borderline amazing when you think through the patterns and the situations that you’d have to test for in approaching an intersection, trying to not hit a person in a self-driving car—you can only accomplish so much at the subsystem level. You have to bake it up to the system level. But depending on whether you’re an OEM, or the silicon provider, you need a hierarchical relationship between those types of tests so you can leverage as much from one to the other. You don’t want the OEM having one methodology that’s completely different from the vehicle manufacturer. Likewise, as much of that can be passed on to the silicon for validation, the better,” Schreier explained.
Many different test capabilities are needed, beyond just blasting vectors or even protocol-aware methodologies, because a lot of physical stimulus must be intermixed to emulate the types of signals and conditions the IC may be expecting.
“It may be expecting a camera input or all of these physical sensor inputs,” Schreier said. “Then you spend as much time testing the algorithm as you do the physical part of the chip. This complexity around sensors, RF, and analog is creating a completely different dimension of complexity. Even if you can scale to transistor counts, being able to mix all these technologies and test requirements together in one package is pretty remarkable.”
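The sketch below gives a toy version of that kind of closed-loop test, with an invented brake controller standing in for the device under test and a simple emulated “world” supplying its sensor inputs. The controller and thresholds are hypothetical, not any vendor’s API.

```python
# Toy hardware-in-the-loop sketch (illustrative only): a simulated "world"
# feeds sensor readings to a device-under-test controller, and the test
# harness checks the closed-loop behavior instead of driving real miles.

def brake_controller(distance_m, speed_mps):
    """Device under test: decide brake command from emulated sensor inputs."""
    time_to_collision = distance_m / speed_mps if speed_mps > 0 else float("inf")
    return 1.0 if time_to_collision < 2.0 else 0.0   # full braking under 2 s TTC

def run_scenario(initial_distance_m, speed_mps, dt=0.05, max_decel=8.0):
    """Emulated plant: integrate vehicle motion while querying the controller."""
    distance, speed = initial_distance_m, speed_mps
    while speed > 0:
        brake = brake_controller(distance, speed)
        speed = max(0.0, speed - brake * max_decel * dt)
        distance -= speed * dt
        if distance <= 0:
            return False                              # collision: scenario failed
    return True                                       # stopped before the obstacle

# Sweep scenarios the way a HIL rig would, far faster than drive testing.
for d0 in (10, 20, 40):
    for v0 in (5, 10, 15, 20):
        print(d0, v0, "pass" if run_scenario(d0, v0) else "FAIL")
```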
What else does it do?
A key consideration when developing safety-related systems is minimizing the risk of unintended behavior. This is a serious concern in the automotive world, and it adds to the complexity of the whole verification process.
“It’s never possible to eliminate risk entirely, but it’s essential to reduce that risk to an acceptable level given the potential consequences,” said Paul Black, senior product manager for compilers at ARM. “This dictates that the tools used during development must be designed, developed, validated and maintained under regimes that strive to maximize predictability of functionality and minimize risk.”
For software tools, this typically includes mapping the tool development process closely to a reference ‘V model,’ meant to ensure reliable and dependable operation through structured requirements, defect management, as well as comprehensive test and validation processes. And all of that needs to be mapped directly to architecture, module design, and integration activities, he said.
A lot of these methodologies are still works in progress. There currently isn’t one path that developers are taking on the hardware side, one large EDA company they work with, or a single way to use all of the tools, said Robert Bates, chief safety officer in the embedded systems division at Mentor Graphics.
The same is true on the software side, he said. “There are a lot of ways to get there, and tooling is almost necessary to enable them. If you’re looking at something like autonomous driving, the software is so complex, and it’s one step removed from what it’s actually doing. If I’ve got a neural network or some other kind of machine learning system, and the system is focused on the machine learning aspect of it, the safety becomes even more difficult to manage. You have to control both the underlying quality of the implementation, as well as the underlying quality of the data that you use to train it.”
Most of that can’t be accomplished without at least some degree of tooling because it gets too big for humans to deal with, Bates pointed out. “The problem is that a lot of the hardware tooling out there is rudimentary. For the software, the tooling is also really rudimentary. Static analysis, when I compare that to the capabilities on the EDA side, it’s like apples and kumquats. In that sense, there’s no one path because no one has come up with the one path.”
Other industries and parts of the automotive industry have leveraged their modeling environment as an extension of their software development environment, he said. “Simulink [from MathWorks], which is used for all of their simulation, is a much more rigorous methodology. But the problem there is that you can’t develop fundamental pieces of the application in that kind of environment. I’ve never seen a good tool, for example, to model an operating system. Those are invariably handwritten, whether it’s AUTOSAR or a regular RTOS or something like Linux. And once you start handwriting, it opens up a whole can of worms. But for the application level, which is what the Tier 1s and OEMs try to think about when they think about safety, they are in a better starting point because the modeling tools give them that level of rigor.”
Shorter driving time
All of this is done so systems companies don’t have to drive millions of miles to prove their vehicles are safe.
“With simulated or emulated environments, even if it is something as ‘simple’ as pure software, there’s nothing stopping a company like GM from setting up a thousand servers and not having to drive a billion miles,” Bates said. “But in their simulated environments they can easily get billions of miles over the course of a year or two so that they can actually refine their safety strategy, both from a software and hardware perspective, as they are also getting the real-world experience that’s necessary to properly train the AIs.”
This is all part of the DFS methodology, and it meshes with other trends in system design—the whole shift left concept—to run multiple steps concurrently in order to speed up time to market.
“Generally, it’s understood that we need to add diagnostics into a system to reduce the failure in time (FIT) rates,” said Adam Sherer, verification product management director at Cadence. “But the exact set of methods, where they need to be placed, the extent to which they need to reduce that FIT, is still subject to each company’s own methodology right now. I don’t see a consistency, which is okay. We have companies that are building ASIL-B, ASIL-D components, following their internal methods, and safety has been part of the industry since it has been an industry. So some of those elements are consistent and true. What is beginning to emerge, though, is where to place the diagnostics and the extent to which they are measured. That’s what the industry is currently exploring, and we’re going to see more of that begin to coalesce with the next ISO 26262 standard update [due sometime in early 2018]. But it will be in progress for some time.”
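As a rough illustration of how diagnostic coverage feeds into that math, consider the simplified calculation below. It is a deliberately reduced view, not the full ISO 26262 metric set, and the numbers are invented for the example.

```python
# Simplified illustration: adding a diagnostic with a given coverage
# reduces the residual failure rate that counts against the FIT budget.
# Numbers are made up; real analyses use the full ISO 26262 metrics.

base_fit = 100.0            # raw failure rate of the block, failures per 1e9 hours
diagnostic_coverage = 0.99  # fraction of dangerous faults the safety mechanism catches

residual_fit = base_fit * (1.0 - diagnostic_coverage)
print(f"residual FIT: {residual_fit:.1f}")   # 1.0 FIT left undetected

# Where that diagnostic sits (memory ECC, lockstep core, bus parity, ...)
# and how much coverage it must claim is what is still being settled by
# each company's own methodology.
```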
Another piece of this is planning, implementation and signoff of those systems.
“When it comes to the diagnostics, having an implementation platform that is integrated is absolutely critical, so you’re not handing files off in between different tools,” said Rob Knoth, product management director of the Digital & Signoff Group at Cadence. “You’re not adding areas where user error can insert bugs into the chip. Gone are the days of the ‘rainbow flow,’ where CAD departments would pick absolutely the best in breed for every single tool, regardless of what vendor it was, and then have endless, complicated scripts to stitch those tools together. We’re definitely seeing more customers moving away from that mentality, first because of EDA consolidation, but also because it’s so much less error-prone to have integrated tools and flows with verification, and implementation talking to each other working with emulation boxes. You want to have one story rather than a piecemeal tool flow and tool methodology that’s buggy.”
Technically speaking, there are two halves to a DFS methodology—systematic and random.
“The systematic half is essentially classic verification as we understand it today,” said Dave Kelf, vice president of marketing at OneSpin Solutions. “It’s the testing of the functionality of a design to make sure it has done the job right, tool flows work correctly, and so on. For automotive, it obviously has to be much more rigorous than regular verification because 100% coverage is required. The coverage has to be right up to the requirements for the original device, so it’s not enough to say, ‘Here’s a spec, let’s make sure this spec is implemented correctly.’ You have to do that, but you have to go beyond that and identify the basic requirements for the whole device in the first place, and determine if those requirements have been met correctly.”
Kelf explained that the systematic requirements for the flow are multi-staged and multi-layered, where you start from the requirements, figure out what the implementation of each one of those requirements is, and determine the implementation plan for that verification and how the verification plan is addressed by the different verification tools. At the end, the coverage is gathered together and tested, not just against the verification plan but right back up to the requirements. “What they are doing is breaking the requirements up to test them individually. That’s different than a regular verification environment where you might say, ‘Here’s the functionality of the whole chip, let’s write some tests that test that whole functionality.’ Here, you’ve got to write a bunch of tests to make sure that one specific part of the design does one specific thing. Then you write a whole separate, independent set of tests to do another particular test so you can follow and monitor them.”
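A hypothetical sketch of that kind of requirements-driven roll-up, with invented requirement and test names, might look like the following, where coverage is reported per requirement rather than only against a chip-level verification plan.

```python
# Hypothetical traceability sketch: each safety requirement maps to its own
# small test set, and pass/fail results roll back up to the requirement.
# Requirement and test names are illustrative, not from any real flow.

requirements = {
    "REQ-BRAKE-01": ["test_brake_cmd_latency", "test_brake_cmd_range"],
    "REQ-WDOG-02":  ["test_watchdog_timeout", "test_watchdog_reset_path"],
}

# Results as they might come back from the verification tools (assumed).
test_passed = {
    "test_brake_cmd_latency": True,
    "test_brake_cmd_range": True,
    "test_watchdog_timeout": True,
    "test_watchdog_reset_path": False,
}

for req, tests in requirements.items():
    covered = all(test_passed.get(t, False) for t in tests)
    print(f"{req}: {'covered' if covered else 'NOT covered'}")
```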
Kelf added that while many engineering teams still use simulation today, mapping requirements to UVM tests is fairly difficult. “You have to come up with a test set, and the problem is if you’ve got a chip, you’ve got to get that chip into a certain state to start running the test. To get it into a state, quite often you’ve got to run lots of tests. So if you’re breaking these requirements up into individual groups, for each one of these you’ve got to run the chip up to a certain state and then start doing the testing. It adds a lot more to the burden of testing. There’s also a lot of extra stuff that goes into the UVM testbench, which isn’t part of the requirement.”
Formal verification technology can play a role here because requirements can be mapped to assertions much more easily, he said. “You can say, ‘We’ve got this requirement—this device has got to do something in this timeframe, and make sure it never does this, this or this.’ You might have something like that, which can map to an assertion directly. When you’ve got an assertion, you don’t have to run the design up into a certain state. You can just go in and ask if it will ever happen.”
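The toy example below captures that idea in miniature: instead of driving a design into a state with long test sequences, every reachable state of a small, invented arbiter model is explored exhaustively and the requirement is checked as a property. It is a sketch of the concept, not any formal tool’s actual flow.

```python
# Toy illustration of checking a requirement as a property over all
# reachable states, rather than reaching states with long test sequences.
# The arbiter model and the 2-cycle bound are invented for the example.

def next_state(wait, req):
    """Hypothetical arbiter: grants (and resets the wait counter) after 2 cycles."""
    if not req or wait >= 2:
        return 0          # idle, or grant issued this cycle
    return wait + 1       # request still pending

def requirement(wait):
    return wait <= 2      # "a request is never left waiting more than 2 cycles"

# Breadth-first reachability from the reset state over all input values.
frontier, seen, violations = {0}, {0}, []
while frontier:
    nxt = set()
    for s in frontier:
        for req in (0, 1):
            ns = next_state(s, req)
            if not requirement(ns):
                violations.append((s, req, ns))
            if ns not in seen:
                seen.add(ns)
                nxt.add(ns)
    frontier = nxt

print("reachable states:", sorted(seen))       # [0, 1, 2]
print("requirement violations:", violations)   # [] -> property holds on this model
```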
In addition to the systematic side, there is the random side, which tests to make sure that if a fault does somehow get introduced into the chip during operation, such as through electromigration or a radiation-induced bit flip inside a memory, the chip is able to recover and continue operating. Error-correcting algorithms and safety-handling mechanisms are added into the chip to do this, and all of them must be verified.
“Many companies are running fault simulation to do this, but it takes a long time to run, and the hardest faults to prove are the ones that might dissipate into the logic, and never make it to a hardware-handling mechanism,” Kelf said. “These are good faults, but you can’t prove that because the fault simulator never stops them. It never makes it to the handling mechanism. Here, you need to prune out those faults. That is often done by hand, which is extremely tedious and a waste of engineering talent.”
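The sketch below shows the flavor of that fault classification on a made-up logic function: injected stuck-at faults either propagate to an output where a checker can catch them, or never propagate at all, which makes them the “safe” faults that otherwise must be pruned by hand.

```python
# Minimal fault-injection sketch of the "random" side: compare a faulty
# copy of a small logic function against a golden copy, inject stuck-at
# faults, and classify each fault. Faults that never reach the output are
# the "safe" faults Kelf describes. The circuit is made up for the example.
from itertools import product

def golden(a, b, c):
    return (a & b) | ((not b) & c)

def faulty(a, b, c, fault=None):
    """Same function with a redundant term, plus an optional stuck-at fault."""
    n1 = a & b
    n2 = (not b) & c
    n3 = a & (not a)                     # redundant net, always 0
    if fault == "n1_stuck_0": n1 = 0
    if fault == "n2_stuck_1": n2 = 1
    if fault == "n3_stuck_0": n3 = 0     # can never change the output
    if fault == "n3_stuck_1": n3 = 1
    return n1 | n2 | n3

for fault in ("n1_stuck_0", "n2_stuck_1", "n3_stuck_0", "n3_stuck_1"):
    detected = any(
        golden(a, b, c) != faulty(a, b, c, fault)    # comparator fires
        for a, b, c in product((0, 1), repeat=3)
    )
    print(fault, "detected" if detected else "never propagates (safe)")
```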
Conclusion
Creating a design for safety methodology is a complicated process. It varies from company to company, and frequently from one design to the next.
“It varies a lot depending on how much experience they have in the market,” said Raucher. “If they have experience in the market, they’ve always been thinking about it. They have looked for high quality vendors and focused on doing this random fault verification, so they have tools that help them do that. Some design teams that are new to automotive, moving from either consumer or enterprise-based products, have to rely a lot on recommendations from trusted partners.”
And because the automotive market is intensely competitive, time-to-market pressure is extreme. With techniques like fault simulation taking weeks to run, automotive OEMs are trying to find ways to speed this up.
It will take some time before a standardized DFS methodology emerges—and that will only happen if there is enough pressure applied across the automotive ecosystem. In the meantime, companies are just beginning to wrap their heads around what a DFS flow really entails and why it’s important.