Adding Safety Into Automotive Design

OEMs are demanding safety-readiness for more components, altering the dynamics of the design process.

The ISO 26262 spec is a household term for anyone even remotely involved with the automotive industry today. Increasingly, though, it is being used interchangeably with safety-readiness across the entire supply chain.

ISO 26262 compliance is a prerequisite for IP and chips used in an increasing number of automotive applications. It applies to systems, to software, and to individual products. And because failover mechanisms are tied to other systems, even non-critical systems are being drawn into ISO 26262 compliance. That requires planning, and a different way of thinking about design.

“You have to think about safety from the beginning,” stressed Lakshmi Mandyam, vice president of automotive at Arm. “Doing more up front with your usage of safe IP, or IP that has the right safety collateral, will simplify your design, as opposed to thinking about it as an afterthought or saying the software guys can take care of it. The more you do up front, the simpler it gets down the value chain. Ultimately, this is where you will be delivering more value to your partners because the time-to-market is essential, especially if you look at applications that are evolving really fast — ADAS, autonomous, IVI (in-vehicle infotainment). It’s not just at the silicon part end, but also at the OEM, Tier 1 and service provider end.”

Given the diversity of the automotive supply chain, OEMs generally capture safety and other requirements in specifications broken down by system.

“For an advanced driver assistance system that’s very much tied into safety, you break it down into the different architectures of the car from the OEM and provide their requirements for each of those architecture subsystems,” said Ron DiGiuseppe, automotive IP segment manager in Synopsys’ Solutions Group. “Certainly for ADAS, the specs coming down from the automakers are clear. They would either want an ASIL B-level safety compliance for those ADAS systems, or depending on what the actual system is, it could be a higher level of safety such as ASIL D.”

The problem is that not everything gets used in ways that IP vendors expect. “When OEMs say, ‘safety ready’ or ‘ASIL D ready’, or ‘ASIL D capable,’ what it means is if you use this ingredient, it’s going to basically enhance the safety of the overall system that it’s in and it’s going to enhance it by increasing the diagnostic coverage capabilities of the overall system to a certain level,” said Kurt Shuler, vice president of marketing at ArterisIP. “Of course, it’s always up to the system maker, because whether it’s Arm or Arteris or any IP provider, or sometimes even chip providers, you’re designing a product with no idea what the system’s going to be.”

Also, ISO 26262 compliance requires robust computation of several hardware metrics, including the single-point fault metric (SPFM), the latent fault metric (LFM), and the probabilistic metric for random hardware failures (PMHF), noted Jörg Grosse, product manager for functional safety at OneSpin.

“Typically this is done within the FMEDA, which is owned by the functional safety engineers. We are observing confusion and increasing effort when it comes to computing those metrics for SoCs because tasks are pushed across to the functional verification teams without clear methods or tool flows. Massively increased chip size, shrinking geometries, and higher frequencies are imposing new challenges for compliance, such as new classes of failure modes. Innovation and re-assessment of current flows and methods are needed to ensure that crucial resources are effectively used and to prepare for these new challenges,” he said.
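As a rough illustration of what computing these metrics involves, the sketch below derives SPFM and LFM from a tiny FMEDA-style table, using the formulas commonly given for ISO 26262-5. Everything specific here is an assumption made for the example: the failure-mode names, FIT rates, and coverage values are invented, every fault is treated as safety-related, and the targets table should be checked against the published standard. PMHF is not computed; only the residual failure rate that feeds into it is reported.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    fit: float          # failure rate in FIT (failures per 1e9 device-hours)
    dc_residual: float  # assumed diagnostic coverage of residual faults, 0..1
    dc_latent: float    # assumed diagnostic coverage of latent faults, 0..1


# Hypothetical FMEDA rows; none of these numbers describe a real device.
FMEDA = [
    FailureMode("cpu_logic", fit=100.0, dc_residual=0.99, dc_latent=0.90),
    FailureMode("sram_ecc", fit=200.0, dc_residual=0.999, dc_latent=0.90),
    FailureMode("interconnect", fit=50.0, dc_residual=0.98, dc_latent=0.80),
]

# Commonly cited ISO 26262-5 targets: (minimum SPFM, minimum LFM).
TARGETS = {"B": (0.90, 0.60), "C": (0.97, 0.80), "D": (0.99, 0.90)}


def hardware_metrics(modes):
    """Return (SPFM, LFM, residual FIT), treating every fault as safety-related."""
    total = sum(m.fit for m in modes)
    # Faults that escape the safety mechanisms (single-point + residual).
    residual = sum(m.fit * (1.0 - m.dc_residual) for m in modes)
    # Remaining multiple-point faults that are neither detected nor perceived.
    latent = sum(m.fit * m.dc_residual * (1.0 - m.dc_latent) for m in modes)
    spfm = 1.0 - residual / total
    lfm = 1.0 - latent / (total - residual)
    return spfm, lfm, residual


def highest_asil_met(spfm, lfm):
    """Return the highest ASIL whose SPFM and LFM targets are both met, else None."""
    met = [asil for asil, (s, l) in TARGETS.items() if spfm >= s and lfm >= l]
    return met[-1] if met else None


spfm, lfm, residual_fit = hardware_metrics(FMEDA)
print(f"SPFM={spfm:.4f} LFM={lfm:.4f} residual={residual_fit:.2f} FIT")
print("Highest ASIL target met:", highest_asil_met(spfm, lfm))
```

With these made-up numbers, SPFM clears the ASIL D target but LFM falls short of it, so the block as a whole only meets the ASIL C targets. That is the kind of result that sends a team back to improve latent-fault coverage.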

Because IP providers cannot know the final system in advance, most of the IP created within the semiconductor industry by the ingredient providers is designated under ISO 26262 as a safety element out of context (SEooC), specifically because, technically, only a system can have an automotive safety integrity level (ASIL), Shuler noted.

“Interestingly, something that catches people and is something that’s really important is that whether it’s a chip vendor looking at IP providers, a Tier-1 looking at the chip vendor, or an OEM looking at the Tier 1, they’re not just looking at whether this technical widget does what they say it’s going to do. A lot of attention is paid to the analysis of the widget, whether it’s an IP widget, a chip widget, even a system. And that’s where the term failure modes and effects and diagnostic analysis (FMEDA) comes into play. FMEDA quantitatively looks at that.”


Fig. 1: ISO 26262 overview. Source: ISO.org

Methodology counts
Beyond the technical aspects, playing in the automotive world means methods and people are analyzed, as well.

“It’s great if you created this product that has all these technical things,” Shuler said. “But if the designers don’t know anything about safety or engineering and they got the code from a bunch of random different places and bolted it together, that can be really bad. It can cause systematic errors.”

Adhering to tight safety protocols means new questions are asked of the design team in order to deal with the systematic errors. “How do you design quality into the product? How do you design safety into the product? How do you not do things that are going to cause gaps that could lead to a future safety issue? When people start getting assessed within the value chain, it’s usually the person above you. But it also can be a third-party assessor that your customer hires, or that you hired yourself. People are surprised because they spend more time, or as much time, looking at your quality processes and how your people are trained. They’ll go to a verification engineer and ask, ‘What do you do when you have a bug reported to you? How do you know it was closed? How do you know it went into the proper release of this particular thing?’ They’re looking in detail at the people, and whether they can do their jobs, and whether you have a process to enable them to do their job. That is at least half of the assessment. It’s not just looking at the speeds and feeds,” he said.

One of the critical work products that must be produced for ISO 26262 compliance is dependent failure analysis (DFA), which covers the interactions of the hardware devices with the system, DiGiuseppe noted. “The standard does break it into different hardware components, and the software components that go with those individual components to make up a subsystem. We’re not talking about automotive full systems (that’s the car) but subsystems like ADAS or the infotainment system, or the system and the interaction between different components of a module. Even drilling down to the interaction of different IP blocks within a chip, all of this needs to have what’s called a dependent failure analysis to see if there’s a failure in one block, or at a system level, or failure in one chip on a module. If there’s a failure of one particular component, how would that affect the other components in the system? This applies both to IP blocks that are interacting with other IP blocks, and to chips that reside on a board.”
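The question at the end of that quote, how one failure affects other components, can be pictured as reachability in a dependency graph. The sketch below is a toy model only: a real dependent failure analysis is an engineering work product covering cascading and common-cause failures, not a graph search. The component names and the `DEPENDENTS` map are hypothetical.

```python
from collections import deque

# Hypothetical map: component -> components that directly depend on it.
DEPENDENTS = {
    "pll": ["cpu_a", "cpu_b"],
    "dma": ["sram"],
    "sram": ["cpu_a"],
    "cpu_a": ["comparator"],
    "cpu_b": ["comparator"],
    "comparator": [],
}


def affected_by(failed, dependents):
    """Return every component reachable from `failed` (cascading effects)."""
    seen = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def common_cause_initiators(redundant_pair, dependents):
    """Return components whose failure would reach both redundant elements."""
    first, second = redundant_pair
    return sorted(
        comp
        for comp in dependents
        if comp not in redundant_pair
        and {first, second} <= affected_by(comp, dependents)
    )


print(sorted(affected_by("dma", DEPENDENTS)))
print(common_cause_initiators(("cpu_a", "cpu_b"), DEPENDENTS))
```

In this toy graph, a DMA failure cascades through the SRAM to one core, while the shared PLL is flagged as a potential common cause because its failure reaches both redundant cores at once.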

Another aspect of safety readiness as dictated by ISO 26262 is decomposition, pointed out Sanjay Pillay, functional safety technologist for ICVS at Mentor, a Siemens business. “If the system is an ASIL D system, it can be decomposed into multiple components, each of which could be somewhere between ASIL B and ASIL D. The OEM wants to have the safety manual for the components, with all the collateral that goes with it, certified as safety ready. What that means for the semiconductor and systems vendors is providing the whole collateral or a framework to deliver that collateral so that when the semiconductor companies go to the OEMs, they have everything they need to be safety ready.”
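To make decomposition concrete, the sketch below encodes the decomposition schemes commonly tabulated for ISO 26262-9 and checks a proposed split against them. The table and the helper name `is_valid_decomposition` are illustrative assumptions and should be verified against the standard before use. The standard also requires sufficient independence between the decomposed elements, which this check does not model.

```python
# Rank of each integrity level, lowest to highest.
ASIL_RANK = {"QM": 0, "A": 1, "B": 2, "C": 3, "D": 4}

# Commonly cited schemes: parent ASIL -> allowed (higher, lower) splits.
# In the standard's notation, ("B", "B") under "D" is written B(D) + B(D).
DECOMPOSITIONS = {
    "D": [("C", "A"), ("B", "B"), ("D", "QM")],
    "C": [("B", "A"), ("C", "QM")],
    "B": [("A", "A"), ("B", "QM")],
    "A": [("A", "QM")],
}


def is_valid_decomposition(parent, first, second):
    """Return True if splitting `parent` into `first` + `second` is a listed scheme."""
    ordered = tuple(sorted((first, second), key=ASIL_RANK.get, reverse=True))
    return ordered in DECOMPOSITIONS.get(parent, [])


print(is_valid_decomposition("D", "B", "B"))   # listed scheme
print(is_valid_decomposition("D", "A", "A"))   # not a listed scheme
```

One pattern in this table is that the ranks of the two parts always add up to the rank of the parent, which is a handy sanity check when reviewing a proposed split.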

This is a huge change for new entrants into the automotive space. “There are two very distinct types of customers that we see,” Pillay said. “One are the semiconductor folks who traditionally have been suppliers to the automotive space. They have a reasonably good understanding of the whole process. They have their internal processes aligned. But even for people in that camp, that is a big change that’s coming. Then there are the new players who are getting into the market, and the big disruption that we see is happening because of the change in the requirements from the OEMs in terms of the automobile, namely autonomous driving. This is changing the scope and the rigor required from a safety perspective. The electrification of the automobile is also having an impact.”

Traditionally, automotive components have been microcontrollers that were controlling the engine, the transmission, the emissions control, the body, brakes, things like that, Pillay said. “Take that and go to autonomous driving, and suddenly the scope of the complexity is a few orders of magnitude higher. We went from a few hundred thousand gates in microcontrollers to a few hundred million gates, to now approaching a billion gates on an autonomous driving chip. That change in scope basically means everybody has to start looking at their traditional design methodologies. Adding another layer of complexity is safety, which tries to address random faults, and how a system responds when a random failure happens.”

Another approach to safety is within the cores themselves, Arm’s Mandyam asserted. “We have developed technology that can, with software configuration, either have processors that are split to run a variety of different applications or locked for safety. As an example, on the same platform there are OEMs and Tier 1s who want to deliver applications at different ASIL levels. If you look at an IVI system, it’s very performance-heavy and wants to run many different applications on many different OSes, and we would want to run those. It could be the non-safe or ASIL B applications. Then, if you had a sensor fusion module or an autonomous vehicle controller, you would want to have ASIL D safety capability, and this is where you would want to have processors locked. We have been able to develop the core such that you can run two cores in locked mode, where they’re executing the same instruction. Each operation and each instruction is checked automatically in hardware, cycle by cycle. This gives a higher level of safety without increasing the software complexity.”
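The locked mode described here, often called dual-core lockstep, can be pictured with a toy software model: two copies of a core execute the same instruction stream, and a comparator checks their results on every cycle. The sketch below is a conceptual illustration only, not Arm's implementation. The two-instruction "ISA", the class names, and the stuck-bit fault-injection hook are all invented for the example.

```python
class LockstepError(Exception):
    """Raised when the two cores disagree on a cycle."""

    def __init__(self, cycle, primary_value, checker_value):
        super().__init__(
            f"mismatch at cycle {cycle}: {primary_value} != {checker_value}"
        )
        self.cycle = cycle


class ToyCore:
    """A minimal accumulator machine standing in for a processor core."""

    def __init__(self, stuck_bit=None):
        self.acc = 0
        self.stuck_bit = stuck_bit  # hypothetical fault: this bit is forced to 1

    def step(self, op, operand):
        if op == "add":
            self.acc += operand
        elif op == "mul":
            self.acc *= operand
        else:
            raise ValueError(f"unknown op: {op}")
        self.acc &= 0xFFFFFFFF  # model a 32-bit register
        if self.stuck_bit is not None:
            self.acc |= 1 << self.stuck_bit
        return self.acc


def run_lockstep(program, primary, checker):
    """Execute `program` on both cores, comparing results every cycle."""
    for cycle, (op, operand) in enumerate(program):
        a = primary.step(op, operand)
        b = checker.step(op, operand)
        if a != b:
            raise LockstepError(cycle, a, b)
    return primary.acc


PROGRAM = [("add", 3), ("mul", 4), ("add", 5)]

print(run_lockstep(PROGRAM, ToyCore(), ToyCore()))
try:
    run_lockstep(PROGRAM, ToyCore(), ToyCore(stuck_bit=6))
except LockstepError as err:
    print("fault detected:", err)
```

The point of the model is that the application code in `PROGRAM` is unchanged between the two runs. Detection happens in the comparator, which mirrors the claim that lockstep adds safety without adding software complexity.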

Also, safety tools that contain a software test library can be used to detect which processor has an issue. “Then you can easily go into fail-operational mode by flipping the processors to be more available, and they can continue to execute at a lower level of criticality, also known as the ‘limp home’ mode,” she explained. “Let’s say there’s an issue. You can slow down the car or the autonomous system can slow down the car or move to the side of the road, then take action to be able to recover from the failure.”

This contrasts with other high-performance automotive processing architectures, in which two cores run the same software independently and the results are checked externally, at the very end, by a CPU checker with a watchdog SoC. “This is an inefficient architecture,” Mandyam said. “There is increased software complexity in terms of responsiveness, and it’s a complex software certification and supply chain. More importantly, it uses more board space and has higher power consumption.”

Conclusion
The scope of automotive design that includes safety and reliability has become so big that it must be addressed in a very different way than it has in the past.

“You can’t take traditional verification and development approaches,” said Pillay. “What we have seen is it’s forcing people to innovate in that space. It’s a non-linear scale problem. You can’t take methodologies that worked for a few hundred thousand or even a million gates and scale them to a billion gates by brute force. It’s forcing people to reevaluate and innovate.”


