Experts at the Table, part 1: How do automotive notions of safety and security compare to those in avionics?
Semiconductor Engineering sat down to discuss industry attitudes towards safety and security with Dave Kelf, chief marketing officer for Breker Verification; Jacob Wiltgen, solutions architect for functional safety at Mentor, a Siemens Business; David Landoll, solutions architect for OneSpin Solutions; Dennis Ciplickas, vice president of characterization solutions at PDF Solutions; Andrew Dauman, vice president of engineering for Tortuga Logic; and Mike Bartley, chief executive officer for T&VS. What follows are excerpts of that conversation.
SE: More people are talking about safety and security these days, perhaps driven by automotive, and this is acting as a wake-up call for many others. How much is the industry actively looking to improve in these areas rather than just paying lip service?
Landoll: The industry is working on it. I have been involved with avionics DO-254 since 2005, when the North American DO-254 user’s group was started. I see a lot of stuff going into avionics that is orders of magnitude more sophisticated and well thought through than anything we are discussing for automotive. ISO 26262 is focused like a laser on fault mitigation. It assumes you have a good design and verification process in place, you get the device into the field, and now something in the car fails. If you look at the standard, it is very similar to DO-254, where you are supposed to have this solid requirements-driven verification process with revision control and quality assurance. But in reality, who is guarding the henhouse? In avionics you have FAA certification officials, but in automotive it is very loose. When they do look, they do not know what they are supposed to be looking for, so instead they really focus on fault analysis and fault tolerance. Avionics doesn’t actually do that. They focus on FIT (failures in time) rates and do back-of-the-envelope calculations to determine that they need 12 of a particular system. You can’t do that in a car because of cost. They are solving the problem in different ways. Automotive has a long way to go before it matures into what we have in other areas.
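As a rough illustration of the back-of-the-envelope FIT arithmetic Landoll describes, the sketch below sizes redundancy from a per-unit FIT rate. Every number in it is hypothetical, chosen only to show the shape of the calculation.

```python
# Back-of-the-envelope redundancy sizing from FIT rates.
# FIT = failures per 10^9 device-hours. All numbers here are hypothetical.

FIT = 500              # assumed failure rate of one unit (failures / 1e9 hours)
MISSION_HOURS = 10.0   # assumed length of one mission (e.g. one flight)
TARGET = 1e-9          # assumed acceptable probability of total loss per mission

# Probability that a single unit fails during the mission.
p_unit = FIT * MISSION_HOURS / 1e9

# With n independent redundant units, the function is lost only if all n fail.
n = 1
while p_unit ** n > TARGET:
    n += 1

print(f"per-unit failure probability: {p_unit:.2e}")
print(f"redundant units needed: {n}")
```

With these assumed numbers two units suffice; the point of the avionics approach is that the aircraft can afford to keep adding units until the math closes, which a cost-constrained car cannot.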
Kelf: Avionics does have lots of redundancy, which you can’t do in a car, but the other problem with the car is that the market is moving so quickly and they are trying to cram so many new things into it. Aeronautical guys take years to go through the necessary paperwork.
Landoll: That is true.
Kelf: This is what they are struggling with. There is a dichotomy: ‘We need to operate quickly, we need to get new capabilities out there, but we have to meet these standards.’ The requirements flow, the systematic flow in ISO 26262, takes a back seat to the random-fault flow, but actually it may be more important. We have looked at the V model and the automation of coverage flowing back into the requirements. You have individual requirements to which you have to flow the coverage back, so how can you do that and produce tools that automate that flow and make sure the requirements are covered?
Wiltgen: Random faults get all of the attention, but lifecycle management, basically chapter 8 of ISO 26262, is also important. There is a mountain of work. If you talk to anyone who has done a DO-254 project, the amount of coalescing of data and paperwork to satisfy a DER in an audit is huge. There are two beasts, though. DO-254 has government oversight. ISO 26262 is more about self-assessment. So does it go more along the lines of DO-254 in the long term, or does it stay as self-assessment? It comes down to the customer’s sense of confidence and whether they would feel okay in a court of law if something happens. Perhaps they will merge.
Kelf: That is a possibility. Who is the regulator?
Wiltgen: The standard does talk about lines of delineation across the division, but…
Kelf: But someone has to enforce it. People in the industry say we are regulating ourselves and then laugh. They do try to figure it out and do the right thing and put together the right models—the different parts of the V model and at which point we prove coverage. We can only do our best.
Bartley: They are going for ISO 26262 compliance.
Landoll: Who says they are the ones doing a good job?
Bartley: Don’t you have to get certification to be an assessor?
Landoll: I don’t think so, but I am not sure.
Dauman: In many applications, people are talking the talk, but there are two conversations going on that need to be joined. First there is safety, and then there is security. Security and safety people are really different people today, and they have to be joined because these systems are highly interconnected. If you have a security problem, then you have a functional safety problem. An insecure system can be altered or changed. The system can be controlled. There are two conversations happening, and they are quite far apart.
Landoll: In avionics several years ago at a conference, the FAA folks running DO-254 made an explicit point that none of the DO standards really has anything to do with security. They basically said that what we are trying to do is make sure the device is safe, and we are not taking into account nefarious actions. The nefariousness needs to be addressed by separation and isolation of systems. On a plane, everything operates in completely isolated systems, and there would be no possible way that you can hack in and get into the cockpit. The path just doesn’t exist.
Dauman: There was a Jeep hack where they took some control.
Bartley: That was lack of security on the CAN bus.
Landoll: That will be the key difference. In avionics, you have a huge airplane and you can physically separate things far enough that it is not feasible to hack the mechanisms the way the Jeep was hacked. But in a car, everything is so hyper-connected that the exposure will be huge.
Dauman: Part of it is also the design cycle. Even in automotive, if you go back 15 years, the design cycle for functional safety was long and rigorous. Now we have cars coming out with rapid changes. Tesla is a good example of a disruptor in the industry. I am not saying they are not doing testing, but clearly they are aligned on things such as field upgrades to fix problems. We have seen them roll out updates while you were sleeping. That is a concern, because it dynamically changes how we design and verify things that are truly safety-critical applications.
Kelf: With safety, we have narrowed the problem down to ISO 26262 fault injection. We know that path and that process. With security you have Trojans, you have protected keys, you have attackers checking power rails to see if they can find a signature. There is a huge range of things you have to verify and test against.
SE: Presumably, you have to consider the entire production chain.
Landoll: That means you have to future-proof yourself. If you look at side-channel attacks, those were not even available until a few years ago. So you can go back and look at an older piece of electronics, and now you can hack it. How do you protect what we are producing today against techniques that will be created three years from now?
Kelf: How do you cover the myriad of things? You can’t just say security. You have to define what aspects you are testing against. How do you do this for different kinds of attack?
Landoll: Many people have looked at software security. We are looking at how to lock down the hardware. If you can lock the hardware such that it is incapable of doing nefarious things, regardless of what is thrown at it, then that makes the whole system more secure. If you have holes, either through bugs that can be exploited, or holes that have been intentionally added, and can identify those and lock those down, then the whole system improves.
Kelf: Can you really do that? We work with SiFive and are looking at notions of a trusted execution zone. How can you verify against that? You have a trusted area, and you try to figure out whether people can break in. If they create a hypervisor, can they break into the software, the virtual platform on top of it? Initially we said, ‘Yes, we can do this. We can see different ways to get in there.’ When you start to look into the different ways that you can attack a processor using a set of peripherals, there are a huge number of things that you can do even before you start looking at the scan path and other areas. It is a really complex and tough problem.
Dauman: You can do it by design review. In some cases, this is the state of the art today. You have to start applying the technology that you have applied to functional verification and fault assurance. That has been developed over many years and it has matured and become very effective. You have to start applying those solutions for security and ultimately for security combined with safety.
Bartley: With security, you are dealing with the unknown unknowns. It is the things that we don’t know about yet that will attack us in the future. This is why people patch. Then there is the concern about patching cars: you need to do it, but the patch mechanism itself is a security risk. When you layer the safety concerns on top of that… A lot of companies have clauses that get them out of security liability, but on the safety side, with a car, you cannot have an escape clause in your warranty that says, ‘If there is an attack or problem that we didn’t know about, then we are immune to prosecution.’
Dauman: It is the patching that is critical. It used to be that the state machine was solely in the hardware, and I could test it, go from state to state, and if I got into an illegal state I could recover. Then I had a safe state machine, a safe controller. Today, that is software. You cannot just test the hardware anymore. You have to test the system, and all of the subsequent patches to that system that may occur. That is a lot more complex than anything you had before to ensure safety.
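A minimal sketch of the kind of safe state machine Dauman describes, where any transition not in the allowed table drives the controller into a recoverable safe state. The states and events here are invented purely for illustration.

```python
# Minimal sketch of a "safe" state machine: any (state, event) pair not in
# the allowed table is treated as an illegal transition and forces recovery
# into the safe state. States and events are invented for illustration.

SAFE = "SAFE"
ALLOWED = {
    ("IDLE", "start"): "ACTIVE",
    ("ACTIVE", "stop"): "IDLE",
    ("ACTIVE", "fault"): SAFE,
    (SAFE, "reset"): "IDLE",
}

class SafeController:
    def __init__(self):
        self.state = "IDLE"

    def step(self, event: str) -> str:
        # Unknown transitions fall through to the safe state.
        self.state = ALLOWED.get((self.state, event), SAFE)
        return self.state

ctrl = SafeController()
print(ctrl.step("start"))   # ACTIVE
print(ctrl.step("glitch"))  # SAFE  (illegal event forces recovery)
print(ctrl.step("reset"))   # IDLE
```

The point of Dauman's comment is that once this logic lives in patchable software, the exhaustive state-by-state testing that was tractable for a small hardware state machine has to be repeated for the system plus every subsequent patch.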
Ciplickas: Considering the whole production chain is important. While most people here are deep into the design of the systems, and making sure that the requirements are understood and that you are doing everything correctly, do you really know that the chips that you manufacture are what you thought were being manufactured? Is there something that could have been done, even something very subtle, such as a small parametric change in the manufacturing process, nefarious or otherwise, and how might that change something and put you in a zone that you never anticipated? You need to find some way to keep track of the whole genealogy of production in addition to asking if the design is right.
SE: Is that an argument for something like RISC-V? You can take control of the ISA, the implementation, and have a chance of tracking that all the way through to production and to make sure that the entire chain is secure.
Kelf: That is one of the value propositions for RISC-V. On the defense side, that is exactly what they are after. Also, if we look at ASIC versus FPGA, with an FPGA you can get into problems where the bitstream can be tweaked at the very last moment and Trojans can be brought in. There are not a lot of FPGAs in cars, but there are in avionics and defense.
Ciplickas: That is the cross of hardware and software.
Bartley: Why would you trust RISC-V more than Arm?
SE: Because you can see it. You control the source, you control what verification has been done, you control the flow.
Kelf: There are also questions about the trust and ownership. Defense circles ask these kinds of questions.
Landoll: You mentioned FPGAs. One important thing in automotive is that we are hearing more vendors are moving toward FPGAs, or at least thinking about it. Even though they have higher cost and other issues, they enable field-upgradeable parts.
Wiltgen: With a lot of artificial intelligence applications, when you start to put in FPGAs you can update algorithms dynamically. We have seen a lot of interest in that.
Landoll: So the question is, what kind of exposure does that raise? We have really good equivalence checking for FPGAs, but the question is whether what you are comparing against is actually what is going into the parts. There are still gaps. In avionics there is a huge concern that you have gone to all of this trouble to verify a part, and then you get to the manufacturing floor and they load the wrong rev of the FPGA.
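One low-tech guard against the wrong-rev problem Landoll mentions is to hash the bitstream on the manufacturing floor and compare it against the hash recorded for the verified release. A sketch, with hypothetical file names and a placeholder golden hash:

```python
# Sketch: check that the bitstream about to be loaded matches the golden
# hash recorded for the verified release. File name and hash are placeholders.
import hashlib
import sys

GOLDEN_SHA256 = "0" * 64  # placeholder; would come from the release record

def bitstream_hash(path: str) -> str:
    """Compute the SHA-256 digest of a bitstream file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

if bitstream_hash("design_rev_b.bit") != GOLDEN_SHA256:
    sys.exit("bitstream does not match the verified release -- do not load")
```

This catches the wrong-rev mistake; it does nothing against an adversary who can also alter the release record, which is where the trust and provenance questions raised earlier come in.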
Wiltgen: Fault injection takes an interesting turn when you talk about FPGAs and what the scope of your fault injection campaign looks like. It is still in its infancy, and so is how we approach it as an industry.
Landoll: One aspect of fault analysis is that it relies on human beings sitting in a smoke-filled room contemplating, ‘What if somebody turns on the blinker at the same time this is happening?’ That is a worry case. The set of worry cases becomes the input to the analysis spreadsheet, and then you look at those faults.
Kelf: Indeed. Raising the abstraction of the test vectors so that instead of thinking in a block-based way, you think about use cases…
Bartley: Hazards.
Kelf: Thinking of those and creating the test vectors, and then thinking about another one. These systems are so complex that it is almost impossible to come up with all of them. But if we can create a specification for the design, and then have a system that walks the spec and generates use cases, then you can end up with thousands of testcases that you could run on an emulator. Then you have a chance of finding weird combinations of things.
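A toy sketch of the spec-walking idea Kelf describes: enumerate the cross-product of scenario parameters from a small machine-readable "spec" to generate testcases, including combinations nobody thought to write by hand. The parameters below are invented for illustration.

```python
# Toy sketch of "walking the spec": cross all scenario parameters to
# generate testcases mechanically. Parameters are invented for illustration.
from itertools import product

SPEC = {
    "speed":   ["stopped", "city", "highway"],
    "blinker": ["off", "left", "right"],
    "braking": ["none", "light", "emergency"],
    "sensor":  ["ok", "degraded", "failed"],
}

testcases = [dict(zip(SPEC, combo)) for combo in product(*SPEC.values())]
print(len(testcases), "generated scenarios")  # 81 here; real specs yield thousands
for tc in testcases[:3]:
    print(tc)
```

Even this four-parameter toy yields 81 scenarios; a realistic spec explodes into the thousands of emulator-ready testcases Kelf mentions.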
Bartley: Hazard analysis comes before the spec. It is trying to find the things that can go wrong that aren’t written in the spec. Those then get built into safety cases, and safety functions to deal with them. I don’t think Portable Stimulus (PSS) will help, because you still have to have that step prior to the spec that says what can go wrong and what safety functions we need.
Landoll: That is why they are hazards.