Experts at the Table, part 2: How are we dealing with security threats, and what happens when those threats expand to a much wider network?
Semiconductor Engineering sat down to discuss industry attitudes towards safety and security with Dave Kelf, chief marketing officer for Breker Verification; Jacob Wiltgen, solutions architect for functional safety at Mentor, a Siemens Business; David Landoll, solutions architect for OneSpin Solutions; Dennis Ciplickas, vice president of characterization solutions at PDF Solutions; Andrew Dauman, vice president of engineering for Tortuga Logic; and Mike Bartley, CEO for T&VS. What follows are excerpts of that conversation. Part one of this discussion is here.
SE: We were discussing the generation of test cases for a requirement. Portable Stimulus (PSS) can be used to demonstrate the ability of a piece of equipment to do something. For safety and security, don’t you have to demonstrate that it can’t do something?
Kelf: Yes, you can bring that element in, as well. You specify everything it can do, and then you walk through the things that it can’t do.
Bartley: You need tools that can prove you can’t do stuff. That is what datapath analysis can tell you. You want to know that a piece of data cannot go from here to there.
Dauman: While the smoke-filled room is necessary, it is insufficient. I do not want to rely on a bunch of people conceiving of all of the ways that something can go wrong or be exploited. That is why we need formal methodologies that allow you to explore the space that you can’t conceive of. What you are basically doing is identifying the things that you care about and letting the analysis figure out how those could be accessed or attacked, without having to figure out the exploit or vulnerability yourself. When we are talking about a full SoC, or an SoC plus software, you can’t rely on humans to figure out the complexity. It’s way too large a space for people to be able to conceive of.
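Bartley’s “data cannot go from here to there” and Dauman’s asset-first framing both reduce to a reachability question, which can be sketched independently of any vendor’s tool: model the design as a signal-connectivity graph, mark the asset you care about, and let an exhaustive search report every path to an observable point. A minimal illustration in Python, with invented signal names standing in for a real netlist:

```python
from collections import deque

# Hypothetical connectivity extracted from a netlist: signal -> signals it drives.
# Real information-flow tools derive this from RTL or gates; these names are invented.
FLOWS = {
    "aes_key":    ["aes_core"],
    "aes_core":   ["cipher_out", "debug_bus"],  # debug tap nobody remembered
    "cipher_out": ["axi_master"],
    "debug_bus":  ["jtag_tdo"],
    "axi_master": [],
    "jtag_tdo":   [],
}

OBSERVABLE = {"axi_master", "jtag_tdo"}  # points an attacker could read

def leak_paths(asset):
    """Breadth-first search: every path from the asset to an observable point."""
    paths, queue = [], deque([[asset]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in OBSERVABLE:
            paths.append(path)
            continue
        for nxt in FLOWS.get(node, []):
            if nxt not in path:  # avoid cycles
                queue.append(path + [nxt])
    return paths

for p in leak_paths("aes_key"):
    print(" -> ".join(p))
# The requirement "key data cannot go from here to there" fails if any
# printed path ends anywhere other than the intended cipher output.
```

Here the path to the cipher output is intentional, while the path through the debug bus is exactly the kind of unintended flow a roomful of people can miss but an exhaustive search cannot.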
Kelf: The only way is to take a big emulator and throw in as many use cases as possible, bad or good.
Dauman: As soon as you do that and ship it, there are hundreds or thousands of guys out there thinking about it, too. Your team is never going to be as big as the rest of the world trying to exploit it.
Wiltgen: We hear demand for it. ‘Help me with FMEDA. Help me understand the failure modes of my design. Help me with early analysis and show me where my design is safe and where it isn’t. I have some safety mechanisms in place, but show me what I can expect to achieve in terms of diagnostic coverage and FIT rate for that particular design. Show me what I need to do to hit my safety goal when I run through this entire fault campaign.’ That early guidance and analysis is something we hear customers asking for.
Landoll: We have approached that problem with formal. We can go through and do an exhaustive analysis. We basically extract all of the possible faults and then analyze all of them. From a hardware perspective, it takes the guesswork out of the FMEDA. Instead of thinking about which test cases are possible, we just look at all of them.
Ciplickas: How can you know that you are looking at all of the possible faults? Doesn’t it depend on the fault model and the interactions?
Landoll: But when you are looking at hardware, there are a finite number of ways it can fail. Yes, it is a huge number, but it is finite. You can basically take the RTL, or the gate-level netlist, and extract all of the possible faults. You can analyze every one of them. From a software perspective, you can’t. But at least if you can get a handle on the hardware, it reduces the risk.
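The metrics Wiltgen and Landoll mention, diagnostic coverage and FIT rate, combine in a straightforward way once every fault has been enumerated and classified. A toy FMEDA-style calculation in Python, with an invented fault list, fault models, and FIT numbers; real flows use tool-derived classifications and far richer fault models:

```python
# A toy FMEDA-style calculation, assuming a tool has already classified each
# fault as detected by a safety mechanism or not. All values are invented.
from dataclasses import dataclass

@dataclass
class Fault:
    node: str
    model: str        # "stuck-at-0" or "stuck-at-1"
    detected: bool    # caught by a safety mechanism (e.g., ECC, lockstep)?

def enumerate_faults(nodes):
    """Exhaustive for these fault models: two faults per node, no sampling."""
    return [Fault(n, m, detected=False)
            for n in nodes for m in ("stuck-at-0", "stuck-at-1")]

faults = enumerate_faults(["alu_out", "reg_q", "fsm_state"])
# Pretend the tool proved these faults are caught by the safety mechanism:
for f in faults:
    f.detected = f.node != "fsm_state"  # unprotected state machine

raw_fit = 50.0                                      # failures per 1e9 hours
dc = sum(f.detected for f in faults) / len(faults)  # diagnostic coverage
residual_fit = raw_fit * (1 - dc)                   # undetected failure rate

print(f"faults={len(faults)} DC={dc:.0%} residual FIT={residual_fit:.1f}")
# faults=6 DC=67% residual FIT=16.7
```

The point of the exhaustive extraction is that the denominator is the complete fault list for the chosen fault models, not a sampled or hand-picked subset.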
SE: Consider the IoT space, where you have millions of devices connected through a network into the cloud. Does everyone care about security there? If even a few devices are insecure, the entire network is vulnerable. When we extend that to autonomous cars, they will soon become part of the network. Haven’t we now got a huge authentication problem? You have to know who you are talking to and whether you can trust them.
Wiltgen: We are talking here about V2X—connectivity to infrastructure, to vehicles, to the cloud over 5G, and if you can trust the information you are receiving. Has the device you are talking to gone through the necessary safety workflows? If you are talking to a stop light, does it need to adhere to some functional safety standard to ensure that you can trust the data you are receiving? Same with talking to another vehicle. The problem expands and blows up fairly quickly.
Dauman: There are two levels here. First, there is the inherent design of that system, and then there is the end-user configuration. I have a phone app that controls my car. That phone is connected to my car and my home, and that is all connected. Everything I own is connected, so any weakness in that whole system makes me vulnerable.
Wiltgen: From a security standpoint, you potentially have people modifying their vehicles. Now, does that pose a safety risk to another vehicle that it is communicating with?
Kelf: We must have a system of firewalls. That is the only way to deal with it, so that you are protected against certain activity coming in from everywhere.
Dauman: We have to think of this as an infinitely expanding, complex system. It is not going to slow down. Everything will continue to get more interconnected. This is a silicon problem, where you have to ensure that the root of trust is secure, but everything connected to it also has to be secure, both for safety and for security. I don’t think the design infrastructure or the verification infrastructure is there to help you think about it. So firewalls are necessary but insufficient.
Wiltgen: You will start to see systems come online where you can do digital twin modeling of systems, and systems of systems, to help close that gap. But that technology is still in its infancy. We can model real incoming stimuli. We can model the electro-mechanical actuation part of sensor/compute/actuate systems, and even go larger than that.
SE: Is this similar to a medical problem, where a pandemic emerges if enough of the population isn’t inoculated? Then you can have catastrophic failure, and the problems spread.
Ciplickas: Going from the smoky room to the plague.
Landoll: How do we address this unless we can go back and, to some extent, scrap the biggest problem, which is the Ethernet backbone that drives all of this? Flexibility was built into that system on purpose, to allow configurability at the base levels. If we do not go back and address that underlying vulnerability and replace it with something that has some meat to it, I don’t see how we address these problems. Are we willing to pay that price? Think about every Internet-connected device that would have to change to connect to this new system.
Kelf: The tradeoff is the huge risk. Perhaps it takes a couple of disasters to make people start to think about it and to drive this type of change. It could happen. How do you really rebuild a whole secure system that deals with those kinds of risks?
Ciplickas: Do you have to rebuild, or can you have things next to it? Two-factor authentication is one way to make sure that if you break through the firewall, you still have to prove that you really are who you said you were. Possibly multi-factor authentication in parallel with Ethernet. If something claims to be a stop light, do you really know it is a stop light? How does it authenticate itself?
Bartley: Key exchange mechanisms can help with authentication.
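Neither panelist spells out a protocol, but the stop-light question maps onto a standard challenge-response scheme: the car issues a fresh nonce, and the stop light proves possession of a private key whose public half the car already trusts. A minimal sketch, assuming Ed25519 signatures from the Python cryptography package and pre-provisioned trust; in practice this would run through certificates and a V2X PKI:

```python
# Minimal challenge-response authentication sketch. The trust model and all
# names are simplified for illustration; real deployments use certificates.
import os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Provisioning: the stop light holds a private key; the car is given the
# matching public key through some trusted channel.
stoplight_key = Ed25519PrivateKey.generate()
trusted_pubkey = stoplight_key.public_key()

# Car side: issue a fresh random challenge so old responses can't be replayed.
challenge = os.urandom(32)

# Stop-light side: prove possession of the private key by signing the challenge.
response = stoplight_key.sign(challenge)

# Car side: accept the "stop" message only if the signature verifies.
try:
    trusted_pubkey.verify(response, challenge)
    print("authenticated: it really is the stop light")
except InvalidSignature:
    print("reject: unauthenticated device")
```

The fresh random challenge is what blocks replay: recording yesterday’s signed response does an attacker no good.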
Dauman: Maybe it can’t happen incrementally. It will happen when someone chooses to differentiate on safety and security, rather than on performance or interconnectivity. That can happen at a subsystem level, and in your model that works perfectly. I can build a vehicle that has these two infrastructures to increase my security. But that hasn’t happened yet. Maybe it will. How much more performance can we get out of a CPU, and does anyone actually care anymore?
Ciplickas: It may need some catalyst.
Bartley: Right. Most companies do not differentiate on safety and security. They see that as a hygiene activity.
SE: Some companies have had the wakeup call. Think about the processor companies that traded away security for performance. They may not have understood the tradeoff at the time, but they do now.
Bartley: Going back further in time, the Pentium bug forced them to do a lot more verification. They invested a lot of money after that, but they still didn’t sell it as a formally verified device. It was still a hygiene activity. It forced them to take it a lot more seriously. The cost of failure can be huge. If that goes up high enough, then change happens.
Landoll: From a philosophical perspective, if you look at North America, there are roughly 30,000 deaths a year as a result of driver error. One would think that if everyone had an autonomous car and there were 29,999 deaths a year as a result of autonomous cars, we would have improved things. But in reality, the public tolerance for even a fender bender seems to be very low. From a risk perspective, if a company looks at this rationally and follows the money, it might be willing to tolerate a certain number of problems because it is better than what existed before.
Kelf: But think about the lawsuits.
Landoll: That is from the perspective of the car company, not the insurance company.
Kelf: The problem is, if it is a driverless car, the person you will sue is the car manufacturer. They have lots of money, versus the poor driver who potentially loses their house.
SE: That is just a legal framework issue.
Bartley: There need to be enough legal test cases first. How many payoffs do they need to make? At some point a test case will actually go through, and it will be decided who has to pay. So there are 30,000 deaths today, and there would be a reduction with autonomous cars, but there is a risk that it goes to a million if there is an attack on the infrastructure. That is something we cannot analyze. If a person is at fault, that is fine, but if a machine is at fault, public tolerance is a lot lower.
Dauman: If I can connect a few dots, we talked about processor problems, and when that vulnerability was announced a year ago, it was the most publicized security vulnerability in silicon. Since then, there have been a handful more variants on it. We know these problems exist, and perhaps that is an inflection point for some companies. What could the business impact be? Could they move to a different processor that doesn’t have these problems? Or maybe they have to be able to look inside and convince themselves that it is safe.
SE: Is that likely to change anyone’s thinking? Will people consider RISC-V as being a lower risk solution because the vulnerabilities are exposed?
Bartley: While you may not be able to see inside Arm’s TrustZone, they have been doing it for a long time and they have a track record. Now, if a car has a problem that was caused by an insecure processor, then it gets even more interesting.
Wiltgen: That is part of the supply chain.
Bartley: Yes, in automotive, the supply chain has tier after tier after tier.
Landoll: This has to be part of the legal framework that has not been fully established yet.
Wiltgen: Several avionics companies say the reason they adhere to DO-254 is that they can hold it up as a shield against lawsuits. Did I do everything in my power to design, verify and test this to the full extent of the available technology? Yes.
Kelf: As per the standard.
Bartley: ISO 26262 will be the same. It is not a legal requirement, but it shields them.
Ciplickas: That is optimistic. You are saying that we get ahead of the terrible event before it happens. The conversation about RISC-V is that it lets you see inside. Maybe Arm comes up with a way to overcome that threat to their business, but the fact that people are talking about it, knowing the supply chain and questioning whether it was manufactured the right way, matters. If you offer that as something for people to think about, they will start to change the design of their systems to include it. I like the optimistic point of view.
Wiltgen: It will be interesting to see what the sharing mentality will look like when you have all of these OEMs and tier 2s and 3s collecting all of this data, perhaps accident data. What will the sharing of that data be like? Will the industry be altruistic and make the world a safer place, or will we compartmentalize? Safety has been considered a value add for many years. It scares me.
Kelf: Safety is more of a hygiene issue, so will it remain a hygiene issue? Or will there be a Volvo that says our chips are tested to ‘this’ level?
Bartley: They do sell a little on their safety record. When consumers start to care about it a little more, that is when people start to sell that as a feature. Do they care at the moment? Do they care if their latest car is secure and safe?
Landoll: Yes, but are they technically qualified to make that decision? They are vulnerable to marketing.
Kelf: And the media.
Landoll: Many years ago, there was a school shooting that involved kindergarteners, and immediately friends were asking if they should allow their children to go to school. I did some risk analysis and looked at how many people die in shootings versus how many people die in car accidents. If you are worried about your child, strap them into their car seat, because the drive to school is far more likely to cause their death. The media tends to hype whatever the problem is because it makes for great news, and that leaves the public vulnerable to the messaging. This will cause trouble. A single event will make a splash and cause everyone to worry, when in reality that is not the real risk. We as an industry have to be more cognizant of this. We have to provide the voice of sanity, determine what the real risks are, and address those. At the same time, automotive companies will have to market to what people are sensitive toward.
Kelf: It is newsworthy. A Tesla wrapped around a lamppost is newsworthy.
Bartley: More people are kicked to death by donkeys than die in airplanes.
Kelf: As an industry, how do we counteract that? Showing that autonomous cars bring the death rate down from 30,000 to 10,000 is not an acceptable argument to the public. There is no argument against the emotional reaction, and that is the fundamental problem. The only argument we have is the shield and the technology we have behind us to protect people. We are doing everything we can to protect against accidents. What we should say is, ‘Look, we have decreased the number of deaths by 30%. That is good, and we are sorry that this one still happened.’
Ciplickas: Not good for the other 70%.