Experts at the Table: Waiting for secure designs everywhere isn’t a viable strategy, so security experts are starting to utilize different approaches to identify attacks and limit the damage.
Semiconductor Engineering sat down to discuss a wide range of hardware security issues and possible solutions with Norman Chang, chief technologist for the Semiconductor Business Unit at ANSYS; Helena Handschuh, fellow at Rambus, and Mike Borza, principal security technologist at Synopsys. What follows are excerpts of that conversation.
(L-R) Norman Chang, Helena Handschuh, Mike Borza. Photo: Paul Cohen/ESD Alliance
SE: Until a couple of years ago, no one really thought hardware was a viable attack surface, aside from governments. That changed with Mirai, Stuxnet, Foreshadow, Meltdown and Spectre. What’s the best way to stop these attacks?
Borza: That’s an interesting range of attacks, because it covers a lot of different things that were broken in order to get there. The micro-architectural attacks like Spectre, Meltdown and Foreshadow have to be solved right in the micro-architecture of the hardware. It’s really about the processors executing things in a very consistent way, and we’re even starting to see side channels through very fundamental things like the structure and nature of the caches. Those are generating some side channels that we haven’t seen before. But because this is such a new area, we can expect to see a lot of those attacks go on for quite a while. It will take the next several years for the academic research community to get through all of them. Putting an end to them involves very careful design, and design for security, all the way through the flow. And then at each stage of design you need to verify that you’ve addressed all of the threats that are in your threat model. Without that, there’s really not a hope of resolving things that are very much hardware problems.
Handschuh: Side channel attacks have been around for a long time, but the way they are approached now is based on micro-architectural issues. This is somewhat new. Where they come from is speculative execution and things like that, where you try to gain in performance by looking ahead and trying to anticipate what you’re going to do next. So maybe there’s a way for us to consider having a separate small security processor on the side that will take care of your security operations for things that really matter security-wise—those things that are really sensitive—and leave the rest of the system that needs high performance to deal with less-sensitive issues.
Chang: There are various forms of side-channel attacks, and various ways to find them. You can look at substrate noise, voltage noise, and thermal changes. You can even look at the power profile of each instance at different levels, from system-level design and RTL all the way through the layout. This is one of the major concerns, and we should be able to simulate it rather than wait until the sign-off stage, when you see the chip coming back. And through measurement, you can see how effective the security design is.
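The power-signature leakage Chang describes is the basis of correlation power analysis (CPA). The toy sketch below simulates it end to end: a simulated "device" leaks the Hamming weight of an S-box lookup plus noise, and the attacker recovers the key by correlating hypothetical leakage against the measured traces. The S-box, noise model, and all parameters here are illustrative assumptions, not any vendor's tool or flow.

```python
# Toy correlation power analysis (CPA) sketch -- simulation only.
import random

rng = random.Random(0)
SBOX = list(range(256))
rng.shuffle(SBOX)                     # toy substitution table (illustrative)

def hamming_weight(x):
    return bin(x).count("1")

def leak(pt, key, noise=0.5):
    """One simulated power sample for processing (pt XOR key)."""
    return hamming_weight(SBOX[pt ^ key]) + rng.gauss(0, noise)

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def recover_key(plaintexts, traces):
    """Return the key guess whose predicted leakage best matches the traces."""
    best_guess, best_corr = 0, -1.0
    for guess in range(256):
        predicted = [hamming_weight(SBOX[pt ^ guess]) for pt in plaintexts]
        c = abs(correlation(predicted, traces))
        if c > best_corr:
            best_guess, best_corr = guess, c
    return best_guess

SECRET = 0x3C
pts = [rng.randrange(256) for _ in range(1000)]
traces = [leak(pt, SECRET) for pt in pts]
print(hex(recover_key(pts, traces)))  # with these parameters, recovers the key
```

The same statistical machinery, run by the defender against pre-silicon power simulations, is what lets a design team measure how much a countermeasure actually reduces leakage before tape-out.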
SE: You also need to take a look at this over time, right, because there may be sleeper Trojans. Some of these may not wake up for several years, so you have to monitor activity throughout the lifetime of the product.
Handschuh: Hardware Trojans have always been a difficult issue to deal with. It’s very difficult to come up with a ‘golden reference’ to which you can compare things to make sure you don’t have these Trojans in your hardware. That’s very complicated, because there’s just no way of getting these great references. There are some ideas around how to try to deal with this, but it’s difficult without analyzing the design itself and almost taking it apart to make sure there’s nothing in there. There’s no proven way of doing this today. This is definitely a big, open research topic.
Borza: There’s an interesting range of solutions that people are proposing. That includes reverse engineering the chip—essentially doing a full layer-by-layer tear-down of the chip with an optical inspection of the layers, reconstructing the transistor-level model, and from that the gate-level model. And then you have to compare that to what was forward-generated through the synthesis process into the back-end design. But when you start talking about doing those kinds of things, it has incredible implications for both the cost and the complexity of the analysis, and time-to-market if you’re going to complete that analysis before you actually ship product. It’s a very long process to do a full reconstruction of a chip by reverse engineering it from the ground up on a large SoC, or even on a small SoC. The chip reverse-engineering companies can do just about anything, but you’re going to pay through the nose to have a big piece of the chip done, which is why people use those companies to look at a very small subset. If they want to look at a particular interface, logic, or even the physical design of the I/Os and things like that, to look for patent violations, that’s a very different task than reconstructing a big SoC from the ground up. And that’s just one approach. Some of the other approaches are based on monitoring, looking for variations in power analysis to indicate activity that wasn’t designed into a part, which suggests there’s something there that you don’t know about. That’s one approach that people are taking, and it uses a lot of the techniques from side-channel analysis as a kind of detection technology.
Chang: I just want to mention a story I heard last year about a virus in a Windows system. After 15 or 20 years, it finally got into a nuclear power plant control system. So that’s one example of a very long-term malicious attack.
SE: And sometimes these are not even intentional attacks, right? Sometimes it’s weaknesses in the design that show up years later and create an opening for breaches. This is what happened with branch prediction and speculative execution, which seemed like good ideas at the time they were implemented. How do we solve these kinds of issues?
Borza: Power, performance and area are dogma in every IC company. Minimizing logic is exactly what the optimizer phase of the synthesizer does. The only way you’re going to get that out is to educate the designers to get them to consider security as a first-class design parameter.
Handschuh: Security by design is exactly right. You need to build it in from day one, from scratch, and have it in mind while you design. Otherwise, you’re never going to get rid of all of these issues. But there’s another approach that we could take, which is to start small. So yes, high performance is great. That’s what everybody wants to achieve. But security-wise, we need to build up from very tiny pieces that have fixed boundaries and that we can control. Then we can look at and analyze everything that’s in there, so we know that part of the system is secure. And then you bring in more pieces of the system and try to secure what’s around it. If you try to secure the entire design from the beginning, and it’s a huge design, that’s very hard to do. But you can start with small layers that build on top of each other, starting with something that you have reasonable confidence in and that you maybe even can formally verify, and then continue building from there.
Chang: For the big companies, security is probably very well designed into their products. But there are billions of IoT devices out there, and we don’t know how the security was designed for those devices. IoT devices are everywhere, and when they are connected to a network, that can be a weak point for breaching the system. If we look at electronic design services, including the hardware design, we don’t have an ecosystem that is foolproof. We are just starting to provide security solutions. And we need to work with customers and companies like Synopsys and Cadence and Rambus to provide a full ecosystem solution. Security is just like any other function that we need to work on.
SE: The edge supposedly is going to be this enormous new space, but we have no idea what it’s going to look like or even how to define it. How do we build in security up front in this case?
Borza: It’s a great idea to incorporate small secure elements right into the devices at the outset that are used to ensure that the software, which is where the system ultimately gets all the personality or behaviors from, is as secure as possible and can be updated. That’s fine to say, but in an IoT world where there are 50 billion devices, 45 billion are going to be the smallest, cheapest things that are made. The designers of those chips are going to tell you, ‘This is extraordinarily cost-sensitive,’ because if they’re selling a part for under $2, they’re going to get beaten out of the market if they add another 10 cents worth of stuff to make it secure. People will go elsewhere to buy something that’s not secure, as long as it’s 10 cents cheaper. So we’re going to need to design the network to be robust against the kinds of attacks that are going to occur, because it’s unreasonable to expect that 45 billion of the 50 billion devices that are the smallest, cheapest things that are the easiest to attack are going to be made securely. This is the hard reality of where we’re headed.
Handschuh: Yes, we need to be realistic, because not every device out there is going to have a perfect security solution in it. There are two angles to this. One is start building from small pieces of the system that are secure enough to try to help. The other angle is to add some software that will help you detect abnormal activity, like intrusion detection. In the automotive sector, it’s very common to try to add that now. You have to do both things—build in security if you can, as much as you can, but then also have something available that will detect anything that looks weird and that shouldn’t be there. You want to try to analyze how the system is behaving normally, and then if you detect something different you need to decide what is safe to do because not every system will allow you to just shut down. You try to build in security up front, but you also need to be able to detect afterwards in the field what’s going on and try to react to that correctly.
Chang: I totally agree that we cannot count on the billions of IoT designers to come out with very secure designs. So this new trend is to do more with chip monitoring, like watching the traffic in and out of the CPU and memory. This is a new area, but we need to develop solutions where we are not counting on the designer to come out with a perfect design. It’s looking at a system from a different angle to attack this problem.
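The monitoring Chang describes, in its simplest form, means learning a baseline of normal device activity and flagging sharp deviations. The sketch below illustrates the idea on per-interval traffic counts; the sample data, the z-score rule, and the threshold are illustrative assumptions, not a shipping product.

```python
# Minimal traffic-anomaly sketch: learn a baseline of normal per-interval
# packet counts, then flag counts that deviate sharply from it.
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize normal traffic as (mean, standard deviation)."""
    return mean(samples), stdev(samples)

def is_anomalous(count, baseline, z_threshold=4.0):
    """Flag a count whose z-score exceeds the threshold."""
    mu, sigma = baseline
    return abs(count - mu) > z_threshold * sigma

# Counts observed during a learning period in the field (illustrative data).
normal = [102, 98, 110, 95, 105, 99, 101, 97, 104, 100]
base = build_baseline(normal)

print(is_anomalous(103, base))  # ordinary traffic -> False
print(is_anomalous(900, base))  # sudden burst, e.g. exfiltration -> True
```

Real deployments use richer features (destinations, timing, instruction mix) and adaptive models, but the principle is the same: the detector, not the device designer, carries the burden of catching a bad design.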
SE: So you’ve flipped around the security paradigm here, recognizing that some bad actors eventually will get in. Now you want to be able to shut down systems, reboot securely, and get everything back into operation as fast as we can, but maybe with some redundancy in critical cases where another system can take over. But you also have updates to data and algorithms, and you don’t always know how those will behave over time. How do you adapt the security for that?
Chang: If you have a system that assists the driver in watching the road, the ADAS needs to be fault-tolerant during an update. If the ADAS system shuts down during a power outage while you’re driving a car, that would be a disaster. So you need fault tolerance for mission-critical systems, and security attacks need to be considered in a mission-critical design.
Handschuh: I would look at it from a different perspective. I would try to build a system in layers in such a way that each layer only has certain privileges. The higher up you go toward the user, the fewer privileges you actually have. So the hardware will keep all the secrets, all the sensitive information, all the data that you don’t want anybody else to see in that spot, and then the layers of software you build on top all get different rights to do things. Even if you use software and you don’t exactly know what is going to happen, because maybe it didn’t get checked properly, it can only do so much harm if you build the system correctly. It will be able to expose certain things, maybe just what that layer of software has a right to see. If it’s built correctly, the damage is contained, and it can’t reach into the lower layers where the sensitive data is being housed and managed. Maybe you have a right to manipulate things, but you can’t actually take them out and expose them.
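Handschuh's layered-privilege model can be sketched in a few lines: the lowest layer holds the key material and exposes only operations on it, while upper layers receive results but can never read the keys out. The class names and the HMAC use case below are illustrative assumptions, not a description of any vendor's design.

```python
# Sketch of layered privileges: keys live in the lowest layer; upper
# layers can request operations, never the key bytes themselves.
import hashlib
import hmac

class HardwareKeyStore:
    """Lowest layer: holds key material and exposes only operations on it."""
    def __init__(self):
        self.__keys = {}  # name-mangled; not part of the public surface

    def install_key(self, key_id, key_bytes):
        self.__keys[key_id] = key_bytes

    def sign(self, key_id, message):
        # The key is used internally; only the MAC leaves this layer.
        return hmac.new(self.__keys[key_id], message, hashlib.sha256).hexdigest()

class Application:
    """Upper layer: may request signatures, but holds no key material."""
    def __init__(self, store):
        self.store = store

    def authenticate(self, message):
        return self.store.sign("device-key", message)

store = HardwareKeyStore()
store.install_key("device-key", b"\x00" * 32)  # illustrative key
app = Application(store)
tag = app.authenticate(b"sensor reading: 42")
print(len(tag))  # the app sees the 64-hex-char MAC, never the key
```

A compromised `Application` in this sketch can misuse the signing right it was granted, but it cannot exfiltrate the key itself, which is exactly the containment property the layered approach is after.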
Borza: You’re starting to see AI being applied on the defense side, at the network level, and looking at behaviors of devices interacting with other devices around them. That is a very promising area because it allows you to build adaptive behaviors for the detection of various kinds of malware and things that are going to be flipping around the system. That gives us some hope that we have adaptable defense strategies that are examining what the behaviors are between the systems and taking a kind of higher-level view to understand what the behavior is relative to what it should be.