There is no single solution, and the most comprehensive security may be too expensive.
Experts at the Table: Semiconductor Engineering sat down to talk about how to verify that a semiconductor design will be secure, with Mike Borza, Synopsys scientist; John Hallman, product manager for trust and security at Siemens EDA; Pete Hardee, group director for product management at Cadence; Paul Karazuba, vice president of marketing at Expedera; and Dave Kelf, CEO of Breker Verification. What follows are excerpts of that discussion. View part two here.
SE: How much security can be done up front during the design phase? Can we verify that a device will be secure?
Kelf: Security is all about figuring out the state space around a vulnerability, and then trying to verify that. The verification is negative verification. When you do functional verification, you verify whether something is functioning correctly. In this case, you look at a secure element inside a chip — Arm TrustZone and RISC-V PMP (Physical Memory Protection) are good examples — and make sure it can’t be accessed through some peculiar state combinations. We try to verify around that to ensure there are no other vulnerabilities.
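To make the negative-verification idea concrete, here is a minimal Python sketch. The access model, addresses, and permissions below are invented for illustration; real RISC-V PMP semantics (NAPOT encoding, entry priority, lock bits) are considerably richer. The point is the methodology: sweep the configuration state space and flag any combination that reaches the protected asset.

```python
# Toy negative verification of a PMP-style access check. Everything here
# (addresses, permission model) is invented for illustration; real RISC-V
# PMP semantics are richer (NAPOT encoding, entry priority, lock bits).
from itertools import product

SECURE_BASE, SECURE_TOP = 0x8000, 0x9000   # hypothetical protected asset

def access_allowed(mode, region_base, region_top, perms, addr):
    """Toy rule: machine mode sees everything; user mode sees only
    addresses inside a configured region that grants read permission."""
    if mode == "M":
        return True
    return region_base <= addr < region_top and "r" in perms

# Negative check: sweep the configuration state space and record every
# combination that lets user mode reach the secure range. Each hit is a
# vulnerability the design (or its configuration rules) must close.
violations = []
for base, top, perms in product((0x0000, 0x4000, 0x8000),
                                (0x4000, 0x8000, 0xC000),
                                ("", "r", "rw")):
    for addr in range(SECURE_BASE, SECURE_TOP, 0x400):
        if access_allowed("U", base, top, perms, addr):
            violations.append((hex(base), hex(top), perms, hex(addr)))

print(f"{len(violations)} user-mode paths reach the secure region")
```

A formal tool performs the same sweep symbolically and exhaustively rather than by brute-force enumeration, which is what makes the approach tractable on real state spaces.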
Hallman: The state space is clearly a good area for focus. But can you reach all of it? Probably not. You have to pull in your constraints. There’s also a question about visibility into that state space. Are you able to get to that lowest level of design? And are you delivering a black box at some point where you can’t have that visibility inside? We’re trying to push as much of that visibility forward as possible, whether it involves properties or other descriptions of that IP. How can we verify that as early as possible? Can we get the IP developers to do that verification? Can we get as much of that state space as possible closed and understood? But on top of that, we also need to pass information beyond the IP developer. How can you pass verification information that you might need for security on to the next level, where you can do analysis at an integration level, and possibly at a later implementation stage? I don’t think we’ll ever get away from needing to check security at different points throughout a device’s lifecycle, whether it’s being developed, or even at an operational stage. That’s where the digitization and digital twin efforts will help. They provide a way to do verification early and throughout the lifecycle of a part.
Hardee: As far as negative testing, a PSS-based tool can help you get above the limitations of state space and write some meaningful test programs, which help to verify that certain security vulnerabilities don’t exist in your system and can’t be exploited by software or non-secure processes. Key to being able to develop some of those tests in PSS was indeed extending PSS to cope with negative testing. We’re seeing exactly the same thing in terms of the state space issue, and we’re seeing a lot of formal being used. We have so many people implementing processors with multiple-issue and out-of-order execution, and a lot of that involves extremely comprehensive tests so they’re not vulnerable to the known side-channel attacks that occur with processing architectures, such as Spectre, Meltdown, and PACMAN. Formal can be used at the sub-system level, too. We have multiple examples where Arm and Intel will talk to the Jasper User Group about how they’re using formal to verify load/store units and those kinds of things. And at the system level, things like digital twins remain important. But when we talk with customers about realistic expectations, there are all kinds of side-channel attacks that folks are interested in. Some of those can be tested for and eliminated at the architectural and RTL stage, while others cannot. Some of these involve how an attacker might interact with your power system, or how crosstalk may leak secure data, whether on power signals, clock signals, or any other signals. Those are nearly impossible to verify unless you have an exact physical representation of the system. The models you need to simulate some of that get so detailed that even a digital twin cannot cope with it. There is no replacement for the physical attack-lab testing that needs to come later. So you need to set the expectations for what can be done at the RTL design stage, at the architectural stage, and what really has to be done later. Having people understand those different categories of side-channel attacks is key to success.
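A randomized negative-test loop can sketch what PSS-style scenario generation is driving at. To be clear, real PSS is a declarative standard, not Python, and the model and names below are hypothetical stand-ins for a device under test.

```python
# Illustrative only: a randomized negative-test loop in the spirit of
# PSS-style scenario generation. Real PSS is a declarative standard, not
# Python, and dut_access() is a hypothetical stand-in for the design.
import random

PROTECTED = range(0x8000, 0x9000)   # hypothetical secure address window

def dut_access(mode, addr):
    """Stand-in for the device under test: only machine mode ("M") may
    touch the protected window."""
    return mode == "M" or addr not in PROTECTED

random.seed(0)
for _ in range(10_000):
    # Generate scenarios that must all be rejected: a non-secure mode
    # issuing an access inside the protected window.
    mode = random.choice(["U", "S"])
    addr = random.randrange(PROTECTED.start, PROTECTED.stop)
    # Negative test: any granted access is a finding, not a pass.
    assert not dut_access(mode, addr), f"leak: {mode} read {hex(addr)}"

print("10,000 negative scenarios ran; no illegal access was granted")
```

The inversion is the essential feature of negative testing: the suite passes only when every generated illegal scenario is refused, so a single grant is a security finding rather than a functional success.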
Borza: A lot of what we’re doing in verification is similar to an ISO network stack, where you have to verify at each level corresponding to the design abstraction level at which you’re working. At RTL, you have a certain level of verification. It’s possible to do some analysis and simulation on the physical design after place-and-route, but that verification gets much more elaborate and takes much more computation during those phases. What you end up doing is a lot of post-silicon testing to verify your actual design achieved the objectives you had for it. Processor side channels are like new kinds of timing attacks. They tend to be time-based, as opposed to power- or energy-based, but there are some power side channels, as well, that people need to be concerned about. And that doesn’t even get into some of the physical attacks, like photon emission. So there are many layers you need to be concerned about, and you have more and more sophisticated adversaries the closer you get to the physics. But they’re there, and people are worried about them. It really makes a lot of verification work essential to ensuring that you’ve achieved your security objectives.
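The timing side channels Borza mentions can be illustrated in a few lines. The sketch below is a toy: it counts loop iterations as a proxy for execution time, and the secret bytes are made up. An early-exit comparison leaks how many leading bytes of a guess are correct; a constant-time comparison does the same work regardless of the secret.

```python
# A minimal sketch of why early-exit comparisons leak timing: the loop
# count (a proxy for time) depends on how many leading bytes match.
# Names and values here are illustrative, not from any real device.

SECRET = b"\x4a\x17\xc3\x9e"

def leaky_compare(guess, secret=SECRET):
    """Returns (match, work): the early exit makes 'work' reveal the
    length of the matching prefix."""
    work = 0
    for g, s in zip(guess, secret):
        work += 1
        if g != s:
            return False, work
    return True, work

def constant_time_compare(guess, secret=SECRET):
    """Examines every byte regardless of mismatches, so the work done
    is independent of the secret's value."""
    diff, work = 0, 0
    for g, s in zip(guess, secret):
        work += 1
        diff |= g ^ s
    return diff == 0, work

# An attacker guessing byte-by-byte watches 'work' grow as each leading
# byte is found; the constant-time version gives nothing away.
print(leaky_compare(b"\x4a\x17\x00\x00"))          # (False, 3): 2 bytes right
print(constant_time_compare(b"\x4a\x17\x00\x00"))  # (False, 4): always 4
```

This is the software-visible end of the problem; the microarchitectural variants (cache timing, speculative execution) follow the same principle with the processor, rather than the code, creating the data-dependent timing.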
Karazuba: One aspect that needs to be looked at is the economics of security. You can put a group of three dozen engineers in a room and say, ‘Develop the most secure chip possible. Add resistance against side-channel attacks, put in a root of trust, deploy the most advanced crypto ciphers you possibly can. Design a production system where we have verification of a device in the fab, and where we’re inserting keys throughout the entire process so we can ping it anywhere in the field and guarantee that it’s running correct firmware.’ You can do all of that, but it’s extremely expensive. And that may not fit the cost profile of what you’re looking for in a particular chip. Is it possible to develop a truly secure device? Absolutely. Is it economically feasible to do that? Yes and no. It depends on the risk profile of where you’re deploying a device and what kind of secrets it will hold. What’s the risk of your company being hacked? What’s the PR risk of your company being called out on the front page of The New York Times as the company that let out state secrets or company secrets?
SE: A lot of designs we’re looking at today are more customized and produced in smaller volumes. We have new process nodes, mixed-node designs, chiplets, and new bonding methods that potentially are not as perfect as everyone hopes. How much of this can be secured up front?
Borza: It’s essential to do analysis every step of the way, and the reason is fairly simple. The earlier you can catch an issue and fix it, the less expensive it is because you spent less time and effort elaborating it into a final design. So there is that balance of the cost of verifying things early versus waiting until later when you can go look for them. But in general, it’s less expensive to fix problems early on. And then, at each level, you need to make sure that if you fix something, you don’t reintroduce it at the next level down. That’s one of the challenges. You often break something you think you fixed with a change or a decision that’s made later in the design process — or even during implementation at the fab.
Kelf: As we’re designing more complex and more customized systems, we can build more into the designs for closing vulnerabilities. As we look at some of these new applications — automotive is the obvious one, but also medical and others like that — are there specific vulnerabilities associated with those applications, for those custom devices, that we can target with some clever design work, even before we verify? For example, you want to make sure someone’s pacemaker cannot be controlled through some satellite link. How can we design something at the architectural level that will avoid some kind of strange effect being used on the power rail later in the process? We’re seeing much more of that now, and it’s really going to become more critical for some applications.
SE: To some extent it also depends on what people are looking to steal, right? So just being able to take over the hardware is one thing. But data leakage, where you can collect small amounts of very important data, is important too.
Karazuba: The value of what someone can get, whether it’s data or physical access to something, highly depends on the level of security you’re going to apply to it. When you talk about life or health, using the pacemaker example, obviously that’s going to be extremely important to the owner of the pacemaker. Some of the best security in the world from a deployment perspective is around stock markets and the billions or trillions of dollars trading through them. A smart light switch has a lot less monetary value, but in the wrong hands it could create a real issue for a high-rise building. Ford applied for a patent on a self-repossessing car. If I really wanted that Mustang GT and could hack their system, it could drive itself to my garage.
Hallman: One of the goals is to identify these common areas of security and to design in those pieces early on. That raises the bar even for an IoT device that may not really need all the security. But if that security is baked in earlier, you’ve raised the bar for the entire industry by putting in some of those common elements upfront. So that’s where designed-in security really does have benefits. But we need to keep checking throughout each of these processes that security is not compromised anywhere along that chain of development.
Borza: Going back to that example of the IoT light switch, if you can use that IoT platform as a foothold into the house, you have access to much more value than a light bulb. The switch manufacturer doesn’t see that it has a big role to play in this, because it doesn’t make much money on the switch, and the switch itself isn’t protecting much of interest. But the problem is the things inside your house. So now we’re talking about more architectural solutions to that problem, like putting the IoT devices onto a separate network that’s physically or logically isolated and protected by a firewall. You’re starting to see that concept emerge, where the IoT is in one place and it’s much more difficult to use it as a foothold to get all the way into the house.
Hardee: IoT/edge devices are a big area of concern. The customers we’re selling to are increasingly security-conscious. We’re also seeing a huge number of trading companies creating their own AI/ML chips, where shaving microseconds or nanoseconds of latency off the speed of trading is an extremely big deal. Security goes hand in hand with that, because being quicker with a huge volume of trades is make-or-break for these companies, so security is obviously a big factor, as well. But going back to the original question about the huge increase in scalability with multi-chip modules and various other things, you can’t test for a lot of the physical side-channel effects. You have to go post-silicon. There are still extremely good design practices in terms of what makes a secure power network and a secure way of communicating with the protocols for these new huge-scale devices. And, of course, test and debug introduce whole different classes of security vulnerabilities. We have people working on every aspect of those problems, but you have to test and verify at every step along the way. There is no single answer.
View part two of the discussion:
IC Security Issues Grow, Solutions Lag