Physical Access Control Raises New Security Concerns

Small language models, longer device lifetimes, and thermal manipulation make securing hardware much more challenging.

Experts At The Table: Semiconductor Engineering sat down to discuss hardware security challenges, including fundamental security of GenAI, with Nicole Fern, principal security analyst at Keysight; Serge Leef, AI-For-Silicon strategist at Microsoft; Scott Best, senior director for silicon security products at Rambus; Lee Harrison, director of Tessent Automotive IC Solutions at Siemens EDA; Mohit Arora, senior director for architecture at Synaptics; Mike Borza, principal security technologist and scientist at Synopsys; and Mark Tehranipoor, distinguished professor in the ECE Department at the University of Florida, and co-founder of Caspia Technologies. What follows are excerpts of that discussion. Part one is here. Part two is here.

L-R: Caspia’s Tehranipoor; Rambus’ Best; Synopsys’ Borza; Synaptics’ Arora; Keysight’s Fern; Microsoft’s Leef; Siemens EDA’s Harrison.

SE: How does GenAI impact security?

Tehranipoor: The chip industry prioritizes performance over security, and GenAI is following that same path. The excitement is driving rapid innovation and product offerings, often at the expense of security. That’s how the industry operates: solve performance issues first, then revisit security. I believe this will continue as customers demand immediate performance gains. When it comes to the security of GenAI, a lot has to do with data poisoning, data contamination, and so on. If a bad guy is doing GenAI development, they can add malicious activities. This goes back to the data modeling. The large language models and the size of the data they have to use for training make it extremely difficult to pinpoint security problems injected into the data. My biggest concern is not the security of large language models. My biggest concern is the security of small language models. LLMs are going to plateau soon, in my opinion. But what’s coming out are small language models, which we’re going to see in pretty much every edge device. And if you inject wrong information into one of them, that could potentially result in much more direct types of security problems than in large language models, with their billions and billions of parameters and the sheer volume of information they hold. Once we get to small language models, which basically would be the foundation of many, many agents, along with how they work together and how you orchestrate them, that’s where the real security challenge is going to come from.

Best: In the early days of processors, attacks targeted the mix-up between code and data, leading to protections like Data Execution Prevention. Today, with large language models and small language models, the attack surface is much broader, encompassing vast knowledge. Currently, there’s no clear way to authenticate training data, especially for small language models. We need a way to verify trusted training data, similar to how we authenticate code now, though the solution is still unclear.
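One way to picture what authenticating training data "similar to how we authenticate code" might look like is a signed manifest of per-shard digests. The sketch below is illustrative only and is not something the panel proposed. The function names (`build_manifest`, `sign_manifest`, `verify_shards`) are hypothetical, and HMAC with a shared key stands in for the public-key signature a real code-signing flow would use. It detects tampering after sign-off; it does nothing about data that was already poisoned when it was signed.

```python
import hashlib
import hmac


def shard_digest(data: bytes) -> str:
    """SHA-256 digest of one training-data shard."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(shards: dict[str, bytes]) -> dict[str, str]:
    """Map each shard name to its digest (hypothetical manifest format)."""
    return {name: shard_digest(blob) for name, blob in sorted(shards.items())}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    """Authenticate the manifest. HMAC is an assumption standing in for a real signature."""
    payload = "\n".join(f"{name}:{digest}" for name, digest in sorted(manifest.items()))
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_shards(shards: dict[str, bytes], manifest: dict[str, str],
                  tag: str, key: bytes) -> list[str]:
    """Return the names of shards that are missing, modified, or unlisted.

    Raises ValueError if the manifest itself does not match its tag.
    """
    if not hmac.compare_digest(sign_manifest(manifest, key), tag):
        raise ValueError("manifest tag mismatch")
    bad = [name for name in sorted(manifest)
           if name not in shards or shard_digest(shards[name]) != manifest[name]]
    unlisted = sorted(set(shards) - set(manifest))
    return bad + unlisted
```

A training pipeline built this way would refuse to start if `verify_shards` returned a non-empty list, much as a secure boot chain refuses to run an image whose signature does not verify.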

Leef: If the world is covered with distilled small language models, that opens up a whole new attack surface. Attacking large language models is really hard. They can absorb some level of poison, but small language models are smaller and more vulnerable. This is food for thought.

Fern: The proliferation of AI in literally everything, all these different application domains, makes it hard to determine which attacks are feasible, what the impact is, and where the weakest link is in the larger context of an AI system. So it’s not just about flipping weights in a neural network. Taking the context of fault injections specifically, you could also target the classification output from the network or the confidence score that goes with it. That’s one way that you could potentially cause downstream security violations. If it’s used for authentication, you might say, ‘Oh, the output may be correct.’ But if you can convince the system it’s really low confidence, then it might react differently. Then, at the input to the process, you also have a lot of input sanitization techniques, and maybe you could inject faults at that stage. Also, if you’re doing a specific application like facial recognition or palm scanning, you may have liveness checks to detect if it’s being attacked or prevent replay attacks before it even goes into the main part of the network. With all these different applications, they’re going to deploy different techniques around the AI to make it more resilient, adding some protection against trivial data manipulation at the input. This is a really large space that we haven’t explored, but it will be important.
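The confidence-score scenario can be made concrete with a small sketch. Everything here is hypothetical: the `decide` policy, the 0.9 threshold, and the "fallback" path are assumptions for illustration, and a single bit flip in a 32-bit float stands in for whatever a real fault injection would do. The point is only that the predicted label stays correct while the system's reaction changes.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of a value stored as an IEEE 754 float32."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (faulted,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return faulted


def decide(label: str, expected: str, confidence: float, threshold: float = 0.9) -> str:
    """Hypothetical authentication policy wrapped around a classifier output."""
    if label != expected:
        return "reject"
    # A correct label with low confidence takes a different, possibly weaker, path.
    return "accept" if confidence >= threshold else "fallback"


if __name__ == "__main__":
    clean = 0.97
    faulted = flip_bit(clean, 29)  # bit 29 is one of the float32 exponent bits
    print(decide("alice", "alice", clean))
    print(decide("alice", "alice", faulted))
```

Flipping that exponent bit collapses 0.97 to a value many orders of magnitude below the threshold, so the same correct label is routed to the fallback branch. Whether that matters depends entirely on what the fallback does, which is the system-level question raised above.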

Arora: Because AI is heavy on data, data poisoning is an important issue. And it’s all done at run-time. You’re loading the models in and out, and handling the weights in and out. There are multiple models owned by OEM and ODM users who are trying to cross-pollinate the shared data. So run-time security now becomes more mainstream. It’s no longer an option. That means attestation, and ensuring that every time, even though your weights are encrypted and signed, decryption is done within the DV environment. You have to go by the ground rules. It doesn’t completely avoid poisoning, because that’s done offline. But at run-time, wherever there is an opportunity for the data to be manipulated, you must make sure you have locked it down. In the past, physical security was out of scope. Not anymore. Here, it’s a much lower cost of attack. Now, something that was $5,000 can be done for $100 or $200. That’s where the bar has gone higher.

SE: In the context of direct manipulation of not necessarily just data, but memory and systems, are thermal attacks something to be concerned about?

Borza: In general, thermal is one of the attack vectors that people use. It typically means heating things, but occasionally cooling them down to extremely cold temperatures. With a lot of cold boot attacks, you’re freezing the RAM to maintain the state so it doesn’t lose its contents. A DRAM can hold its state for a long time after the power has gone down, long enough that you can read it and get the information out of it. The same is true with heating things. You can get things to misbehave, or you can push them into a corner where they’re no longer operating reliably. Doing that in a controlled way allows you to attack it, and that’s one of the physical attack modes in which you typically need to have physical access to the device. However, one of the things that’s very interesting is that, with the advent of on-chip silicon lifecycle management and performance management using SLM sensors, like the PVT (process, voltage and temperature) sensors that are built on chips, adversaries who can access that data can manipulate the system below the chip level to heat up or cool down, or speed up or slow down, parts of the chip in a way that might let them break into the processing. Essentially, it creates some kind of access control violation. There are physical attacks, some of which can be implemented remotely now, because there is access to the sensors on chip, whereas you used to have to use off-chip sensors to measure those kinds of things. But those are specialized kinds of attacks. They are attacks that people have to worry about, depending on what they’re doing. It’s not necessarily a runtime attack, but it’s very useful in reverse engineering, because you can get access to lots of internals that you may not otherwise know about.

Tehranipoor: I have a completely different opinion from Mike on this. The reason I don’t look at thermal as much is that when you think about the power side channel, you can establish a one-to-one relationship between the power you measure in each clock cycle and what is being processed. When you go to EM, that per-clock-cycle resolution shrinks a little bit. It’s harder to do that, but it still is effective. But once you go to thermal, it’s much, much harder to do this. Some of the examples that Mike gave, I agree with. Memory freezing, etc., within the context of the system, has been used very often in cold boot attacks. But if you’re talking about SoC designs or systems-in-package, thermal is extremely difficult to capture and make sense of. It may require a lot more data than one could imagine. Thermal takes a lot longer to build. There’s a time cost. As a result, it is extremely difficult to establish that one-to-one relationship with your attack objective, or to establish the locality needed to target specific locations and capture sensitive information from them. What you measure is a global thermal signature, which is different.
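The resolution argument can be illustrated with a back-of-the-envelope model. As a simplifying assumption (not something stated in the discussion), treat the path from dissipated power to measured temperature as a single first-order low-pass filter with time constant tau. Its gain at frequency f is 1 / sqrt(1 + (2*pi*f*tau)^2). The 10 ms time constant used below is purely illustrative, and real packages have multiple time constants plus spatial spreading that this lumped model ignores.

```python
import math


def thermal_gain(freq_hz: float, tau_s: float) -> float:
    """Magnitude response of a first-order low-pass filter (lumped thermal model)."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


if __name__ == "__main__":
    tau = 10e-3  # assumed thermal time constant, for illustration only
    for label, freq in [("1 Hz workload phase", 1.0),
                        ("1 kHz activity", 1e3),
                        ("1 GHz clock-rate activity", 1e9)]:
        print(f"{label}: gain = {thermal_gain(freq, tau):.3g}")
```

Under these assumptions, slow workload phases pass through almost unattenuated, while clock-rate activity is attenuated to roughly one part in 10^8. That is consistent with both positions in this exchange: thermal carries little per-cycle information, but it can still reflect coarse, long-running operations.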

Best: I don’t think you can measure the thermal because of the time constant. There have been fan speed attacks where you listen for the sound of the cooling fan, because the cooling fan gets updated at much higher rates than the actual thermals.

Borza: But that fan is driven by an on-chip sensor, and that sensor changes fairly quickly. I’ve looked at the data coming out of those, and it’s changing at sub-second intervals. It’s not clock-cycle-by-clock-cycle accurate. But if what you’re looking for is some larger process, like an authentication of a secure boot or something, people have used thermal as a way to accelerate and make their attacks more predictable, just as they use clock glitches and voltage glitches.

Tehranipoor: If we look at thermal as a one-step attack scenario, I stand firm on this because it’s extremely difficult to make sense of it. But if you use thermal as a way to get to the sensors and then capture the data from the sensors, that gives you a much better resolution of what’s going on inside the chip. And that really can become more like a power or other type of attack that you can carry out, because with those sensors — especially if they’re distributed well inside the chip — you get a lot of other information, which makes it a lot more effective. But as an outside measurement, I don’t see thermal to be my biggest concern.

Harrison: With the whole introduction of SLM, thermal is just one piece of the puzzle. There are a whole ton of other sensors, and we have a lot of customers that will not even go down the route of SLM unless that whole data network is fully secured. You might think this is weird, because we’re just looking at parameters on the chip, but there’s a big security risk as part of that, as well. First off, they want to make sure that access to those monitors and sensors on the chip is fully secured.

Borza: It’s access control. It’s private. The data is only in particular locations that privileged processes can look at.

Arora: But I cannot disable them once they are provisioned, right?

Harrison: If you think about it, as SLM has evolved, JTAG was the standard access mechanism, which anybody could just plug into and go. So, all of a sudden, I want my JTAG to be completely secure and then go to the next level. It’s okay if you’ve got the secure access. Also, I want to make sure that as the data comes on and off the chip, that’s encrypted, as well. There are multiple levels of security, and this is just the SLM infrastructure, so people see that as a big attack surface.
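The access-control idea in this exchange, where sensor data is readable only by privileged processes, can be sketched in a few lines. This is a software analogy with hypothetical names (`SensorHub`, `privileged`, `audit_log`), not a description of any vendor's SLM infrastructure. In silicon, the check would be enforced by hardware and authenticated debug access rather than a Python allowlist, and data leaving the chip would also be encrypted, which this sketch omits.

```python
from dataclasses import dataclass, field


@dataclass
class SensorHub:
    """Hypothetical gatekeeper in front of on-chip monitor readings."""
    readings: dict[str, float]
    privileged: frozenset[str]
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def read(self, caller: str, sensor: str) -> float:
        """Return a reading only to an allowlisted caller; log every attempt."""
        allowed = caller in self.privileged
        self.audit_log.append((caller, sensor, allowed))
        if not allowed:
            raise PermissionError(f"{caller} may not read {sensor}")
        return self.readings[sensor]


if __name__ == "__main__":
    hub = SensorHub(readings={"temp_core0": 71.5},
                    privileged=frozenset({"power_manager"}))
    print(hub.read("power_manager", "temp_core0"))
    try:
        hub.read("user_app", "temp_core0")
    except PermissionError as err:
        print("denied:", err)
```

Logging denied attempts alongside granted ones is a design choice in this sketch: repeated probing of the sensor interface is itself a signal worth keeping.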

Best: This is especially true in chiplets, where now you have these sensor chiplets that are communicating over I2C, etc.

Borza: Also, in some of these accelerators, for example, people are using the SLM network in real time to do load balancing in the chip. There is sufficient resolution to do that. That’s a fairly coarse process, and you don’t need nanosecond accuracy for that. But you do need at least millisecond accuracy.

Arora: And in most of the chiplet communication protocols, they have temperature data as a side communication, which you can easily fault and attack.

SE: Is AI-driven security the only way to ward off or try to protect against AI-based security attacks?

Fern: Maybe to get a scaling effect, but the existing techniques for mitigation of certain classes of attacks are still completely relevant.

Tehranipoor: I absolutely agree. We still need heuristics to check on AI.

Harrison: If you deploy AI as a security mechanism, then you still need another mechanism to secure the secure AI, so we’re kind of building it.

Arora: It should give you awareness, but the trust should come from the hardware. That’s one of the key aspects. Where AI could be effective is in anomaly detection, like fuzzing logic, where you’re generating data for firmware testing, but you are not disturbing the pillars of security. It’s more in terms of how to handle verification and how you handle all the key management.

Best: In any sort of feedback system, there’s always been a concept of gain margin and phase margin if you’re using something to measure it. So, if you’re using AI to track the AI, there is a stability problem that needs to be resolved. We know how to do that.

Leef: This still boils down to expert knowledge. If someone were to extract from the brains of everybody in this room everything they know about security, they could construct a pretty effective offensive strategy. But the same goes for the defensive strategies. Large language models, on their natural course of evolution, are not suddenly going to become security experts, unless that kind of knowledge asset is explicitly brought into them. So my non-answer is that both sides will try to take advantage of this, and increasingly, this is the domain of nation-states, not people who are trying to steal access to your cable box.


