Designing For Security

Adding security features to SoCs can affect power, performance and overall NRE costs.


Some level of security is required in SoCs today, whether it is in hardware, software or — most commonly — both. Of course, there is a price to pay from a power and performance perspective, but thankfully just a small one in most cases.

The explosion of consumer devices has driven the need for increased security features in smart cards, smart phones, personal computers, home networks, and set-top boxes, all of which store, send and receive personal and financial data, and encrypted codes and keys.

“With automotive and Internet-of-Things (IoT) applications, we’re opening up many new classes of devices that connect to the cloud and bring new security concerns, including personal, medical, home security data, and safety-critical systems that could be vulnerable,” said Pete Hardee, director of product management at Cadence. “As well as encryption for sending and receiving secure data, there’s partitioned areas in SoCs that must keep secure data away from non-secure areas.”

The impacts of security on SoC design are many, he said. “First, for encrypted data, the data rate decreases and the energy per bit increases, since more total data needs to be sent for a given data payload size. On top of that, there’s the encryption and decryption circuitry itself. For secure areas, there has to be a duplication of resources – dedicated processor, memory, interconnect, etc. – to handle the secure data. But the biggest impact is in verification – there has to be rigorous verification that secure data cannot appear on non-secure I/O ports. And in today’s society, with all kinds of cybercrime, it’s a two-way problem. You must guarantee no secure path leaks, and you must also be able to protect against security attack.”
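The data-rate and energy-per-bit point can be sketched with rough arithmetic. The framing sizes below assume an AES-GCM-style scheme (12-byte IV, 16-byte tag); they are illustrative, as the article names no specific cipher:

```python
# Rough arithmetic behind the data-rate and energy-per-bit point.
# Framing sizes assume an AES-GCM-style scheme; these are illustrative
# assumptions, not figures from the article.
payload_bytes = 1024
iv_bytes, tag_bytes = 12, 16
wire_bytes = payload_bytes + iv_bytes + tag_bytes   # bytes actually sent

goodput = payload_bytes / wire_bytes        # useful fraction of the link
energy_scale = wire_bytes / payload_bytes   # relative energy per payload bit

print(f"goodput {goodput:.3f}, energy per payload bit {energy_scale:.3f}x")
```

For small IoT-style payloads the fixed framing dominates: with 64 payload bytes, the same 28 bytes of framing cut goodput to roughly 70%.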

At the IC level, Serge Leef, general manager of the system-level engineering division at Mentor Graphics, pointed out three fundamental areas where security work is happening.

“One is what I would call on-chip countermeasures, which are techniques that protect crypto-engines on silicon from side-channel attacks. In side-channel attacks, a number of techniques are used — the most prevalent being DPA (differential power analysis), where candidate keys are sent to the crypto-engine and the power dissipation is observed, which presumably is different whether you get the right key or wrong key, or partially right or partially wrong. And depending on the power signature, you presumably can make certain conclusions about various bits of the key.”
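The DPA technique Leef describes can be illustrated with a toy model. Everything here is invented for illustration — a shuffled stand-in S-box, a Hamming-weight leakage model with Gaussian noise, and a difference-of-means attack that recovers one key byte from simulated power samples:

```python
# Toy DPA: a Hamming-weight leakage model plus a difference-of-means
# attack. The shuffled S-box, noise level and key are all invented.
import random

random.seed(1)
SBOX = list(range(256))
random.shuffle(SBOX)                      # stand-in for a real S-box

def hw(x):
    return bin(x).count("1")              # Hamming weight

def power_sample(plaintext, key):
    # Simulated side channel: leakage tracks the Hamming weight of the
    # S-box output, plus Gaussian measurement noise.
    return hw(SBOX[plaintext ^ key]) + random.gauss(0, 0.5)

SECRET_KEY = 0x3C
plaintexts = [random.randrange(256) for _ in range(2000)]
traces = [power_sample(p, SECRET_KEY) for p in plaintexts]

def bias(candidate):
    # Partition traces by one predicted output bit; the correct key
    # guess yields a visibly different mean between the two groups.
    hi = [t for p, t in zip(plaintexts, traces) if SBOX[p ^ candidate] & 1]
    lo = [t for p, t in zip(plaintexts, traces) if not SBOX[p ^ candidate] & 1]
    return abs(sum(hi) / len(hi) - sum(lo) / len(lo))

recovered = max(range(256), key=bias)
print(hex(recovered))
```

The correct candidate’s predicted bit genuinely splits the traces by leakage, so its mean difference stands far above the near-zero bias of every wrong guess.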

He noted that power matters in this case, and the most obvious countermeasure to combat side-channel attacks would be to alter the design so that it has either a consistent, constant power dissipation profile or a random one. “Either of those has an impact on power. If you wanted to even out power, the periods of operation where power dissipation was low could no longer be there, because they would expose the valleys. Similarly, if you create a random power dissipation signature, that presumes you are consuming power when it’s not necessary to consume it for operation of the device.”

Bernard Murphy, CTO of Atrenta, agreed. He said security almost always includes some layer of functionality, whether it is in hardware or software, in analog, around the packaging, or in camouflaging the layout.

He noted that an additional countermeasure for side-channel attacks is to add analog noise on top of the regular power profile, or to add mirroring to essentially flatten out the profile. “So you have other operations that aren’t really doing anything for you functionally but are adding power. Of course, either one of those is going to add power consumption to the process. How much power? You’re not talking about big ripples. It’s maybe a few percent. The challenge is that any of these measures are making it harder to extract the information; they are not making it impossible. To make it progressively harder still, you have to add more noise or more mirroring, but you’re still talking about a few percent.”
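The hardware goal of a flat, data-independent profile has a well-known software analogue: constant-time comparison. The sketch below contrasts an early-exit compare, whose timing and power profile depend on where the data differs, with one that always does the same amount of work:

```python
# Early-exit vs constant-time comparison: the first leaks the position
# of the mismatch via timing/power; the second always touches every byte.
def leaky_equal(a, b):
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False        # bails at the first mismatch
    return True

def constant_time_equal(a, b):
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y           # accumulate differences, never bail early
    return diff == 0
```

In practice, Python code would use `hmac.compare_digest` from the standard library, which implements the same idea.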

A second area where work is happening relates to counterfeiting. “The ways of combatting counterfeiting generally boil down to authentication and activation techniques. In either case, this results in additional circuitry that either operates only at the startup of the device, or essentially operates throughout the operation of the device. That carries some kind of performance and power penalties but there are techniques to mitigate this, like, for instance, by forcing these things to happen at the very front end during the power-on self-test, or when the chip is being booted or keeping the additional circuits that have been added away from critical paths,” Leef explained.
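The authentication idea Leef describes — run once at power-on self-test so it stays off the critical path — might look like a keyed challenge-response. Everything below (the key provisioning, the function names) is illustrative, not any vendor’s actual scheme:

```python
# Hypothetical power-on authentication: a keyed challenge-response.
# DEVICE_KEY provisioning and the function names are illustrative only.
import hashlib
import hmac
import os

DEVICE_KEY = b"per-device secret provisioned at manufacture"

def activation_response(challenge):
    # The genuine device proves it holds the key without revealing it.
    return hmac.new(DEVICE_KEY, challenge, hashlib.sha256).digest()

def authenticate(respond):
    # The verifier issues a fresh random challenge and checks the reply.
    challenge = os.urandom(16)
    expected = hmac.new(DEVICE_KEY, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(respond(challenge), expected)

print(authenticate(activation_response))          # a genuine part
print(authenticate(lambda c: b"\x00" * 32))       # a clone without the key
```

Because the check runs once at boot, its power and performance cost is paid outside normal operation, exactly the mitigation Leef describes.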

A third area of security IC work is aimed at trojans, and the prognosis currently looks grim. “People have worked on this for nearly a decade trying to detect trojans during the design phase and prevent them from entering the design, but it looks extremely unpromising at this point. People have tried simulation and formal verification-based techniques to identify trojans in the design and it doesn’t work because you are looking for unknown unknowns. You have no idea what you are looking for, and even if you find it, you don’t know if you’ve found what you were looking for. It’s looking more and more like it’s an unsolvable problem to detect trojans during the design phase. That means that the only way to deal with trojans is during runtime, and what that means is the incorporation of special IP blocks that monitor execution and look for attack profiles,” he said.
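The runtime-monitor IP blocks Leef mentions could be caricatured as logic that watches an execution stream and flags profiles outside an allowed envelope. The operation classes and threshold below are invented purely for illustration:

```python
# Invented caricature of a runtime security monitor: count operation
# classes in a window and flag anything outside an allowed envelope.
from collections import Counter

ALLOWED_OPS = {"load", "store", "alu", "branch"}

class ExecutionMonitor:
    def __init__(self, max_stores_per_window=10):
        self.max_stores = max_stores_per_window
        self.window = Counter()

    def observe(self, op):
        if op not in ALLOWED_OPS:
            return "alert"                 # unrecognized operation class
        self.window[op] += 1
        if self.window["store"] > self.max_stores:
            return "alert"                 # e.g. a suspicious write burst
        return "ok"
```

A real monitor would work on hardware signals rather than symbolic opcodes, but the structure — continuous observation against an attack profile — is the same.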

All of that has a small impact. What is not a small impact – and remains a big open question – is antiviral software and how it works on mobile applications and potentially IoT applications. The answer is, it doesn’t. “Running antiviral software would be very, very expensive in power,” Murphy said. “If you ran antiviral software on a smartphone, it would run the battery flat very quickly because of the way it works. It doesn’t seem like there’s any way that you can fix that. It’s designed to run on laptops and PCs, which don’t have to worry about power. It runs pretty fast, so there might not be very much performance impact, but you would drain the battery very quickly. How to correct that is not obvious, so this is a problem where I don’t have an answer. One suggestion has been to move all of the virus checking out to the cloud, but then you’d introduce a lot of latency.”

Measuring the impact
When it comes to adding security features, you can bet design teams have an interest in measuring just how much impact there will be on power and performance.

“This question comes up 100% of the time: What will it cost me? There are different ways of dealing with it. Obviously if you make the cost of security prohibitive then nobody is going to incorporate those things. In those three areas, there are different approaches to minimizing the impact,” Leef said.

For instance, to combat trojans, he said, “you could essentially force the chip to, on its own, simulate attacks the first time it boots to see if the snooper detects them. After it does this for the first time and presumably does not trigger any detected problems, that particular checker can shut itself down and never turn itself on again. That’s a way of preventing power and performance impact in normal operation of the chip, where you pay the price one time: you can force the chip into this kind of mode where it self-tests extensively for security.”

In all cases, however, Leef pointed out that there are ways of minimizing the power and performance impact, but he cautions that it is still very early in the game. While side-channel attacks are common right now—and there are probably half a dozen to a dozen companies that work on various ways of identifying side-channel attacks and building some kind of countermeasures for them—the biggest challenges might still be ahead and unknown.

As far as how big a hit the design will take to power and performance, Drew Wingard, CTO at Sonics, said it is easier to build an equation to look at the fraction of the work you are doing that is in an encrypted domain. “Now you are at the level of the application and what is the application doing? Generally speaking, most of the traffic on the chip – most of the operations on the chip – doesn’t need to be that secure. But there are some parts that are critically important, such as the storage locations where the keys are kept in a fashion in which they can be directly used, which is to say in a decrypted fashion. That’s got to be the most secure part of the chip. If you are doing lots of secure transactions, then you’re accessing that part of the design frequently, so the impact will be much, much higher. You could see things like this in some database operations associated with doing payments, or some of the secure gateway type things where there are lots of encrypted packets flying around in a system.”

“From the perspective of the more consumer-style SoCs, I don’t think we expect to see meaningfully large amounts of the traffic go secure. Until you get down to some things like medical wearable devices, where maybe a lot of it is going to be secure, but the total amount of work being done is so small that the overhead is probably easily acceptable. In other words, the expected ‘on’ time for those devices is so short relative to their ‘off’ time that the real implication of having to do things in a more encrypted/decrypted fashion is probably more on battery life. So it’s not really power – it’s average power. Instead of getting this thing done in 100,000 clocks, it’s going to take me 150,000 clocks to get this thing done and therefore I’m going to drain the battery a bit faster so it’s my energy use (my power averaged over time) will be higher,” he added.
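Wingard’s 100,000-versus-150,000-clock example can be put into numbers. The clock rate, power figures and once-per-second duty cycle below are assumptions chosen only to show how the duty cycle dilutes the average-power hit:

```python
# Wingard's example in numbers. Clock rate, power figures and the
# once-per-second duty cycle are illustrative assumptions.
baseline_cycles = 100_000
secure_cycles = 150_000
energy_ratio = secure_cycles / baseline_cycles    # 1.5x energy per task

clock_hz = 100e6
active_mw, sleep_mw = 50.0, 0.01
period_s = 1.0                                    # one task per second

def avg_power_mw(cycles):
    on_s = cycles / clock_hz                      # 'on' time per period
    return (active_mw * on_s + sleep_mw * (period_s - on_s)) / period_s

increase = avg_power_mw(secure_cycles) - avg_power_mw(baseline_cycles)
print(f"{energy_ratio}x task energy, +{increase:.4f} mW average power")
```

The task costs 50% more energy, but because the device is awake only about 0.1% of the time, average power rises by a small fraction of a milliwatt — the battery-life effect, not the peak-power effect, that Wingard describes.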

To better control the impact of implementing security features, Cadence has observed design teams implementing a number of strategies, Hardee said. “For encryption/decryption, the algorithms are well-suited to a high-level synthesis approach. We’ve seen an increase in adoption of HLS in this area in recent years. This is because design teams can quickly assess a range of micro-architectural alternatives to get optimal power and performance. For secure areas, we’ve seen cases where designs were simplified from one project to the next, to ease the security verification challenge. Some of these simplifications affect power and performance adversely, but the price is worth paying, especially if the secure area is a relatively small proportion of the overall chip. Examples of such simplifications include disabling clock gating, replicating a portion of the design to avoid resource sharing between secure/non-secure, disabling features from 3rd-party IPs, and using only a subset of a standard protocol (e.g. AMBA).”

Further, for verification, Cadence has seen people very interested in adopting a formal verification approach. “The issue with security path verification is that it is very difficult to predict the mechanism for secure path leakage, and even more difficult to predict the security threats. So it’s close to impossible to write directed tests to verify with simulation. The exhaustive nature of formal verification becomes very appealing – every possible signal activity and combination is tried to prove that the security requirements are not violated. Once these verification issues are eased, design teams can afford to reverse those simplifications for a more optimal yet secure design.”
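The appeal of exhaustive checking can be shown in miniature: for a toy one-bit design (invented here), brute-force enumeration of every input — a stand-in for a formal proof — confirms the secret never influences a non-secure output:

```python
# Miniature of exhaustive secure-path checking: enumerate every input of
# a toy one-bit design (invented for illustration) and confirm the
# non-secure output never depends on the secret.
def block_out(secure_mode, secret, public):
    # Output may expose the secret only when secure_mode is asserted.
    return (secret if secure_mode else 0) ^ public

def leaks_secret():
    for public in (0, 1):
        # Check only the non-secure configuration (secure_mode == 0).
        outputs = {block_out(0, secret, public) for secret in (0, 1)}
        if len(outputs) > 1:        # output varies with the secret: leak
            return True
    return False

print(leaks_secret())               # no secure-path leak in this toy block
```

A formal tool does effectively this over the full state space of a real design, which is why it can prove non-leakage where directed simulation tests cannot.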

Software countermeasures, intersection with hardware
Interestingly, countermeasures are also implemented in software, Murphy said. “For example, the solution that ARM provides through a partner is a software solution, and it has hardening against this kind of attack. In software you can’t add noise, so you have to do mirroring-type things. You add either randomization or mirroring, but even in software that’s going to add power. In both cases – whether it’s adding noise or mirroring, or randomizing in software – you are going to have some performance impact. Again, not huge – just a few percent.”
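The software “randomization” Murphy refers to is often Boolean masking: splitting a secret into random shares so that no single intermediate value equals the secret. A minimal sketch:

```python
# Boolean masking, one software countermeasure: split the secret into
# two random shares so no single intermediate value equals the secret.
import secrets

def mask(secret_byte):
    r = secrets.randbelow(256)
    return r, secret_byte ^ r       # each share alone looks random

def unmask(shares):
    r, masked = shares
    return r ^ masked               # recombine to recover the secret
```

The expensive part in real implementations is computing on the shares without ever recombining them (masked S-boxes and the like), which is where the “few percent” overhead Murphy cites comes from.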

Marc Canel, VP of security systems at ARM, said, “In the ARM mobile world, the most important security feature is TrustZone. TrustZone is a separate execution environment that you find in all mobiles and is used to design the secure boot. The boot starts in TrustZone; once the image in TrustZone has been authenticated, the image of the virtual OS is authenticated and the virtual OS takes over. That’s the first, most important feature – authenticate the software that is running…Another very important feature is content protection.”

Rob Coombs, director of security marketing at ARM, explained that the mixture of hardware and software is critical. “TrustZone is the hardware isolation. A lot of security is about creating security containers. People will build on that in terms of software such as trusted firmware and trusted operating systems. Again, TrustZone is the hardware piece, and the software piece is trusted firmware – a very deep, low-level secure world that controls the power state in a multiprocessor system: powering multiple processors up and down, switching between the normal world and secure world, handling interrupts and so on.”

The trusted operating system is the other piece of the software side. This is typically a small trusted kernel, maybe up to a megabyte or two, that provides the basis of trust for security applications. In a normal application, a small part might be taken out and moved across to the secure world. With the combination of the trusted operating system, TrustZone and trusted boot, you don’t get any security without going through an initialization process that is verified. The combination of those three things creates a trusted execution environment, which is one of the layers of security that design teams are using, Coombs explained.

The biggest challenge
The biggest challenge to adding security features is marketplace requirements, Canel said. “That is the most difficult and most challenging thing. At the end of the day, consumers typically don’t pay for security. It’s enterprises that have a regulatory mandate, or enterprises that want to protect their reputation and are therefore careful with what they do with their customers’ data. The challenge is fragmentation within the ecosystem. The service provider today, when it manages data from its customers, has to deal with devices that come from multiple vendors, and its big concern is how to come up with security policies and application strategies that are consistent across all of the devices, where the hardware of the device is hiding underneath a software layer that comes from a major virtual OS vendor that hides a lot of different things.”

“If I’m a major service provider…I have no real clue what is the different security model which is implemented between those chipsets. That’s where things become difficult and ARM has a role to play and come up with architectures for security that meet the needs of the major industries and help chipset vendors and OEMs roll them out in a consistent way so that then those service providers find a consistent security model on top of which they can establish a consistent liability model. That is a big challenge of security within the ARM world at large,” he concluded.