Why It’s So Difficult — And Costly — To Secure Chips

Threats are growing and widening, but what is considered sufficient can vary greatly by application or by user. Even then, it may not be enough.

Rising concerns about the security of chips used in everything from cars to data centers are driving up the cost and complexity of electronic systems in a variety of ways, some obvious and others less so.

Until very recently, semiconductor security was viewed more as a theoretical threat than a real one. Governments certainly worried about adversaries taking control of secure systems through back doors in hardware, whether through third-party IP or unknowns in the global supply chain, but the rest of the chip industry generally paid little heed beyond the ability to boot securely and authenticate firmware. But as advanced electronics are deployed in cars, robots, drones, and medical devices, as well as in a variety of server applications, robust hardware security is becoming a requirement. It no longer can be brushed aside as a ‘nice-to-have’ feature, because IC breaches can affect safety, jeopardize critical data, and sideline businesses until the damage is assessed and the threat resolved.

The big question many companies are now asking is how much security is enough. The answer is not always clear, and it’s often incomplete. Adequate security is based on an end-to-end risk assessment, and when it comes to semiconductors the formula is both complex and highly variable. It includes factors that can fluctuate from one vendor to the next in the same market, and frequently from one chip to the next for the same vendor.

Much of this growing concern can be mapped against the rising value of data, which often is coupled with new or expanding market opportunities in automotive, medical, AR/VR, AI/ML, and both on-premise and cloud-based data centers. This is evident in the growth of the big data and business analytics market, which is expected to grow 13.5% annually from $198.08 billion in 2020 to $684.12 billion in 2030, according to Allied Market Research. There is more data that needs to be processed, and with Moore’s Law running out of steam, chipmakers and systems companies are innovating around new architectures to optimize performance with less power for different use cases. That makes it much harder to determine what’s an acceptable level of security for each application.
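
As a quick arithmetic check, the quoted endpoints are roughly consistent with the reported growth rate. The sketch below simply replays the compound-growth formula on the Allied Market Research figures:

```python
# Sanity check of the market projection quoted above (figures from
# Allied Market Research; the arithmetic is just compound growth).
start, end, years = 198.08, 684.12, 10  # $B in 2020, $B in 2030

implied_cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {implied_cagr:.1%}")          # ~13.2%, near the reported 13.5%

projected_2030 = start * 1.135 ** years
print(f"2030 at 13.5%/yr: ${projected_2030:.0f}B")  # ~$703B
```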

“It varies widely,” said Steve Hanna, distinguished engineer at Infineon. “The attacker generally is not going to spend a million dollars to get a thousand dollars in return. But there are some attackers who are willing to do that, although they tend to be motivated by other goals — government espionage, terrorism, or even people taking revenge on someone. And you can’t always assume no one will bother to attack something, because there can be indirect attacks, too. Why, for example, would someone want to hack a lightbulb in my house? One answer is because it can become a bot in their botnet army, and then they can lease that botnet army to other people. Then you have attacks from a million different points and it brings down a server.”

Many attacks can be prevented, or at least remedied with a system reboot. But there are costs associated with those measures, including some that are not obvious. Actively policing a chip to identify unusual activity requires power, which in turn can reduce battery life in mobile devices such as phones or smart glasses, or in an implantable medical device such as a pacemaker. It also can affect performance, because it requires extra circuitry dedicated to keeping a chip secure, essentially the equivalent of guard-banding, but with little or no hard data to prove how effective it will be.

Building security into an IC also makes the design more complex, which in turn potentially adds other vulnerabilities that may be unique to a particular design. Chip architects and design teams need to understand the implications of every security measure on the movement and capturing of data, as well as the impact of ECOs and other last-minute changes needed to achieve sign-off.

In the past, this was a secondary consideration, because most attacks happened in software, which could be hacked remotely. But as more hardware is connected to the Internet, and to other hardware, chips themselves are now a source of concern. Unlike with software, if an attacker gains access to the hardware, recovery via a system reboot may not be possible.

“There is a whole business of building and selling tools to attackers,” said Hanna. “They have tech support and documentation and sales reps, and there’s a whole supply chain of tools and GUIs for mounting your attack. Usually, these operations are run out of places where there’s no extradition treaties.”

Even if a device starts out secure, that doesn’t mean it will remain secure throughout its lifetime. This became evident with vulnerabilities based on speculative execution and branch prediction, two commonly used approaches to improve processor performance prior to the discovery of Meltdown, Spectre, and Foreshadow. Now, with complex designs making their way into automotive, medical and industrial applications, where they are expected to be used for up to 25 years, security needs to be well architected and flexible enough to respond to future security holes and more sophisticated attack vectors.

“If you’re building network equipment, for example, it’s not just about the chip or the software,” said Andreas Kuehlmann, CEO of Tortuga Logic. “Their box is going to be out there for tens of years. What’s the total cost to maintain my product over its lifecycle? That cost reflects the cost that I have, the cost I’m imposing on my customers, and the cost if there’s any incident. The auto industry, in particular, really understands that because they look at records as part of their business. They’ve taken risk management to a level that nobody else has.”

For the automotive industry, and increasingly the medical industry, a breach can be extremely costly in multiple ways, from customer confidence in the brand to liability based upon insufficient security in a piece of hardware that results in injuries. “Security has an indirect impact on safety,” said Kuehlmann. “Safety is an extremely well-understood process, but it also raises some business issues. What’s my liability? What is the cost of a recall? It also affects privacy, particularly with medical records. That has a direct business impact, as well.”

Fig. 1: Securing chips in the design phase. Source: Tortuga Logic

And that’s just the beginning. These are relatively well-understood threat models. Other industry segments are far less sophisticated when it comes to chip security. And as more devices are connected to each other, often crisscrossing silos in various industry segments, the threat level increases for all of them.

Designing for security
Reducing the risk of potential hardware breaches requires a solid understanding of chip architectures, including everything from partitioning and prioritization of data movement and data storage, to a variety of obfuscation techniques and activity monitoring. Getting all of that right is a complex undertaking, and it’s one for which there often is no clear payback. A chip that is difficult to hack may deter attackers, and the best outcome is that nothing unusual happens, which can make companies question why they expended the effort and money needed to secure a device.

That, in turn, tends to instill a false sense of confidence and lead to bad security choices. “A lot of times people use a normal applications processor to run cryptographic or security algorithms,” said Scott Best, director of anti-tamper security technology at Rambus. “So they are running security algorithms on an insecure processor, which is one of the hallmarks of a security failure. A processor is optimized, like any other circuitry. You can optimize it for power, for performance, or for security, and to think you’re going to accidentally get any one of those three benefits without actually focusing on them is recklessly optimistic. None of those things happens by accident.”
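
Best’s point, that security properties never appear by accident, shows up even at the level of a single comparison. The illustrative sketch below (the names and values are ours, not Rambus code) contrasts a naive early-exit check of a secret, whose runtime leaks how many leading bytes matched, with a constant-time check:

```python
import hmac

SECRET_MAC = b"expected-mac-value-0123456789abcdef"  # illustrative only

def naive_check(candidate: bytes) -> bool:
    # Early-exit comparison: runtime depends on how many leading bytes
    # match, which an attacker can measure and exploit byte by byte.
    if len(candidate) != len(SECRET_MAC):
        return False
    for a, b in zip(candidate, SECRET_MAC):
        if a != b:
            return False
    return True

def constant_time_check(candidate: bytes) -> bool:
    # Runtime is independent of where the first mismatch occurs.
    return hmac.compare_digest(candidate, SECRET_MAC)
```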

Some chipmakers and various organizations and government agencies are beginning to recognize that. “There are a couple of different ways security is creeping into the designs,” said John Hallman, product manager for trust and security at OneSpin, a Siemens Business. “It really is becoming more of a requirements-driven process, and that’s good. It’s moving toward that initial development stage. I wouldn’t say we’re completely there, or that the entire system is thought out in terms of which processors or which components are going to be in your system. But we are starting to establish at least some semblance of threat vectors, and how you can address perceived threats early enough. There is some due diligence in initial protections, which sets you up for the most success you can have at that stage. Then there are points along the way that you can continue to evaluate as you get into the design phase, where in your front-end design you’re doing some type of HDL coding, which you can continue to evaluate. So not only did you meet the requirements set forth and put in those protections, but now you’re starting to introduce some of these known vulnerabilities. There’s a lot being reported today in the hardware space. These are things you can check for and incorporate as part of your verification process.”

For example, MITRE, which is funded by various U.S. government agencies, publishes a list of the most important hardware weaknesses. In Europe, the European Union Agency for Cybersecurity (ENISA) publishes a threat landscape for supply chain attacks, as well.

That’s a starting point. Less visible is the impact of different use models on security. This is especially true for the increasing number of devices that keep some circuits always on, a power-saving technique that allows devices such as smart speakers or surveillance systems to wake up as needed.

“In the case of an always-on machine, it’s doing background monitoring tasks while trying to consume as little power as possible,” said George Wall, director of product marketing at Cadence. “But it’s also vulnerable to unauthorized code executing on it, or some other type of security attack. So it’s important that when it boots up, the code that it is booting from is good. What are the authentication steps that are required for it to be considered a secure state? What other resources need to be protected from unauthorized code or misbehaving third-party code running on an always-on processor? Those considerations need to be built in upfront because they’re difficult to shoehorn in later.”
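
For illustration, here is a minimal sketch of the boot-time authentication step Wall describes. It is a simplification under stated assumptions: the image layout and key are hypothetical, and it uses an HMAC tag so it runs with only the Python standard library, whereas a production root of trust verifies an asymmetric signature against a key fused into hardware:

```python
import hmac
import hashlib

# Hypothetical layout: firmware image with a 32-byte HMAC-SHA256 tag
# appended. A real root of trust would verify an RSA/ECDSA signature
# against a public key in ROM or fuses; HMAC stands in here so the
# sketch runs with only the standard library.
BOOT_KEY = bytes.fromhex("00" * 32)  # placeholder; never hardcode real keys
TAG_LEN = 32

def verify_boot_image(image: bytes) -> bytes:
    """Return the firmware payload if its tag verifies, else refuse to boot."""
    payload, tag = image[:-TAG_LEN], image[-TAG_LEN:]
    expected = hmac.new(BOOT_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise RuntimeError("boot image failed authentication; halting")
    return payload

# Build a demo image the way a signing server might, then verify it.
firmware = b"\x7fELF...always-on monitor code..."
signed = firmware + hmac.new(BOOT_KEY, firmware, hashlib.sha256).digest()
assert verify_boot_image(signed) == firmware
```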

These kinds of best practices need to be standardized for chip design, similar to the emergence of design for test and design for manufacturability as requirements.

“There’s been a push for metrics in standards committees,” said Hallman. “You can look at the number of protections you put in, the coverage numbers you can measure, and the number of vulnerabilities that are out there for pieces of code or certain products. So we’re starting to at least quantify some of the pieces that we’re looking at. We’re not there yet. We still need to determine which data means the most, and are we measuring the right things? That’s going to be researched for a while. But at least we’re starting to use more scientific practices when we try to evaluate what we really value as security.”

Fig. 2: Secure chip architecture. Source: DARPA

Heterogeneous challenges
All of this becomes more difficult as chipmakers embrace more customization and heterogeneity. As device scaling becomes more expensive, and the power, performance and area/cost benefits continue to shrink with each new node, architects have begun to package more components together. That creates a new and different set of challenges involving security. Not all of the components are inherently secure, and it’s not always clear which ones have been designed with security in mind because many of these customized accelerators and IP blocks are developed and/or sold as black boxes.

“If you want to go move loaded data around, you want to process that data, you want that data to be secure, you want it to be obviously handled through the right kind of memory management, and all those kinds of things,” said Peter Greenhalgh, vice president of technology at Arm. “Whatever piece of hardware you design relies on multiple other layers underneath. That volume of hardware raises the bar for when you build something to accelerate data. It’s kind of like building bigger castles to manipulate that data. So you’ve got a CPU, GPU, compute accelerator, and you need to make them bigger, with higher performance, and with more flexibility. If you’re going to try to construct lots of different smaller pieces of IP or smaller components to manipulate data in the most efficient way, that might work in an academic environment. But when you get into a consumer environment or commercial environment, you realize you need Linux, virtualization, security, debugging, performance management, etc. Suddenly, all of these bespoke accelerators, which are brilliant because they can manipulate the data and handle it seamlessly, tend to grow and grow. But you’d be better building three or four different castles that are flexible enough to be able to handle all the different ways that I want to do this in the future.”

Jayson Bethurem, product line manager for Xilinx’s cost-optimized portfolio, pointed to similar concerns. “When you’re bringing lots of signal data into a device, and then back out of the device, customers are asking for more multi-level security,” he said. “We have to be able to encrypt the data coming in and encrypt the data going out. We may be able to reprogram your device and make sure it’s coming from an authenticated source. And finally, we need to protect the IP that’s inside this device through cryptography and IP theft protection, with DPA (differential power analysis) resistance and things like that. All the security features that exist in our high-end devices need to be available in a low-cost FPGA.”

The big challenge here is building in flexibility for optimization without sacrificing security, and doing it quickly enough and without burning up too much power. “You want the hardware assurance that the software is behaving correctly,” said Rambus’ Best. “And if that root of trust is in a sometimes-on system, because it’s a mobile system that wakes up when something needs attention, then it creates a security problem. An adversary can always figure out what wakes up a system. And if the system needs to wake up and behave securely, that’s similar to what your phone does. It needs to get itself into a secure execution place. You have to go find the secure state information that got saved off the low-power memory. It’s just sitting there quiet and idle, so that can get quickly loaded, and you can now have a secure environment without going through a full minute of secure boot.”
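
Best’s fast wake path can be sketched as seal-on-sleep, verify-on-wake. The following is an illustration, not a Rambus implementation: the state contents are invented, the integrity tag uses HMAC from the standard library, and a real design would also encrypt the state and bind it to an anti-rollback counter:

```python
import hmac
import hashlib
import json

# Device-unique secret; on real silicon this would live inside the root
# of trust, never in general-purpose memory.
STATE_KEY = b"device-unique-secret-from-root-of-trust"

def seal_state(state: dict) -> bytes:
    """Tag the secure state before it is parked in low-power memory."""
    blob = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(STATE_KEY, blob, hashlib.sha256).digest()
    return tag + blob

def restore_state(sealed: bytes) -> dict:
    """On wake, accept the saved state only if the tag still verifies."""
    tag, blob = sealed[:32], sealed[32:]
    expected = hmac.new(STATE_KEY, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise RuntimeError("saved state tampered with; fall back to full secure boot")
    return json.loads(blob)

sealed = seal_state({"session_keys_loaded": True, "boot_stage": "verified"})
print(restore_state(sealed))  # fast wake path, skipping a full secure boot
```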

AI systems add a whole different level of complexity because no two behave the same way. In fact, that’s the whole point. These devices are supposed to optimize themselves for whatever tasks they are designed to handle. That makes spotting aberrations in their behavior more difficult, and when problems do occur it’s extremely difficult to reverse engineer them and determine whether an abnormality was caused by an unusual data set, a flaw in the training data, or a response to malicious code.

“You can embed a behavior in AI that people are not expecting, but that can be triggered by the person who invented it,” said Mike Borza, security IP architect at Synopsys. “You’re introducing this in the training data, which means you’re adjusting the connectivity and the weights between neurons — how those neurons respond to things in their environment. It’s very difficult to understand what training data is going to do. And that has been one of the challenges. We’re looking now at ways to enhance the observability and controllability of it, and to have these devices provide feedback about how they’re making their decisions so that you can diagnose them when they start misbehaving. It’s very easy in this kind of scenario to embed some behavior that can be triggered by the right set of inputs, or the right sequence of inputs, or the right collection of images, and to produce a behavior the adversary wants that is undesirable behavior for the AI itself.”

Extending security forward and backward
The bigger challenge may be the longevity of a hardware design. Security needs to be end-to-end, spanning the physical supply chain, the design-through-manufacturing chain, and the field throughout the projected lifetime of a particular chip or system. This is difficult to do with components manufactured in different regions of the world, often by different companies, and with some of them black-boxed. And it becomes even more difficult if the chip is expected to work according to spec for a decade or two.

One approach that is gaining traction is to use an embedded FPGA (eFPGA), which can be reprogrammed to deal with new threats. This is particularly useful for chips with longer lifetimes.

“Most of what eFPGAs have been used for so far is obfuscation,” said Andy Jaros, vice president of sales at Flex Logix. “The chip is not functional until it’s programmed by the end user, and that’s where they can add their secret sauce. So we’re seeing proprietary algorithms being added in that people don’t want to expose to the supply chain. We’re also seeing some new applications, because it’s just a matter of time before AES (Advanced Encryption Standard) is cracked using quantum computers. With an eFPGA, you can build the chip now and future-proof it.”

While it costs more up front to design in this type of flexibility, it can be amortized over the life of a chip. And with constant changes in security to deal with new attack capabilities, the economic benefits of this approach can be significant. “We’re also seeing new ways to protect the bitstream through encryption,” Jaros said. “You put the decryption engine in the eFPGA and you can reprogram it as needed. You can even dynamically reconfigure the bitstream every few hundred microseconds, which makes it much harder for attackers to read.”
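
The rapid rekeying Jaros describes is a form of key rotation. A minimal sketch of one common pattern follows, deriving a fresh bitstream key per time slice from a master secret, with HMAC standing in as the key-derivation function; the epoch length and labels are illustrative assumptions, not Flex Logix’s scheme:

```python
import hmac
import hashlib
import time

MASTER_KEY = b"master-secret-provisioned-at-manufacture"  # illustrative only
EPOCH_US = 500  # rekey every few hundred microseconds, per the quote

def bitstream_key(epoch: int) -> bytes:
    """Derive the decryption key for one epoch from the master secret."""
    # HMAC-SHA256 as a simple KDF; a real design might follow NIST SP 800-108.
    return hmac.new(MASTER_KEY, b"bitstream|" + epoch.to_bytes(8, "big"),
                    hashlib.sha256).digest()

def current_epoch() -> int:
    return int(time.time() * 1_000_000) // EPOCH_US

# Each epoch yields a different key, so a captured key ages out quickly.
assert bitstream_key(1) != bitstream_key(2)
```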

Future-proofing chips for security is a huge concern. “We’re starting to see a whole bunch of things that are interesting to combine together into what we’re calling silicon lifecycle management,” said Borza. “It’s the embedding of sensors in a chip to allow you to look at its behaviors and measure how it’s performing, what it’s doing, and what might be happening on a security front. There are certain behaviors that you expect from that chip, and you generate a pattern of measurements that you can make with sensors. And if you have a lot of data about how these chips behave over time doing certain applications and what their average behaviors are, what their behavior is while doing certain tasks, then you can spot things that may involve attacks or failures, or which are indicative of an attack that’s evolving.”
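
The telemetry baselining Borza describes maps onto simple anomaly detection. A minimal sketch, with invented sensor values and an assumed z-score threshold, flags samples that fall far outside a rolling baseline:

```python
from statistics import mean, stdev

def find_anomalies(readings, window=20, z_threshold=4.0):
    """Flag samples that sit far outside the rolling baseline.

    `readings` is a list of sensor samples (e.g. on-die voltage or power
    telemetry); the sensor, window, and threshold are illustrative.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma and abs(readings[i] - mu) / sigma > z_threshold:
            flagged.append((i, readings[i]))  # candidate attack or failure
    return flagged

# Steady supply-voltage telemetry with one dip, as a fault-injection
# attempt might produce.
telemetry = [0.80 + 0.001 * (i % 3) for i in range(100)]
telemetry[60] = 0.65  # injected glitch
print(find_anomalies(telemetry))  # [(60, 0.65)]
```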

And ultimately, even the best security may not be fully utilized. It often depends on who’s motivated to implement it and who isn’t. “When you get down to their customers — the ones deploying the IoT devices — they have to ask how much security is worth to them,” said Infineon’s Hanna. “And then, how do all of these people judge the security of their suppliers? That gets into certifications and standards. And then there’s a role for governments, as well. The U.S. government has a real interest in making sure our infrastructure isn’t compromised.”

 

Fig. 3: Security is a global problem. Source: Infineon

Conclusion
Security will remain a challenge going forward. For decades, the chip industry managed to downplay security because it was much simpler to hack the software than the hardware. But the equipment needed to reverse engineer a chip is now affordable for many organized crime groups and governments, and the payback from ransomware and distributed denial-of-service attacks targeting increasingly valuable data is rising quickly.

“If you look at anything in the mechanical world, you have some Gaussian distribution for how many parts out of a million will go wrong,” said Tortuga Logic’s Kuehlmann. “That’s typical liability metrics. But if someone finds a vulnerability and executes a massive attack on every car, it’s a step function. Until somebody finds it, nobody is vulnerable. Then suddenly everybody has a problem.”

Security experts are optimistic this will lead to better hardware, not to mention spawn other businesses. “On the defender side, the company that builds the device has its own economic model for what they’re willing to invest in security,” said Infineon’s Hanna. “That could be a free service, or it could be a value-added service. But if it’s an IoT device manufacturer, are they willing to pay a little more for a secure microcontroller, or are they willing to hire people or outsource this?”

The answer to that question depends on a variety of factors, not all of which are consistent, universal, or even well understood.

Related
Semiconductor Security Knowledge Center
2021 CWE Most Important Hardware Weaknesses
MITRE has just released the list of most important hardware weaknesses that lead to security vulnerabilities.
Creating IoT Devices That Will Remain Secure
Strategies for integrating secure devices into less-secure and legacy environments.
Complex Chips Make Security More Difficult
Why cyberattacks on the IC supply chain are so hard to prevent.
Always On, Always At Risk
Chip security concerns rise with more processing elements, automatic wake-up, over-the-air updates, and greater connectivity.


