Security Becomes Much Bigger Issue For AI/ML Chips, Tools

Lack of standards, changing algorithms and architectures, and ill-defined metrics open the door for foul play.


Security is becoming a bigger issue in AI and machine learning chips, in part because the chip industry is racing just to get new devices working, and in part because it’s difficult to secure a new technology that is expected to adapt over time.

And unlike in the past, when tools and methodologies were relatively fixed, nearly everything is in motion. Algorithms are being changed, EDA tools are incorporating AI, and chips themselves are disaggregating into component parts that may or may not include some form of AI/ML/DL.

The result is much greater potential for mischief, for IP theft, and for frontal and side-channel attacks that can corrupt data or lead to ransomware. Those attacks may be obvious, but they also can be subtle, creating errors that may be less obvious in a world defined by probabilities, and twisting accuracy for specific purposes.

“Just defining what accurate means — and then designing an algorithm that can tell when it’s got the right answer, and prove that it’s got the right answer — is hard,” said Jeff Dyck, senior director of engineering at Siemens Digital Industries Software. “Imagine someone says, ‘Prove this AI model is right.’ I’m going to go gather some data and compare it to what the model is saying. You can do things like that. It’s not really a proof, but it’s better than nothing.”
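The "gather some data and compare" check Dyck describes can be sketched in a few lines. This is a hypothetical illustration, not any vendor's tool: the model, samples, and tolerance are all stand-ins, and as he notes, passing the check is evidence rather than proof.

```python
# Minimal sketch of validating a model against freshly gathered data.
# The model, samples, and 5% tolerance are hypothetical stand-ins.

def validate_model(model, samples, tolerance=0.05):
    """Return the fraction of samples where the model's prediction
    falls within `tolerance` of the measured value. Evidence that the
    model is behaving, not a proof that it is right."""
    within = sum(
        1 for x, measured in samples
        if abs(model(x) - measured) <= tolerance * abs(measured)
    )
    return within / len(samples)

# Toy example: a "model" that doubles its input, checked against
# three measurements, the last of which it gets wrong.
model = lambda x: 2.0 * x
samples = [(1.0, 2.01), (2.0, 3.98), (3.0, 6.5)]
print(validate_model(model, samples))  # two of three samples within 5%
```

A low score flags the model for investigation; a high score only says the model agrees with the data that happened to be gathered.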

Still, that opens the door to less obvious issues, such as programming bias. “Bias can creep in at many stages of the deep-learning process, and the standard practices in computer science aren’t designed to detect it,” said a 2019 MIT Technology Review report. “Bias can creep in long before the data is collected, as well as at many other stages of the deep-learning process.”

The big issue is whether algorithms have enough built-in resilience and programmability to reduce those inaccuracies, or to recover from attacks on that data. “Unlike traditional cyberattacks that are caused by ‘bugs’ or human mistakes in code, AI attacks are enabled by inherent limitations in the underlying AI algorithms that currently cannot be fixed,” said a 2019 study by Harvard’s Belfer Center. “AI attacks fundamentally expand the set of entities that can be used to execute cyberattacks. Data can also be weaponized in new ways using these attacks, requiring changes in the way data is collected, stored, and used.”

The initial challenge for many AI/ML/DL developers was to get these systems working quickly enough to be useful, followed by continual improvements in accuracy. The problem is that many of these systems are working with off-the-shelf algorithms, which don’t provide much visibility into their inner workings, or hardware that is quickly outdated. In effect, these systems are black boxes, and while there are efforts to create explainable AI, those efforts are still in their infancy.

In the meantime, many of these systems are vulnerable, and the companies using this technology have little understanding of how to fix problems when they arise. According to an ongoing study at the University of Virginia, the odds of compromising a commercial AI are 1 in 2. That compares to a common digital device using commercial encryption at 1 in 400 million.

The National Security Commission on AI (NSCAI) substantiated that finding in 2021 in a 756-page report. “The threat is not theoretical,” the agency noted. “Adversarial attacks are happening and already impacting commercial ML systems.”

However, it’s important to point out that terminology here can get confusing very quickly. ML and increasingly AI are being used to create AI chips, because they can effectively identify faulty patterns and optimize a design for performance, power, and area. But if something does go wrong in an AI chip, it’s much harder to identify the origin of that problem because these systems are generally opaque to users, and they can adapt to different use cases and applications in unexpected ways.

Adversarial attacks
A handful of white-hat hackers in an online community of cybersecurity professionals pointed to five potential vulnerabilities when using AI for semiconductor development. Among them:

  1. Adversarial machine learning. This branch of AI research is rooted in the early 2000s, when it was used to evade AI-powered email spam filters. An adversary can manipulate an AI algorithm to create subtle errors, such as an autonomous vehicle occasionally misreading GPS data.
  2. Data poisoning. Inserting faulty training data can lead to inaccurate results, but not necessarily in obvious ways. As a result, an AI-driven device may behave in ways that are unacceptable, even though the calculations appear to be correct. This is particularly insidious, because these systems may be used to train other systems.
  3. Model inversion. An attacker can use the output of the AI algorithm to infer information about the architecture of a chip. That, in turn, can be used to reverse engineer the chip.
  4. Model extraction. If an attacker can extract the model used to design a semiconductor, they can use this information to create a copy of the chip, or to modify it to introduce vulnerabilities.
  5. Supply chain. Malicious modifications of a design are well understood in the security world, but modified algorithms are much harder to detect. What exactly is the source of the commercial algorithm, and was it corrupted during download? This applies to updates, as well.
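Data poisoning (item 2 above) is easy to demonstrate in miniature. The toy below is an illustrative construction, not any real design tool: a nearest-centroid classifier decides whether a measurement passes, and relabeling just two training samples drags the "pass" centroid far enough to flip a failing part to passing.

```python
# Toy data-poisoning example: relabeling a few training samples
# shifts a nearest-centroid classifier's decision boundary.
# All data and thresholds here are invented for illustration.

def centroid(points):
    return sum(points) / len(points)

def classify(x, pass_c, fail_c):
    return "pass" if abs(x - pass_c) < abs(x - fail_c) else "fail"

# Clean training data: "pass" measurements cluster near 1.0, "fail" near 5.0.
clean_pass = [0.9, 1.0, 1.1, 1.2]
clean_fail = [4.8, 5.0, 5.2]

x = 3.5  # a genuinely failing measurement
clean_result = classify(x, centroid(clean_pass), centroid(clean_fail))

# Poisoned data: an attacker relabels two "fail" samples as "pass",
# dragging the "pass" centroid toward the failure region.
poisoned_pass = clean_pass + [4.8, 5.0]
result = classify(x, centroid(poisoned_pass), centroid(clean_fail))

print(clean_result, result)  # prints: fail pass
```

The calculations remain internally consistent, which is what makes the attack hard to spot, and exactly why a poisoned system that trains other systems compounds the damage.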

“AI systems are vulnerable to a number of attacks, ranging from evasion and data poisoning to exploitation of software flaws,” said Thomas Andersen, vice president for AI and machine learning at Synopsys. “There is also growing concern about data security, as a large component of a trained AI system is the data itself.”

Better tools, limited AI
On the EDA side, though, much of what is in place today is reinforcement learning. That’s more of a tool than an autonomous system, and it’s built on knowledge from previous designs, which makes it more difficult to corrupt. “It’s not possible to say that you can stop it completely,” said Mike Borza, a Synopsys scientist. “But there’s a high probability it will be detected.”

Borza noted that AI generally is used to analyze different designs that have gone to tape-out or to silicon. “That material tends to be fairly well controlled, and the AI is very private. Our AI-based tools have large training sets of design data that are sort of best practice materials that we’ve collected over the years.”

In addition, the vetting process of those training sets is highly controlled, with very limited access, providing a good defense against compromising the designs. Add to that scan insertion, and this part of AI creation is more difficult to corrupt. “Scan insertion is really the same kind of thing that you would do for a Trojan injection, but it is not in the functional specification of the chip,” Borza explained. “So it’s lurking there in the background in virtually all digital chips these days. They all have a whole scan chain that’s used to test them during production. Somebody might try to create a Trojan in the database that’s going to be found, or should be found, during verification because you have a bunch of states that weren’t defined that are responsive, but should not be.”

Better chips, but with caveats
The AI systems that are developed using those chips are another matter entirely. In “Not with a Bug, but with a Sticker,” Microsoft data scientists Ram Shankar and Hyrum Anderson said proprietary ML-powered systems are no more secure than open-source systems. “If a user can use your ML-powered system, they can replicate it. If they can replicate it, they can very much attack it.”

One of the challenges is that many AI designs are brand new. Algorithms are still in the formative stage, which is why there are so many changes and updates. And while many design teams are aware of the known vulnerabilities, there are plenty of vulnerabilities that are still to be discovered.

“The problem is there’s another universe of problems out there that we don’t have a good way to screen for,” said Frank Huerta, CEO of Curtail, a development tool provider. “The premise is if that data set is large enough, you think you’ve got everything. But do you? If you don’t, what do you do? With security, an adversary is trying to screw up your analysis. Hackers are using AI-based attacks to intentionally thwart what you’re doing with this kind of large data model approach.”

Overconfidence in automated systems has long been a problem in security. Shankar and Anderson presented multiple studies showing people are more likely to follow directions of an automated system that has previously demonstrated vulnerabilities. “It is not AI’s failure to meet expectations,” they wrote. “It’s that we have very high expectations in the first place. The problem is that in many settings we over-trust it.”

Backing up their observation, the Allen Institute for AI surveyed more than 1,500 Americans in 2021, including software engineers with advanced degrees, to measure understanding of AI capabilities. It concluded that 85% of those surveyed were “AI illiterate”.

Over-trust vs. zero trust
Zero trust is getting a lot of attention in the security community these days, but the bigger problem may be over-trust.

“Years of cyberattacks have taught us one incontrovertible lesson,” said Shankar and Anderson. “Where there is over-trust, there is always a motivated adversary ready to exploit it.”

One way around that is to develop standards, and those standards need to include the hardware as well as the software. “The Defense Department has been saying you need quantifiable security measures that you can count on,” said Raj Jammy, chief technologist at MITRE Engenuity and executive director of the Semiconductor Alliance. “That’s why provenance of the chip becomes much more critical than before — and not just the chip, but provenance of all the components that go into the chip, including the design pieces like the IP blocks. Many times people just re-use IP blocks. Well, what happens when we don’t update it? Did we do the right fixes that we discovered before? What is the version that you’re maintaining? And how do you take care of that? So standard of care is very critical.”

The group recommended using multiple AI algorithms and diverse sets of training data to reduce the risk of attacks. The reason is that an adversary may manipulate one AI algorithm, while the other algorithms detect the manipulation and prevent a flawed design. Likewise, if an attacker can infer the layout or architecture of a semiconductor from the output of one AI algorithm, they may not be able to do so from the output of another.
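The multi-algorithm defense amounts to a voting scheme with an alarm on dissent. The sketch below assumes at most one model is compromised; the "models" are placeholder functions invented for illustration, not production design-check algorithms.

```python
# Hedged sketch of the ensemble defense: run several independently
# trained models and treat any disagreement as possible manipulation.
from collections import Counter

def cross_check(models, design):
    votes = [m(design) for m in models]
    majority, count = Counter(votes).most_common(1)[0]
    # Any dissenting vote is flagged for human review.
    return {"verdict": majority, "suspicious": count < len(votes)}

# Two honest checkers reject a design with inadequate timing margin;
# a compromised one approves everything unconditionally.
honest_a = lambda d: "reject" if d["timing_margin"] < 0.10 else "accept"
honest_b = lambda d: "reject" if d["timing_margin"] < 0.12 else "accept"
compromised = lambda d: "accept"

design = {"timing_margin": 0.05}
print(cross_check([honest_a, honest_b, compromised], design))
# majority verdict is "reject", and the lone dissent raises the flag
```

The design choice here is that the ensemble never silently outvotes a dissenter: disagreement itself is the signal, since honest, independently trained models should mostly agree.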

Making AI algorithms explain their decision-making process in a human-understandable way is another defense the team recommends. If an AI algorithm can explain how it arrived at a particular output, it can make it more difficult for an attacker to infer information about the semiconductor design from the output.

The team agreed that thorough testing and verification help ensure a semiconductor design is secure and performs as expected. For example, testing and verification can help detect the presence of a backdoor or other vulnerability in the semiconductor design. And using secure hardware and software can head off supply chain attacks, ensuring the semiconductor design process is secure from start to finish and preventing attackers from manipulating the AI algorithms used in semiconductor design.

Different approaches
As weaknesses in AI become known, innovative approaches are being taken to address them.

One such approach, taken by startup Axiado, is to run AI on bare metal. “Typically, an AI/ML model is stored on top of a system (e.g., Linux),” said Axiado CEO Gopi Sirineni. “Axiado’s models reside on bare-metal that is inherently secure, because lots of the vulnerabilities come from the higher level of the system. Our ML pipeline continuously builds data lakes with various vulnerability and attack data sets.”

Another approach is to store critical information inside secure flash memory, so in case an algorithm or any part of a device is breached, it can be reset and rebooted. “Secure flash protects data and it protects code, so it takes care of confidentiality, integrity, and authenticity of software in boot code and so on,” said Adrian Cosoroaba, technical marketing manager for security at Winbond. “Incidents are taking place on a weekly basis, which is why industry associations and organizations have put in place security certifications to ensure that manufacturers making secure products are indeed living up to the standard.”

Similarly, programmability can be added into AI chips to ensure they can be reconfigured as needed, either in the fab or the field, to take advantage of changes in algorithms or security standards. But if AI is used deep inside an automotive subsystem, attacking it will be far different from attacking a mainstream platform.

“Hacking x86 processors is one thing,” said Geoff Tate, CEO of Flex Logix. “It’s a very well-understood architecture with a ton of documentation, and it’s talking to the outside world. But a lot of inferencing chips are buried inside a car, and every inference chip has a very different architecture. These architectures are very openly published, but try going on anybody’s website and learning anything about any level of detail in an internal architecture. It’s far more opaque. So the security concerns car companies will be focused on first will be x86 and RISC-V processors.”

Secure the data
Another approach is to secure access to the data. Generative AI programs require access to a large amount of data to learn and generate new content. It’s important to secure this data so that it can’t be accessed by unauthorized users. This can be done through data encryption, access control, and monitoring.

That can be difficult, however. Data at rest can be encrypted. Data in transit can be encrypted. But until recently, data in use has been vulnerable. Not only can adversaries install spyware to monitor and exfiltrate design data, but the popularity of AI as a way to clean up code has put proprietary code into public use.

A cybersecurity startup in Ireland, Vaultree, claimed to have resolved this problem with a tool that can encrypt data in use. Vaultree CEO Ryan Lasmaili said the company’s tool can be used even with a public generative AI service, as long as it is with a registered business account. “Data shared publicly on public servers becomes open source, essentially,” he said. “But on a business account with the Vaultree tool, that sensitive information is encrypted and processed in its encrypted form.”

Cornami has been working on a similar approach with homomorphic computing. The problem it is working on is the amount of compute horsepower required to process encrypted data. Cornami is targeting mission-critical applications such as banking, health care, pharma, and insurance, where breaches can be devastating.
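The core idea of computing on encrypted data can be shown with a deliberately simplified stand-in. The sketch below uses additive masking over a modulus, not a real homomorphic scheme like Paillier or CKKS; it exists only to show the workflow in which a server sums values it can never read. The heavy cryptography those real schemes require is precisely the compute burden described above.

```python
# Toy illustration of the homomorphic-computation workflow: the server
# adds numbers without ever seeing them. Additive masking stands in for
# a real homomorphic encryption scheme purely for demonstration.
import secrets

MOD = 2**64

def encrypt(value, key):
    return (value + key) % MOD

def decrypt(ciphertext, key):
    return (ciphertext - key) % MOD

# Client side: mask two salaries with independent random keys.
keys = [secrets.randbelow(MOD), secrets.randbelow(MOD)]
ciphertexts = [encrypt(70_000, keys[0]), encrypt(85_000, keys[1])]

# Server side: sums the ciphertexts without learning either plaintext.
encrypted_sum = sum(ciphertexts) % MOD

# Client side: stripping the combined mask recovers the true total.
total = decrypt(encrypted_sum, sum(keys) % MOD)
print(total)  # 155000
```

In a real deployment the server could not be given the keys at all; genuine homomorphic schemes achieve that separation cryptographically, at a cost of orders of magnitude more computation per operation.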

In the end, it all comes down to a realistic view of the capabilities and limitations of machine learning models. According to Dan Yu, AI/ML product solutions manager at Siemens Digital Industries Software, machine learning models are limited by the scope of their training data. “You have to know what those limits are,” he explained. “We want to use AI to improve productivity, but not replace human involvement.”

Yu places machine learning tools at the front end of the design, and contends they must be verified the same way chips were verified before AI became popular. “It’s still a human sitting there to decide that the quality is good.”

But who defines what is good enough, and for how long, is becoming more difficult to define, and that opens the door to very different security risks than the chip industry has dealt with in the past.

—Ed Sperling contributed to this report.
