Security Holes In Machine Learning And AI

A primary goal of machine learning is to use machines to train other machines. But what happens if there’s malware or other flaws in the training data?


Machine learning and AI developers are starting to examine the integrity of training data, which in some cases will be used to train millions or even billions of devices. But this is the beginning of what will become a mammoth effort, because today no one is quite sure how that training data can be corrupted, or what to do about it if it is corrupted.

Machine learning, deep learning and artificial intelligence are powerful tools for improving the reliability and functionality of systems and speeding time to market. But the AI algorithms also can contain bugs, subtle biases, or even malware that can go undetected for years, according to more than a dozen experts interviewed over the past several months. In some cases, the cause may be errors in programming, which is not uncommon as new tools or technologies are developed and rolled out. Machine learning and AI algorithms are still being fine-tuned and patched. But alongside that is a growing fear that these algorithms can become an entry point for malware, creating a back door that can be cracked open at a later date.

Even when flaws or malware are discovered, it’s nearly impossible to trace back to the root cause of the problem and fix all of the devices trained with that data. By that point there may be millions of those devices in the market. If patches are developed, not all of those devices will be online all the time or even accessible. And that’s the best-case scenario. The worst-case scenario is that this code is not discovered until it is activated by some outside perpetrator, regardless of whether they planted it there or just stumbled across it.

“Because it’s opaque, you’re inventing all kinds of new attack modes that are quite interesting from the definition of security,” said Rob Aitken, an Arm fellow. “You can think of it from secure data transmission, which is one level of security. But you also can think of it in terms of mission. For example, if you develop a machine learning security camera, where you train it at the factory to look for a particular hat and shirt and shoes, then no one else will register as a person. So now anyone can walk up to your front door and break in. You’ve created a new problem that didn’t exist before you had a machine learning camera. It’s a back door that you’re building with the data itself. We can have less contrived examples of back doors, but the fact that you have jurisdictions that allow back doors and those that don’t means you have to define what you mean by security at all times.”

Security risks are everywhere, and connected devices make it easier to launch attacks from remote locations. But AI, and its subsets of machine learning and deep learning, add new attack points on the threat map because machines are used to train other machines, and no one is quite sure when or how those trained machines ultimately will use that data. That gives someone with a deep understanding of how training algorithms will be used a big advantage when it comes to cybersecurity and cyberespionage.

“In our cybersecurity group we spend a lot of time worrying about AI fighting AI,” said Jeff Welser, vice president and lab director at IBM Research Almaden. “Having systems where the infiltrator themselves is writing programs that are AI-enabled to learn what’s happening in patterns, and therefore be better able to infiltrate that, is absolutely an area that we are doing research on right now. It really is going to be, in a sense, AI versus AI. In order to fight those, you probably also need to have systems that are looking across the networks and looking for activity and learning themselves what activity is normal and not normal for that network. So then they’re more likely to be able to pick up on, ‘Hey, something odd is happening here in a subtle way.’ It’s not like the obvious call to some random port that you’re used to seeing. It’s something else that’s happening.”

That isn’t necessarily instantaneous cause and effect, either. Sometimes the impact may take years to show up, like in car or airplane navigation systems where the back door becomes a tool for ransomware.

“We’ve been using AI in the cybersecurity area for helping to monitor traffic patterns on the networks to see if something’s anomalous,” said Welser. “Originally it was for things where you might have someone who’s put some code in there that is communicating out. But it can be sneaky, because it doesn’t have to communicate out every hour, which would allow us to easily see a pattern. It’s probably going to do it randomly. So then you have to be able to watch and see much more subtle patterns, which is what AI is good at doing. But the next level is where the code in there is actually AI code. So it’s learning, as well, what’s happening. And therefore it’s going to be even harder to track. It may not show up for 5 or 10 years, depending on the patience of the persons involved. That will be the next battlefield for cybersecurity.”
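
In its simplest form, the baseline monitoring Welser describes amounts to learning what normal looks like and flagging departures from it. The Python sketch below is only an illustration of that idea, using a plain mean and standard deviation rather than a learned model. The hourly connection counts, the function names and the threshold of three standard deviations are all assumptions made for the example.

```python
from statistics import mean, stdev

def fit_baseline(samples):
    """Learn what 'normal' looks like: mean and standard deviation of past values."""
    return mean(samples), stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that sits more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical outbound connections per hour for one host (illustrative data)
normal_hours = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]
baseline = fit_baseline(normal_hours)

print(is_anomalous(14, baseline))  # an ordinary hour
print(is_anomalous(40, baseline))  # a sudden burst of outbound traffic
```

A check this simple catches only the obvious bursts. Code that communicates out rarely and at random intervals, as in Welser's example, would sit inside the normal range, which is why he points to more subtle pattern recognition.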

There is some agreement on this point, at least for now. Machine learning, deep learning and AI are just being rolled out across a number of markets and applications. And while most people understand how these approaches can solve problems and improve quality control in areas such as manufacturing and chip design and verification, there is far less understanding about exactly how all of this works, particularly after systems begin to learn certain behaviors.

“What’s a much more prevalent attack in the short run is accelerating the time to debug or the time to exploit using machine learning from the outside-in,” said Martin Scott, CTO at Rambus. “A lot of people are doing that. But you also can do it for a malicious intent. We’re starting to look at it as a quick way to recognize anomalous behavior. Maybe there’s a particular attack where there’s a signature, and it’s not just one device behaving in a certain way. What if you notice devices correlated in time starting to act in an anomalous way in an identical way? That’s a sign that perhaps you’ve lost control and there’s some coordinated attack that’s a precursor to something big. So there are a lot of things that are a precursor to ML, where you say, ‘That doesn’t look right.’ It can trigger a response to revoke connectivity or send an alert. I see more of that activity than embedded latent code in machine learning.”
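
The correlation Scott describes, many devices showing the same odd behavior at about the same time, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any Rambus product: the event tuples, device names, signature strings, 60-second window and three-device threshold are all hypothetical.

```python
from collections import defaultdict

def correlated_anomalies(events, window_seconds=60, min_devices=3):
    """Group anomaly events by (time window, signature) and keep groups seen on many devices.

    events: iterable of (timestamp_seconds, device_id, signature) tuples.
    """
    buckets = defaultdict(set)
    for timestamp, device_id, signature in events:
        window = int(timestamp // window_seconds)
        buckets[(window, signature)].add(device_id)
    return {key: devices for key, devices in buckets.items() if len(devices) >= min_devices}

# Hypothetical anomaly reports from a fleet of cameras (illustrative data)
events = [
    (1000, "cam-01", "unexpected-outbound-dns"),
    (1005, "cam-02", "unexpected-outbound-dns"),
    (1012, "cam-03", "unexpected-outbound-dns"),
    (1500, "cam-04", "high-cpu"),
]

suspicious = correlated_anomalies(events)
for (window, signature), devices in suspicious.items():
    print(signature, sorted(devices))
```

A match here could trigger the kind of response Scott mentions, such as revoking connectivity or sending an alert. Fixed windows are the simplest choice; events that straddle a window boundary would be split, so a production system would more likely use sliding windows.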

Understanding the risk
There are some obvious places to begin closing up security holes in machine learning and AI systems. One involves restricting access to algorithms, similar to what is done with ordinary network traffic today.

“There are a few aspects you need to get it right,” said Raik Brinkmann, CEO of OneSpin Solutions. “One is authentication, to make sure that the device sending back data is the device you want to talk to. In silicon, you need to know this particular chip is the one you’re talking to. There are some IP companies targeting that question. How do you burn an ID into something that you deploy? And only when it is activated at the customer do you get that ID, rather than in the factory. You can associate the data source with this chip. Then there are technologies like blockchain to make sure that the data flowing from this device continuously is authenticated and the data that you expected. Tamper resistance is important on the data stream. You need to control the flow of data and guarantee integrity, or you will have a big security problem.”
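
One common way to get both properties Brinkmann lists, knowing which device sent the data and knowing the data was not altered in transit, is a keyed message authentication code. The Python sketch below uses HMAC-SHA256 from the standard library rather than the blockchain approach he mentions, and adds a counter so that old messages cannot be replayed. The key, device ID and message layout are assumptions for illustration only.

```python
import hashlib
import hmac
import json

def sign_reading(device_key, device_id, counter, payload):
    """Device side: serialize the reading and attach an HMAC-SHA256 tag."""
    body = json.dumps(
        {"device_id": device_id, "counter": counter, "payload": payload},
        sort_keys=True,
    ).encode()
    tag = hmac.new(device_key, body, hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def verify_reading(device_key, message, last_counter):
    """Receiver side: check the tag, then reject replayed or out-of-order counters."""
    expected = hmac.new(device_key, message["body"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["tag"]):
        return False  # wrong key or tampered data
    return json.loads(message["body"])["counter"] > last_counter

# Hypothetical per-device secret, provisioned when the chip is activated
device_key = b"example-key-for-chip-0001"
message = sign_reading(device_key, "chip-0001", 1, {"temp_c": 41.5})

authentic = verify_reading(device_key, message, last_counter=0)
tampered = dict(message, body=message["body"].replace(b"41.5", b"99.9"))
tamper_ok = verify_reading(device_key, tampered, last_counter=0)
replay_ok = verify_reading(device_key, message, last_counter=1)
print(authentic, tamper_ok, replay_ok)  # True False False
```

The sketch assumes the receiver already shares a secret with each device. How that secret or ID gets into the silicon, and when, is the provisioning question Brinkmann raises and is outside what the code shows.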

Unlike more conventional electronic systems, though, AI/ML/DL systems are by nature more resilient. Rather than conventional processing, where a system will stall or crash if it cannot produce an exact answer, AI/ML/DL systems generate results that fall into a distribution. That provides some cushion if something doesn’t exactly fit, which is useful in adjusting to real-world changes such as identifying an object in the road. But it also makes it harder to pinpoint exactly where the problem is.

“Neural network algorithms are not that different from other bodies of data and software,” said Chris Rowen, CEO of Babblelabs. “All the same issues around verification and protection apply. But one thing that could be different is that if you take the average body of software and you flip a random bit, there’s a good chance it will cause the thing to crash. On the other hand, if you take a neural network, most of whose bits are parameters in the network, if you flip a bit it’s going to happily keep executing. It may have a slightly different function or a completely different function, but there is no kind of internal consistency.”
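
Rowen's point is easy to reproduce on a toy scale. The Python sketch below flips a single bit in one 32-bit floating-point weight of a three-input neuron and recomputes the output. The weights, inputs and helper names are made up for illustration, and a real network has millions of such parameters.

```python
import struct

def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of a float32 and return the new value."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped

def neuron(weights, inputs):
    """A single linear neuron: the weighted sum of its inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

# Hypothetical parameters and input (illustrative values)
weights = [0.5, -0.25, 0.75]
inputs = [1.0, 2.0, 3.0]

low_flip = [flip_bit(weights[0], 0)] + weights[1:]    # lowest mantissa bit
high_flip = [flip_bit(weights[0], 30)] + weights[1:]  # top exponent bit

print(neuron(weights, inputs))    # 2.25
print(neuron(low_flip, inputs))   # almost indistinguishable from the original
print(neuron(high_flip, inputs))  # astronomically large, yet no error is raised
```

Both corrupted versions run to completion. One computes a slightly different function and the other a completely different one, and nothing inside the computation signals that anything changed.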

And that’s where the real problems can begin. Understanding security in a connected system is bad enough.

“Anytime you’re adding 4G or 5G connections, you’re giving it access to the hardware systems,” said Patrick Soheili, vice president of business and corporate development at eSilicon. “Whether it’s AI or not AI, you’re creating an opportunity for the hacker to come in and manipulate something. If you don’t build enough authentication around the semiconductor device, when you need it, for whatever you need to get done, then you’ve got a security problem.”

But it gets much more complicated when that connected system involves AI/ML/DL.

“You need to define what the threat model is, and then ideally you need some sort of metric for security,” said Arm’s Aitken. “Those metrics are really hard to come by because if you go to your boss and say, ‘I need to put extra security into this chip and it’s going to cost three months of schedule,’ and your boss says, ‘What do I actually get for that?’ The answer is, ‘Well, it will be more secure.’ But how much more secure and in what way? Having these metrics for security is really key. Then there is the blockchain piece. The level of authentication and provenance that you need for different data changes. It’s similar to when I go to the store and buy a pen, I assume that the store got the pen legitimately and doesn’t care what I do with it after I buy it. But if I buy a car, you need to know everyone who owned that car and the history of that car, and when I sell it the state wants to know who I sold it to. They track it at a much more detailed level than a pen. The same value chain holds for data.”
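
For data that warrants car-level rather than pen-level tracking, the core mechanism behind the blockchain piece Aitken mentions is a hash chain, in which each history record commits to the one before it. The Python sketch below shows only that mechanism, with no distribution or consensus. The record fields and function names are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record

def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()

def add_record(chain, record):
    """Append a provenance record that commits to the whole history so far."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "record": record, "hash": entry_hash(prev_hash, record)})

def verify_chain(chain):
    """Recompute every hash; any edit to an earlier record breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True

# Hypothetical history of one training data set (illustrative records)
chain = []
add_record(chain, {"step": "collected", "by": "sensor-17"})
add_record(chain, {"step": "labeled", "by": "vendor-a"})
add_record(chain, {"step": "used-for-training", "by": "model-v3"})

intact = verify_chain(chain)
chain[0]["record"]["by"] = "unknown"  # someone rewrites the origin of the data
after_edit = verify_chain(chain)
print(intact, after_edit)  # True False
```

A chain held by a single party can simply be rebuilt after an edit, so in practice the latest hash has to be published or shared with others. That is the part a distributed ledger adds.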

What’s different?
Security issues have been rising everywhere. In addition to the long-running threats inherent in application software and network access, the disclosure of Meltdown and Spectre has exposed flaws in speculative execution, a feature of x86 and other processor architectures, that were not even on the radar when those architectures were developed. And the idea that cars could be networked and hacked seemed almost absurd a decade ago.

But entire systems are being connected to other systems, and that opens up access on a global scale to everyone from troublemakers to sophisticated organized cybercrime organizations and nation states.

“The shift that we’ve already made is from scaled complexity to systemic complexity, where a scale is really Moore’s Law—the classic one—of just more transistors on a chip,” said Aart de Geus, chairman and co-CEO of Synopsys. “You now have many chips, many systems, many software environments all interacting, so we’re deep into systemic complexity. The very fact that system complexity itself is particularly well suited for AI approaches — because it’s not some logic right/wrong answer to a lot of things, it’s more like ‘see the patterns’ — is also a challenge from a security point of view. These are all progress steps that bring their own challenges.”

The Mirai distributed denial of service (DDoS) attack in October 2016 provided a glimpse of just how widespread the attack surface can be. Using a botnet, three college students managed to infect several hundred thousand devices around the globe and use them to overload the Internet’s backbone.

The fear is that something similar can happen with AI/ML/DL, because as machines are used to train other machines, the machines themselves actually spread the problem. Only in this case it isn’t a classic virus. It’s the inner workings of the algorithms that drive these systems, which makes the problem much harder to identify. Rather than looking for a single security flaw, security experts will be searching for unusual patterns, in the best of cases, and well-accepted patterns of behavior in the worst cases.

Solving problems
It’s still too early to tell what the best approach is for solving these issues. AI/ML/DL are still in their infancy, despite the fact that they have been researched in one form or another since the 1950s. But it wasn’t until this decade that the market for AI/ML/DL really got going due to a confluence of several factors:

• There is enough processing power and memory to process AI/ML/DL algorithms, both in data centers for training and at the edge for inferencing.
• There are real applications for this technology, so there is money pouring into developing better algorithms and more efficient hardware architectures.
• Technology has enabled algorithms to be developed on computers rather than by hand, allowing companies to start with off-the-shelf algorithms rather than trying to develop their own.

All of this has enabled AI/ML/DL to pick up where research by large computer companies such as IBM and Digital Equipment Corp. left off in the early 1990s. Since then IBM has continued its efforts, joined by cloud providers such as Amazon, Microsoft and Google, as well as Alibaba, Facebook and scores of smaller companies. In addition, governments around the globe are pouring billions of dollars into research.

Fig. 1: Where investments are going in AI, in billions of dollars. Source: McKinsey & Co. report on Artificial Intelligence: The Next Digital Frontier?

According to a new Brookings Institution report, AI investments are growing across a wide swath of markets, including national security, finance, health care, criminal justice, transportation and smart cities. The prize, according to PricewaterhouseCoopers, is $15.7 trillion in potential contribution to the global economy by 2030.

Fig. 2: Which regions gain the most from AI. Source: PwC

With that kind of payoff, AI/ML/DL are here to stay. And that opens the door for tools vendors such as EDA companies to automate some of the security checks, particularly on the verification side, which all vendors say they are currently working on.

“If I verify this product and I sell it, and it changes behavior, it’s no longer verified,” said Wally Rhines, president and CEO of Mentor, a Siemens Business. “What do we do about that? What does a car manufacturer do? Those problems are being addressed. We introduced a product that lets you design into your integrated circuit the ability to dynamically test any subsystem that is JTAG-compatible, so when your chip is not doing anything, it’s self-testing against a set of criteria that the system manufacturer has established so that it can have a dynamic lifetime of self-test. You’ll see the same thing evolving over time as you get more and more neural networks and apply machine learning to chips, boards and systems. Then you’ll get more ways to verify that they haven’t modified themselves into a space that could be dangerous or non-functional.”
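
The same lifetime self-test idea can be applied to a trained model in software. The Python sketch below is a loose analogy to what Rhines describes, not a depiction of Mentor's product: it records a digest of the released weights plus a few reference input and output pairs, then rechecks both later. The toy model, the values and the tolerance are assumptions for illustration.

```python
import hashlib
import struct

def weights_digest(weights):
    """SHA-256 over the float32 encoding of the weights."""
    packed = struct.pack(f"<{len(weights)}f", *weights)
    return hashlib.sha256(packed).hexdigest()

def make_model(weights):
    """Toy stand-in for a trained network: a single weighted sum."""
    def model(inputs):
        return sum(w * x for w, x in zip(weights, inputs))
    return model

def self_test(model, golden_cases, tolerance=1e-3):
    """Pass only if the model still reproduces every recorded reference output."""
    return all(abs(model(x) - expected) <= tolerance for x, expected in golden_cases)

# Recorded by the manufacturer at release time (hypothetical values)
released_weights = [0.5, -0.25, 0.75]
released_digest = weights_digest(released_weights)
golden_cases = [([1.0, 2.0, 3.0], 2.25), ([0.0, 4.0, 0.0], -1.0)]

# Later, in the field: the parameters have drifted or been altered
field_weights = [0.5, -0.25, 0.80]

print(self_test(make_model(released_weights), golden_cases))  # True
print(self_test(make_model(field_weights), golden_cases))     # False
print(weights_digest(field_weights) == released_digest)       # False
```

The digest catches any change to the parameters at all, while the reference cases catch changes in behavior. A system that is meant to keep learning after it ships would fail the first check by design, so the second is the one that tracks whether it has moved somewhere dangerous or non-functional.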

Put another way, fixing this problem has a huge upside for companies that can automate it, and the EDA industry is well aware of the opportunity not only for using AI/ML/DL internally, but also for developing tools that can tighten up the development and security of the algorithms. But until then, it’s not clear how secure the existing algorithms are or exactly how to fix them. And at least for the foreseeable future, that may or may not be a big problem.
