Machine Learning’s Limits (Part 1)

Experts at the Table, part 1: Why machine learning works in some cases and not in others.


Semiconductor Engineering sat down with Rob Aitken, an Arm fellow; Raik Brinkmann, CEO of OneSpin Solutions; Patrick Soheili, vice president of business and corporate development at eSilicon; and Chris Rowen, CEO of Babblelabs. What follows are excerpts of that conversation.

SE: Where are we with machine learning? What problems still have to be resolved?

Aitken: We’re in a state where things are changing so rapidly that it’s really hard to keep up with where we are at any given instant. We’ve seen that machine learning has been able to take some of the things we used to think were very complicated and render them simple to do. But simple can be deceiving. It’s not just a case of, ‘I’ve downloaded TensorFlow and magically it worked for me, and now all the problems I used to have 100 people do are much simpler.’ The problems move to a different space. For example, we took a look at what it would take to do machine learning for verification test generation for processors. What we found is that machine learning is very good at picking from a set of random test programs the ones that are more likely to be useful test vectors than others. That rendered a complicated task simpler, but it moved the problem to a new space. How do you convert test data into something that a machine learning algorithm can optimize? And then, how do you take what it told you and bring it back to the realm of processor testing? So we find a lot of moving the problem around, in addition to clever solutions for problems that we had trouble with before.

Rowen: Machine learning and deep learning are giving us some powerful tools in what, for most of us, is an entirely new area of computing. It’s statistical computing, with the creation of very complex models from relatively unstructured data. There are a bunch of problems that historically have been difficult or esoteric, or really hard to get a handle on, which now we can systematically do better than we were ever able to do in the past. That’s particularly true when it’s in the form of, ‘Here’s some phenomenon we’re trying to understand and we’re trying to reproduce it in some fashion. We need an approximate model for that.’ There are a whole bunch of problems that fall into that domain. This new computing paradigm gives algorithm designers and software developers a new hammer, and it turns out it’s a pretty big hammer with a wide variety of nails, as well as some screws and bolts. But it is not universal. There are lots of kinds of problems that are not statistical in nature, where you’re not trying to reproduce the statistical distribution you found in some source data, and where other methods from AI or other classical methods should be used. It is fraught with issues that have to do with a lack of understanding of statistics. People don’t entirely understand bias in their data, which results in bias in their model. They don’t understand the fragility of the model. You generally can’t expect it to do anything reasonable outside of the strict confines of the statistical distribution on which it was trained. And often, today, these models fail the reasonableness test. You expect them to know things they can’t possibly know because they weren’t trained for that. One of the big challenges for all of this is to not only train the deep learning models for the behaviors that you want, but to also view some of the reasonableness principles. If you’re dealing with visual systems, you want them to know about gravity and the 3D nature of objects. Those tend to be outside the range of what we can do with these models today.

Brinkmann: I agree. It’s a powerful tool in the engineering toolbox of any company that does scientific or technical work. The best applications of this are when you control the space you’re using it in, and when a human is still in the loop to tell the difference between when it behaves reasonably and when it does not. So whenever you optimize your business processes or your test vectors, or something you don’t understand the nature of, it may be a good use of machine learning. But when people have certain expectations, and it hits data it may not have seen before, it starts to fail. That’s where the trouble starts. People believe that machine learning is a universal solution. That’s not the case. We need to make sure people understand the limits of these technologies, while also removing the fear that this technology will take over their jobs. That’s not going to happen for a long time. If you want to use machine learning in applications that are related to safety, like automotive, one key component that’s missing is these systems do not explain themselves. There is no reasoning that you can derive from a network that has been trained about why it does what it does, or when does it fail. There is lots of research going on right now in this area to make these systems more robust and to find a way to verify them. But it has to come with a good understanding of the statistical nature of what you’re dealing with. Applying machine learning is not easy. You need a lot more than a deep learning algorithm. There are other ideas around vision learning and new technologies that make it easier to explain how these things work. This is one of the biggest differences with classical engineering, where you always had an engineer in the loop to explain why something works.

Rowen: They were often wrong, but you could ask them.

Brinkmann: Yes, but you could ask them and challenge them. There’s no way to ask a neural network to explain something. It will not tell you.

Aitken: There’s a theoretical piece to that, as well. The practice of machine learning is becoming increasingly well understood. The theory behind why it works is lagging.

Soheili: Still, putting aside both security and privacy issues, this is an incredible opportunity. If you’re running a data center, you see total cost of ownership reduction that is monumental. There are new products coming out. And just as you can’t imagine our lives today without a GPS or a cell phone, 10 years from today we won’t be able to imagine our lives without help in making daily decisions. It will be an integral part of everybody’s daily life. All of the companies that are involved, from big data centers going after this full force for differentiation, better product delivery and cost reduction, all the way to new applications that will enhance our lives, to the semiconductor folks who will be gaining a software or silicon piece, or both—we all have something to gain by staying on top of this and contributing to the rapid rate of innovation. There’s a lot of ambiguity around where this is going and how fast it’s going to get there, but we’ll figure that out over time. There are so many academics or large corporations with big budgets putting emphasis on solving issues in machine learning. There certainly is a lot of hype, but the hype is there for a good reason. The forecasts call for $60 billion to $100 billion in extra semiconductor sales in 5 or 10 years. Nothing else I know of today will have that kind of an incremental, powerful impact on the daily lives of everyone in the semiconductor business.

SE: And it’s a horizontal technology, too, right? It affects multiple vertical markets.

Soheili: Yes, it’s very horizontal. It has legs in just about everything.

Rowen: Going back to an earlier part of this discussion, why does a neural network make a decision? More than in other kinds of programs, you can do sensitivity analysis. If there are 4,000 inputs, if I change each one, how would it change the outcome. You can work backward to figure out what is the minimum change in your input that will change the output.
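The sensitivity analysis Rowen describes can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_model` is a hypothetical four-input logistic scorer standing in for a trained network, and the finite-difference approach is just one simple way to estimate how much each input moves the output.

```python
import math


def toy_model(inputs):
    """Hypothetical stand-in for a trained network: a fixed logistic scorer."""
    weights = [0.8, -1.5, 0.05, 2.0]  # illustrative values, not from any real model
    bias = -0.3
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))


def sensitivities(model, inputs, eps=1e-4):
    """Estimate d(output)/d(input_i) for every input by central differences."""
    result = []
    for i in range(len(inputs)):
        up = list(inputs)
        down = list(inputs)
        up[i] += eps
        down[i] -= eps
        result.append((model(up) - model(down)) / (2 * eps))
    return result


def most_influential(model, inputs):
    """Index of the input whose small change moves the output the most."""
    s = sensitivities(model, inputs)
    return max(range(len(s)), key=lambda i: abs(s[i]))


sample = [0.2, 0.1, 3.0, 0.4]
for index, slope in enumerate(sensitivities(toy_model, sample)):
    print(f"input {index}: sensitivity {slope:+.4f}")
print("most influential input:", most_influential(toy_model, sample))
```

A ranking like this shows which inputs drive a particular decision. It does not say whether the decision is sensible, which is the objection raised next.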

Aitken: But you still have a problem there. You can say that you proved through sensitivity analysis that the reason you didn’t get a loan was because 30 years ago you lived on Elm Street instead of Maple Street. That piece of information explains the decision, but it doesn’t explain why that decision doesn’t make sense.

Rowen: On the surface, knowing the effect of the inputs on the outputs is important. It’s not a black box. There is some inherent transparency to it.

Brinkmann: Neural networks are so accessible that you know every little detail of every node. It’s completely transparent. But you’re still not able to say how it came to a certain conclusion. You can’t probe it.

Rowen: Yes, so you have a hard time generalizing about what it does in some abstract sense. But that’s true of lots of kinds of software. Take any sufficiently large body of software, and how you decide that something happened before is a function of so many different concurrent pieces that it’s difficult to explain.

Brinkmann: But if you ask a mechanical engineer why an axle in a car isn’t going to break, they can refer to a physical analysis and all these simulations and show the axle was designed for this specific purpose. With machine learning, this is becoming very difficult.

Aitken: A good example is machine learning and robotics. Classic control theory of moving robots has been around for a long time. Machine learning actually can do pretty well at solving problems, adapting to them, but the safety issue is a key one. With a drone, if it crashes, it made this decision for this reason. Maybe if you change these values in this table it will work better. If it’s running off a machine learning program, it’s not clear what you could do differently that would keep it from crashing next time. You can change it so it wouldn’t crash in that exact circumstance again, but keeping it from crashing in similar circumstances is harder because you can’t just explain that decision. You have to generalize where that decision came from and how you might account for it.

SE: So how do we debug this? What’s the starting point?

Rowen: The starting point is just to understand that we’re attempting to solve very, very complex problems. Some of the challenge with deep learning comes from the fact that we have high expectations. If someone could write by hand an object recognizer that recognizes 1,000 different objects, it would take a long time to weight those modules. Someone would have to construct some weighting factor mechanism. Their ability to explain the 1 million lines of code that would result from all of those decisions still leaves you a very complex thing, and that’s only partly because of the nature of machine learning and deep learning. It’s partly the absolute level of complexity of the problems that we’re dealing with. It’s somewhat unfair.

Brinkmann: It depends on the context of the technology. When it comes to debugging, what you really need to do is look at the data that you used to train it. Analyzing the data set is just the key to debugging in this case. You need to analyze bias and statistical distributions. You can formulate sensible expectations from what a network should and can do from this data and what it cannot do.
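One very simple form of the data set analysis Brinkmann mentions is checking how training labels are distributed. The sketch below is only an illustration under stated assumptions: the label names, the counts, and the 10% `min_share` threshold are all hypothetical.

```python
from collections import Counter


def label_shares(labels):
    """Fraction of the training set carried by each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def underrepresented(labels, min_share=0.10):
    """Labels whose share falls below min_share (a hypothetical threshold)."""
    shares = label_shares(labels)
    return sorted(label for label, share in shares.items() if share < min_share)


# Hypothetical training labels for a vision system.
labels = ["car"] * 80 + ["truck"] * 15 + ["pedestrian"] * 5

print(label_shares(labels))
print("underrepresented:", underrepresented(labels))
```

A class that makes up only a small share of the training data is one place where it would be unreasonable to expect the network to perform well.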

Aitken: You will wind up running different kinds of networks to keep track of more intermediate results and have more explanatory power. But also, we’re going to have to change our methods because our engineering intuition is built around old ways of doing things and old algorithms. As these new tools start being used in more and more circumstances, the explanation is going to be different. It’s no longer, ‘This particular function you’ve been involved with ends up looking like that, and that’s the way it always is and it’s what we should expect.’ Right now, very few people have the intuition to do inverse matrix multiplies in their head.

Rowen: One of the fundamental challenges is that algorithm designers, programmers and hardware developers make thousands of implicit assumptions of what the problem really looks like. They rarely have the opportunity to review those assumptions once the system has been built. One of the lessons that comes into focus is that you may build a system for the first time and even deploy it, but you’d better be measuring the distribution of data the system is seeing in the field, and actively comparing that to the distribution of data it was trained for. Is it reasonable this system will be able to handle the real-world data?
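Comparing field data against training data, as Rowen suggests, can be sketched for a single numeric feature. The example below uses the two-sample Kolmogorov-Smirnov statistic, which is one common choice and not something the panel names. The synthetic Gaussian data and the 0.2 alert threshold are assumptions for illustration.

```python
import bisect
import random


def ks_statistic(train, field):
    """Largest gap between the empirical CDFs of two samples (0 = identical, 1 = disjoint)."""
    a = sorted(train)
    b = sorted(field)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def has_drifted(train, field, threshold=0.2):
    """Flag drift when the gap exceeds a hypothetical alert threshold."""
    return ks_statistic(train, field) > threshold


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
field_same = [rng.gauss(0.0, 1.0) for _ in range(500)]
field_shifted = [rng.gauss(1.5, 1.0) for _ in range(500)]

print("same distribution:", round(ks_statistic(train, field_same), 3))
print("shifted distribution:", round(ks_statistic(train, field_shifted), 3))
```

In a deployed system, a check of this kind would run repeatedly on data collected in the field, with a threshold chosen for the application.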

Aitken: Or can it be simultaneously trained and updated?

Rowen: At a minimum, you need to know whether your assumptions are fulfilled.

Brinkmann: One component we will see, that hasn’t been there before, is that debugging and verification will be continuous throughout the lifetime of a system. People will put it into different contexts. In two or three years, you may see a different distribution of data coming at your device that you’d better get a feeling for, so you can tell people this is not safe or that this is something we need to upgrade. The whole lifetime of the system will be about getting data from the field back to the factory and matching that up with whatever you have trained it on, so you can prove it or make sure it’s not getting out of the scope of what it’s supposed to do.

Soheili: Every level of AI will be consistent with the commercial opportunity it provides. Before these very complex problems are solved, like a drone that relies completely on an AI system, or autonomous cars driving down the road and avoiding people, lots of easier problems will be solved. Debugging tools will go along with that. Debugging tools and the complexity of what’s being deployed will evolve at the same rate. That will enable a drone to fly by itself without any supervision. Baby steps in this progression are essential. You can think about debugging as a very complex issue that we may not be able to get our arms around, but for much simpler problems it’s okay to make mistakes and learn from those mistakes to make the debugging tools more sophisticated or more relevant.

Rowen: There’s probably a reasonable hierarchy of problems, ranging from, ‘If the system gets it wrong, it inconveniences somebody,’ up to, ‘If the system gets it wrong, someone loses money or even their life.’ These are going to happen in different places and in different industries, but there are lots of places in user interfaces where it is an aid to communication rather than a substitute for a human in a safety-critical issue. This expectation of reasonableness is much milder.

Fig. 1: The complicated world of machine learning algorithms. Source: UCLA/Feng Shi

