IBM discusses AI, neural nets and quantum computing.
At the recent IEDM conference, Jeff Welser, vice president and lab director at IBM Research Almaden, sat down to discuss artificial intelligence, machine learning, quantum computing and supercomputing with Semiconductor Engineering. Here are excerpts of that conversation.
SE: Where is high-end computing going?
Welser: We are seeing lots of different systems start to emerge. First of all, HPC (high-performance computing), the classical pushing of performance for computation, is going to continue. Petascale systems are already here, and we’re looking toward exascale. We are really trying to see how you can continue to build large von Neumann architectures for doing really big, fast problems. These systems also do things like transaction processing. We will continue to push that. But the normal scaling we’ve been doing under Moore’s Law has become more and more challenging, and these other paradigms have emerged.
SE: What else is happening?
Welser: We are seeing an increasing workload now in what we call artificial intelligence or AI, which is largely around neural nets or machine learning sorts of technologies. Those are very different from what these HPC systems are set up to handle. They don’t necessarily need high-precision, extremely fast floating-point calculations. What they do need is the ability to analyze a lot of data and then run it through many cycles to train neural nets, for example. Then, when you get the results out, you don’t expect an exact figure. What you expect is to recognize a car moving down a street, or to read text the way the brain does. That’s a different kind of problem. You can do it on normal computers. That’s been done. But when GPUs came along, they turned out to be much better at doing the calculations you need for neural nets.
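To make the precision point concrete, here is a minimal NumPy sketch (an editorial illustration with made-up weights, not an IBM model): dropping inference to float16 perturbs the raw scores slightly, but the predicted class typically comes out the same.

```python
# Minimal sketch: neural-net inference tolerates reduced precision because
# only the final classification matters, not exact floating-point values.
# (Illustrative random weights -- not a real trained model.)
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)            # hypothetical input features
W1 = rng.standard_normal((32, 64))     # hypothetical "trained" weights
W2 = rng.standard_normal((10, 32))

def forward(x, W1, W2, dtype):
    """Two-layer forward pass computed at the given precision."""
    h = np.maximum(W1.astype(dtype) @ x.astype(dtype), 0)  # ReLU layer
    return W2.astype(dtype) @ h                            # class scores

scores64 = forward(x, W1, W2, np.float64)
scores16 = forward(x, W1, W2, np.float16)

# Raw scores differ slightly, but the argmax (the "answer") typically agrees.
print(np.argmax(scores64), np.argmax(scores16))
```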
SE: In a neural network, a system crunches the data and identifies patterns. Over time, it learns which of those attributes are important. This isn’t new, right?
Welser: Neural nets are not new. They’ve been around since at least the 1970s and 1980s. The industry really didn’t have much success with them until recently. That has come from two things. One is the sheer prevalence of data we now have in digital form that is labeled. You have enough data to train a neural net now. And two, we are able to build a large enough neural net now. That’s really a function of having enough compute power. Specifically, the GPU architecture turned out to be quite useful for running really large neural nets.
SE: What is IBM doing here?
Welser: The POWER9 system is coming out from IBM. It is set up for HPC. But from the bottom up, it’s also built to run AI workloads. So it has a mixture of GPUs and our POWER processors, in huge numbers; tens of thousands of these are built into a large system. People who want large HPC systems also see a need to mix in large GPU systems to run AI workloads in concert with what they are doing in the more traditional space.
SE: Let’s talk about quantum computing. In classical computing, information is stored in bits, which can be either a “0” or a “1”. In quantum computing, information is stored in quantum bits, or qubits, which can exist as a “0”, a “1”, or a superposition of both. Superposition lets a quantum computer, in effect, explore many computational states at once. IBM is developing quantum computers, right?
Welser: We have a 16-qubit system now on the cloud that anyone can go out and use. We’ve announced that we will have 20 qubits commercially next year, and 50 qubits coming a year or so after that. It’s really for very specific problems. Most people, when they hear about it, think about cryptography, or that somehow we are going to break all encryption. Certainly, if we had millions of qubits, that could be something you could do. But long before we get there, we’re finding you can use these systems for things like simulating quantum chemistry or materials processes. Basically, you are simulating things that are fundamentally quantum themselves. It looks like you can do that with a fairly modest number of qubits. Qubits are fairly error-prone, however. It’s a challenge to keep qubits coherent.
Figure 1. A) Schematic of the 20-qubit system, and B) the 50-qubit system, illustrating qubit interconnectivity. This complex interconnect fabric permits maximum flexibility for IBM Q systems. The 50-qubit system is the natural extension of the 20-qubit architecture. C) Photograph of the quantum processor package for the first IBM Q systems. The processor features improvements in superconducting qubit design, connectivity and packaging. (Credit: IBM)
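To make the superposition idea from the question concrete, here is a minimal single-qubit simulation in NumPy (an editorial illustration, not IBM’s quantum software): a Hadamard gate puts |0⟩ into an equal superposition, and the measurement probabilities follow from the squared amplitudes.

```python
# Minimal single-qubit sketch: a Hadamard gate turns |0> into an equal
# superposition; measurement probabilities are the squared amplitudes.
import numpy as np

ket0 = np.array([1.0, 0.0])                    # the |0> basis state
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

psi = H @ ket0                                 # superposition of |0> and |1>
probs = np.abs(psi) ** 2                       # Born rule

print(psi)    # [0.7071 0.7071]
print(probs)  # [0.5 0.5] -- measuring yields 0 or 1 with equal probability
```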
SE: What is Watson?
Welser: Watson is not a single machine. It’s a collection of algorithms and processes. Many of them run on traditional hardware. Some use deep learning. Based on the problem it’s trying to solve, we’ll use a mixture of algorithms and hardware.
SE: Let’s go back to traditional supercomputing. Is that still important and what are the challenges?
Welser: Continuing to push the HPC supercomputer is critical. There are still a lot of problems we can’t solve that require that kind of computation. Predicting weather is one example. It’s an incredibly huge problem. You tend to put a lot of processors in these systems. But unless all of those processors can communicate efficiently, you’ll find the machine really doesn’t go much faster than one processor at a time moving data around.
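That communication point is essentially Amdahl’s law (my framing, not a term Welser used): any fraction of the work that stays serialized, for example time spent waiting on communication, caps the speedup no matter how many processors you add. A quick sketch:

```python
# Amdahl's law: speedup on p processors is capped by the serial fraction s
# (e.g., time spent waiting on communication rather than computing).
def amdahl_speedup(p, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

for p in (10, 100, 10_000):
    print(p, round(amdahl_speedup(p, serial_fraction=0.05), 1))
# 10 -> 6.9x, 100 -> 16.8x, 10000 -> 20.0x: with just 5% serialized work,
# ten thousand processors buy only a 20x speedup.
```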
SE: Any other challenges?
Welser: It’s really about getting a balance between the compute power, the memory capability, and moving the data back and forth to the compute function, so that you can keep the computer running at full speed.
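One standard way to reason about that balance is arithmetic intensity, the flops performed per byte of data moved, as in roofline analysis (my framing; the hardware numbers below are illustrative, not any specific IBM system):

```python
# Roofline-style sketch: a kernel's attainable compute rate is capped by
# either peak compute or memory bandwidth times its arithmetic intensity.
# (Hypothetical hardware numbers, not a specific IBM machine.)
peak_flops = 10e12   # 10 TFLOP/s of compute
mem_bw = 500e9       # 500 GB/s of memory bandwidth

def attainable_flops(intensity_flops_per_byte):
    return min(peak_flops, mem_bw * intensity_flops_per_byte)

for intensity in (1, 4, 50):
    pct = 100 * attainable_flops(intensity) / peak_flops
    print(f"{intensity:>3} flops/byte -> {pct:.0f}% of peak")
# 1 flop/byte -> 5% of peak; 4 -> 20%; 50 -> 100%. Below about 20
# flops/byte, this hypothetical machine is starved by data movement.
```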
SE: Where does machine learning come into play?
Welser: Where neural nets, and what we call AI, are most useful is with unstructured data, whether it’s images, text, sensor data or audio data. These are things that are not already easy to put in a database.
SE: Where do you use it?
Welser: You can think of many applications. If you want to do image recognition for self-driving cars, you need something that can immediately figure out what’s around you and identify it. You may be using it for traffic. You have traffic cameras around, and you are trying to figure out how traffic is moving. So it’s any place where you are trying to identify patterns and objects. It’s very good for languages, too.
SE: There is no consensus in terms of architectures for these systems. Some are using ASICs, while others use FPGAs, GPUs and MPUs, right?
Welser: We are all experimenting with this. The GPU structure itself has turned out to be extremely useful. You see people putting out the TPU or building their own custom circuits in FPGAs. We are all trying to find the best and most efficient structures for this. One thing to differentiate here is whether you are trying to train the system. Do you need a system where you are really going to send in a lot of data and update the weights? Or do you have a fully trained neural net, and you just want to run inference, that is, just identify things?
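That training/inference split maps directly onto how modern frameworks are used. A minimal PyTorch sketch (the framework, toy model and random data are my choices for illustration, not anything IBM-specific):

```python
# Training vs. inference in miniature: training streams labeled data through
# the model and updates the weights; inference freezes the weights and just
# identifies things.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Training step: lots of labeled data in, weight updates out.
x = torch.randn(32, 8)              # a toy batch of inputs
y = torch.randint(0, 2, (32,))      # toy labels
loss = loss_fn(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()                          # the weights change here

# Inference: no gradients, no updates -- just run and identify.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 8)).argmax(dim=1)
```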
SE: Not long ago, IBM rolled out TrueNorth, a neuromorphic chip. A single TrueNorth processor consists of 5.4 billion transistors wired together to create an array of 1 million digital neurons that communicate with one another via 256 million electrical synapses. What does TrueNorth do?
Welser: It is set up just to do inferencing. The idea is that we wanted to go extremely low power and fast, and we wanted to do real-time video on a phone or something mobile. So we got rid of the stuff you would need for training, such as a lot of high-precision floating point. It’s just a spiking neural net. That’s great if you want to go embedded, out on the edge. For training, we would use our normal GPU training systems, and then you would download the result onto the chip to do the inferencing.
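For a sense of what “spiking” means here, the textbook model is a leaky integrate-and-fire neuron (a generic illustration, not TrueNorth’s actual circuit design):

```python
# Generic leaky integrate-and-fire neuron -- the textbook spiking model.
# (Illustrates "spiking" in general; not TrueNorth's actual design.)
def lif_neuron(input_currents, leak=0.9, threshold=1.0):
    """Leak and integrate input each step; fire a spike and reset at threshold."""
    v, spikes = 0.0, []
    for current in input_currents:
        v = leak * v + current        # membrane potential decays, then integrates
        if v >= threshold:
            spikes.append(1)          # emit a spike...
            v = 0.0                   # ...and reset the membrane potential
        else:
            spikes.append(0)
    return spikes

print(lif_neuron([0.3, 0.4, 0.5, 0.1, 0.8]))  # -> [0, 0, 1, 0, 0]
```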
SE: Neuromorphic chips are fast pattern-matching engines that process the data in the memory. The first neuromorphic-class chips are based on SRAMs, which have limited programming capabilities. The next wave of neuromorphic chips is moving towards phase-change and ReRAM. Any thoughts here?
Welser: TrueNorth was all-digital CMOS. It was emulating spiking. A lot of the memories you are talking about are basically trying to provide an analog equivalent. So that’s what a lot of people are looking at right now and asking: What is a good memory for doing that? The feeling is that this could be more efficient than a digital implementation, at least for the neuron connections.
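The usual picture behind that analog approach is a resistive crossbar: weights stored as device conductances, inputs applied as voltages, and Kirchhoff’s current law summing the column currents into a multiply-accumulate. A simplified numerical sketch (idealized math, not a device model):

```python
# Idealized analog crossbar: conductances G hold the synaptic weights,
# input voltages V carry the activations, and the currents I = G @ V
# perform the multiply-accumulate in one analog step.
# (Pure math -- ignores noise, drift, and real device physics.)
import numpy as np

G = np.array([[0.2, 0.8, 0.1],    # conductances (weights), one row per output
              [0.5, 0.3, 0.9]])
V = np.array([1.0, 0.5, 0.2])     # input voltages (activations)

I = G @ V                          # Ohm's law + Kirchhoff's current law
print(I)                           # [0.62 0.83] -- the weighted sums
```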
SE: Where is AI going?
Welser: The specialized AI, or AI for a specific task, is really working now. What we are really seeing within the industry is people rapidly figuring out how to apply it in different industry domains. For example, we can use it as a radiology assistant. How can you use it to help a radiologist look at MRI scans? Or how can you use it to help in an IoT space? You have all of this sensor data coming in from an industrial factory. How do you use that to make predictions and run the factory better? So we are really at the stage of figuring out how to exploit the domains where it makes sense. At the same time, we are looking at ways to advance it.