Experts at the Table, Part 1: Will the separation of hardware and software for AI cause problems, and how will hardware platforms for AI influence algorithm development?
Semiconductor Engineering sat down to discuss the role that EDA has in automating artificial intelligence and machine learning with Doug Letcher, president and CEO of Metrics; Daniel Hansson, CEO of Verifyter; Harry Foster, chief scientist verification for Mentor, a Siemens Business; Larry Melling, product management director for Cadence; Manish Pandey, Synopsys fellow; and Raik Brinkmann, CEO of OneSpin Solutions. What follows are excerpts of that conversation.
SE: The EDA industry has struggled with the separation of hardware and software. While it is now trying to bring certain aspects of software back into the fold, the industry still struggles with system-level problems that are combinations of hardware and software. With machine learning and AI, the industry appears to be making the same choice, allowing hardware and software to separate, and potentially repeating the same mistake.
Foster: I don’t think the separation of hardware and software was a mistake, and I don’t think that is the biggest issue right now. We are moving from Turing architectures to statistical architectures. That is the real challenge. It is not hardware/software separation. Consider high-end servers. There is a clear separation, with very good concurrent development processes, and it is done this way to facilitate verification and development. For example, there are APIs between the hardware and software worlds so development can progress independently and concurrently. I do see the point in terms of optimization, but I don’t think that is the biggest issue.
Pandey: The separation between hardware and software has always existed. Much as we like to do hardware/software co-design, in reality, if you want to build a high-end transaction system, such as a database system, it does not make sense to start verifying whole end-to-end stacks. We do not have the capability to do that. We have a separation of abstraction: we verify the ISA, maybe verify the compilers, but verifying that a transaction goes through the database—the ISA is too low a level of abstraction to be verifying that. Take the same example with machine-learning applications and look at the quintessential application—self-driving cars. You have the sensors, you do sensor fusion, you try to recognize objects using CNNs. That level of verification—how do you even capture it? The best that we can do today, and it is a pretty reasonable separation, is to verify the neural-network processor and make sure the individual operations work. The behavior of the system as an aggregate is a very different thing—we can’t even characterize it well. That is an ongoing topic of research. How do you even explain what is happening? If you can’t explain it, there is no way you can write an assertion to verify it. It would be nice to have a technique, but it does not exist today. You cannot verify an AI system end-to-end.
Foster: You stated that we have problems even describing it—a deterministic system means that you put A in and you expect Y. That is not true in an artificial neural network. You put A in, you have a probability of getting Y. Is that a bug in the design or is it outside of the probability?
Pandey: Actually, these systems are deterministic. Given an example or an image, if the system is operating correctly, it will always provide the same answer. But you are right that the answer is based on a correctness probability of it being a cat or a pedestrian crossing the road, and it might be a 99% probability—but the hardware is deterministic. It is just that the results may not be right.
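To make Pandey's point concrete, here is a minimal NumPy sketch (the tiny classifier, its weights, and the input are all hypothetical stand-ins). The forward pass is completely deterministic, in that the same input always yields the same output, yet what it produces is only a confidence, which may still be wrong.

```python
import numpy as np

# Hypothetical, fixed weights for a tiny two-class classifier ("cat" vs. "pedestrian").
rng = np.random.default_rng(seed=0)       # fixed seed, so the weights never change
W = rng.standard_normal((2, 8))
b = np.zeros(2)

def classify(x):
    """Deterministic forward pass: the same input always yields the same probabilities."""
    logits = W @ x + b
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()

x = rng.standard_normal(8)                 # stand-in for an image feature vector
p1 = classify(x)
p2 = classify(x)
assert np.array_equal(p1, p2)              # the hardware/software stack is deterministic...
print(p1)                                  # ...but the answer is only a probability, and it may be wrong
```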
Foster: When you get down to a simple neural network, it is just an array of MACs. Those are easy to verify, and we have been doing that for a long time.
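As a toy illustration of why those building blocks are tractable, here is a sketch that checks a behavioral model of an int8 multiply-accumulate array against a NumPy golden reference using random stimulus. The bit widths and vector length are assumptions, not taken from any particular design.

```python
import numpy as np

def mac_array(a, w, acc_bits=32):
    """Behavioral model of an int8 MAC array: dot product with a saturating accumulator."""
    acc = 0
    lo, hi = -(2 ** (acc_bits - 1)), 2 ** (acc_bits - 1) - 1
    for ai, wi in zip(a, w):
        acc = min(max(acc + int(ai) * int(wi), lo), hi)    # saturate on overflow
    return acc

rng = np.random.default_rng(1)
for _ in range(10_000):                                     # constrained-random stimulus
    a = rng.integers(-128, 128, size=64)
    w = rng.integers(-128, 128, size=64)
    golden = int(np.dot(a.astype(np.int64), w.astype(np.int64)))
    assert mac_array(a, w) == golden
print("MAC-array model matches the golden reference on all random vectors")
```

With these vector lengths and bit widths the accumulator never saturates, so the plain dot product is a valid reference; a real testbench would also target the saturation corner deliberately.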
Hansson: The key with AI is that it is statistics when we talk about the system-level results. You need a deterministic framework to verify the outcome, because sometimes it will do something smart and at other times it will do something stupid. You need the deterministic framework to know whether you should make a U-turn while driving at high speed. You can verify the framework, but you cannot verify every individual outcome: you can only expect the results to be within certain ranges.
Pandey: These systems are deterministic; it is the real-world situations that are different all the time. If your car is deciding whether it should do a lane change, then given the exact same set of inputs, the hardware and software will do exactly the same thing, except there are so many variations.
Hansson: You make a deterministic piece of hardware, but within the scenario there may be input stimulus that has never been seen and as a result you may get random output.
Letcher: You are verifying a general-purpose processor. The training that is done on that is a separate thing. Until it has been trained you cannot verify what it will do in an application space. Just like any processor, the state space to verify all applications is huge. So the question is, ‘Do you verify it for a specific application after training or do you verify it as a processor that can be trained to do anything?’
Brinkmann: There are two drivers for the two cycles. One is the software side that has the algorithm as a driver. Algorithmic innovation is progressing faster than Moore’s Law by a large margin on the machine learning side. So advancements in algorithms are more relevant at the moment than the silicon. The other side is the economics. The hardware platform is something that you do not want to change for each application individually. You need to reuse that, or you will not get enough leverage from the software. If you have a platform that is versatile enough, such that you can map different algorithms to it, then that is what people are trying to build today with programmable logic, processors, accelerators, etc. You want multiple versions of algorithms mapped to that, and leverage the innovation in software. When you look at verification and the separation—you have to verify the platform and that has multiple aspects. Is the fabric of the FPGA properly built? Is the processor doing its job? You may not have the application in mind when you do that. You verify the characteristics of the platform so that you can provide some guarantees, perhaps just statistical in the future if you think about analog blocks being used for machine learning. But still, they are guarantees about the platform. Second, you need to verify the algorithm and the application. Now you verify if you get a cat or a pedestrian. Third, you verify if the mapping to the platform works. If it goes onto an FPGA, is the generated RTL mapped properly to the FPGA? These are different lifecycles that are interlocked. The verification on the algorithmic side will be repeated for multiple versions of the algorithm based on new data from the field, and you have to repeat the mapping process and the verification while the platform remains unchanged. That is why the platform has to be robust. So the separation is actually getting stronger.
Melling: One of the great comparisons for AI is what happened with GPUs. If you look at it, the GPU had the same problem. There was no absolute answer. It was up to the visual judgment of what was being represented on the screen as to whether it was smooth enough, the right shading, and so on. AI has the same problem. It is the microcode put on the hardware that produces the system we then ask, ‘Is it giving me the right experience? Am I getting the expected answers?’ With a GPU you had a golden eye—the people who were the aesthetic judges. We do not have that with AI because the algorithms are so diverse. We may have to borrow some verification technology and move it into the software world, where the algorithms are being developed, and work out how to randomly expand the test sets or training sets in order to get higher-quality algorithms. In the end, it will be statistics that judge the outcome and whether we have achieved the desired results.
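A minimal sketch of what "randomly expanding a test set" could look like for an image classifier, using only NumPy; the augmentations, image sizes, and labels are hypothetical, and a real flow would score the model over the expanded set and judge the statistics of the results.

```python
import numpy as np

rng = np.random.default_rng(2)

def augment(image):
    """Label-preserving random perturbations: horizontal flip, small shift, mild noise."""
    out = image.copy()
    if rng.random() < 0.5:
        out = np.fliplr(out)
    out = np.roll(out, rng.integers(-2, 3), axis=1)
    out = np.clip(out + rng.normal(0.0, 0.02, out.shape), 0.0, 1.0)
    return out

def expand_test_set(images, labels, copies=10):
    """Turn each labeled example into `copies` randomized variants with the same label."""
    return [(augment(img), lab) for img, lab in zip(images, labels) for _ in range(copies)]

# Hypothetical 8x8 grayscale "images" standing in for a real labeled test set.
images = [rng.random((8, 8)) for _ in range(100)]
labels = rng.integers(0, 10, size=100)
expanded = expand_test_set(images, labels)
print(f"{len(images)} originals expanded to {len(expanded)} test cases")
```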
Hansson: It is not that different from random testing, really. Random tests are a subset of an enormous space. Machine learning can also be validated as a subset of an enormous space.
Melling: We have all seen the example of someone putting two pieces of tape on a stop sign and it is no longer recognized.
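The stop-sign example is an adversarial perturbation. Here is a toy sketch of the effect on a purely hypothetical linear detector rather than a real network: a perturbation bounded per pixel, chosen against the model's gradient, flips the decision even though the input barely changes.

```python
import numpy as np

# Toy linear "stop sign" detector: score > 0 means "stop sign". The weights are hypothetical.
rng = np.random.default_rng(3)
w = rng.standard_normal(64)
x = 0.2 * w / np.linalg.norm(w)          # an input the detector classifies correctly

score_clean = w @ x

# FGSM-style perturbation: step each "pixel" by at most eps against the gradient sign
# (for a linear model the gradient is just w itself).
eps = 0.05
x_adv = x - eps * np.sign(w)
score_adv = w @ x_adv

print(f"clean score {score_clean:+.3f}, adversarial score {score_adv:+.3f}")
# In this toy setup the sign of the score flips: a tiny, structured change,
# much like a couple of pieces of tape on a physical sign, defeats the detector.
```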
Brinkmann: There are many new verification challenges for machine learning, such as characterizing the data set, looking for outliers, looking for bias. Being statistical is one thing, but understanding the space well enough to make some good predictions about how well it will perform in uncertain situations is a big challenge, and that is where we need a new type of verification.
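A minimal sketch of two such dataset checks, class balance and simple outlier screening, on hypothetical data; real dataset characterization would go much further, but the flavor is a statistical check on the data rather than on the design.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(4)

# Hypothetical labeled dataset with a deliberately skewed class mix.
features = rng.normal(0.0, 1.0, size=(1000, 16))
labels = rng.choice(["car", "pedestrian", "cyclist"], size=1000, p=[0.80, 0.15, 0.05])

# Bias check: is any class badly under-represented?
counts = Counter(labels)
for cls, n in sorted(counts.items()):
    share = n / len(labels)
    flag = "  <-- under-represented" if share < 0.10 else ""
    print(f"{cls:10s} {share:5.1%}{flag}")

# Outlier check: flag samples with any feature far from the mean (simple z-score test).
mu, sigma = features.mean(axis=0), features.std(axis=0)
z = np.abs((features - mu) / sigma)
outliers = np.where((z > 4.0).any(axis=1))[0]
print(f"{len(outliers)} samples have a feature more than 4 sigma from the mean")
```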
Foster: And we do not have good metrics in that area. The whole notion of coverage is being debated today both from an application and algorithmic point of view.
SE: In the hardware/software world, software quickly became king. Hardware could not change in any way that breaks software. As a result, we have been stuck with existing ISAs purely because of existing software. With AI, we started by using existing ISAs, but they did not have enough compute power, so we migrated to GPUs. The algorithms that are being developed for AI are now being heavily influenced by GPUs. However, we also realize that GPUs do not have enough performance and consume too much power. Algorithms appear to be saying we do not have the right platforms for either inferencing or training. This looks like an unstable situation. Who will make the breakthrough—hardware or software?
Brinkmann: Would you disagree if I said the new ISA is actually Caffe and TensorFlow? The abstraction has been virtualized. You are not looking at a processor abstraction anymore. When talking about machine learning, you are talking about mapping a Caffe or TensorFlow algorithm to the hardware that is available. It doesn’t care about the ISA, or whether it is a GPU or FPGA. You have abstracted away from that. But you have to agree on something. Otherwise, you will not manage to use the same software on different platforms. The interface becomes virtual.
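A minimal TensorFlow sketch of that virtualized interface: the model below is described entirely at the framework level, and the same description can be compiled to whatever CPU, GPU, or accelerator backend happens to be available. The layer sizes are arbitrary.

```python
import tensorflow as tf

# The model is written against the framework, not against any ISA.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Whatever hardware is present, the model definition above does not change.
print(tf.config.list_physical_devices())
```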
Letcher: It started out with things like CUDA and OpenCL—lower-level standards, but those have been largely replaced by TensorFlow or an equivalent higher-level interface.
Pandey: The language of today—tensors—has risen quickly. Some talk about a different language being required for AI, but the fundamental units that we deal with are these tensors: multi-dimensional matrices of numbers. We had the question of software being king, and now we are in an unstable situation that is driven by economics. Ten or fifteen years ago, you had processor performance increasing from technology improvements and micro-architectural advances. When you can jump 50% a year without needing to do anything in the software, then conventional applications will not change—that is just a matter of economics. But the situation today is different. First, core performance is not increasing fast enough for machine-learning applications. You have to think about new micro-architectural techniques. Nvidia’s current processors do whole-matrix multiplication. These tensor units in the GPUs—and everyone is coming up with different types of neural-network processors—are all about doing large tensor operations efficiently. We are at a point where, when you think about applications, you have to think about what the hardware architecture is going to be. I suspect that, with so many startups working in this field, there is no place in the market over the long run for 30 or 40 companies. The numbers will whittle down. Probably, as with markets in general, we will settle on standard forms, languages, and micro-architectures. But we are a long way from that today. That is what makes the area exciting right now.
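A small illustration of that fundamental unit, with arbitrary sizes: one layer of a network is just a large tensor contraction, and a tensor unit or neural-network processor exists to make exactly this kind of operation fast.

```python
import numpy as np

rng = np.random.default_rng(6)
activations = rng.standard_normal((32, 128))   # a batch of 32 feature vectors (sizes are arbitrary)
weights = rng.standard_normal((128, 64))       # one layer's weight matrix

# The whole layer is a single tensor contraction; convolutions reduce to the same pattern.
out = np.einsum("bi,ij->bj", activations, weights)
print(out.shape)                               # (32, 64)
```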
Foster: I suspect it will grow before it shrinks. At DAC this year there have been 92 papers submitted on the subject of machine-learning architectures. This is up 64% from last year. A tremendous amount of research is going on in this area, and we will see a lot more startups until it all collapses.
Hansson: AI is extremely driven by software—even culturally. That means we have to improve the hardware, but the drivers are on the software side. If you look at AI/ML, everything going on is open source. It is downloadable, and you can play around with it. There are no expensive EDA companies involved in AI algorithms. It is very much an open environment. The hardware world tries to provide hardware that will fit into that, but the software is where the drivers are.
Melling: We are trying to provide fast-enough, low-power-enough deployment platforms. We also have to distinguish the two sides of it—training and inferencing. Training is very hefty and compute-intensive, while inference and deployment are much more about low power. It still has to have the performance to achieve the necessary response times. Those two types of systems are going to be different, and the tradeoffs are different. People are looking at new memory architectures and at how to implement algorithms with single-bit precision. These are all things being looked at so that algorithms can be deployed in a low-power and cost-effective manner.
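A rough sketch of the low-power end of that tradeoff, using NumPy and hypothetical weights: post-training quantization of a trained floating-point layer to int8, plus the extreme single-bit (binarized) case, trading accuracy for memory and compute.

```python
import numpy as np

rng = np.random.default_rng(5)
w_fp32 = rng.normal(0.0, 0.1, size=(256, 256)).astype(np.float32)   # "trained" weights (hypothetical)

# Post-training quantization to int8: one scale factor per tensor, then round and clip.
scale = np.abs(w_fp32).max() / 127.0
w_int8 = np.clip(np.round(w_fp32 / scale), -127, 127).astype(np.int8)

# Extreme case: single-bit weights, kept only as a sign matrix plus one scale factor.
w_bin = np.sign(w_fp32).astype(np.int8)
alpha = np.abs(w_fp32).mean()

x = rng.normal(0.0, 1.0, size=256).astype(np.float32)
y_fp32 = w_fp32 @ x
y_int8 = (w_int8.astype(np.float32) @ x) * scale
y_bin = (w_bin.astype(np.float32) @ x) * alpha

print("fp32 weight bytes:", w_fp32.nbytes, " int8 weight bytes:", w_int8.nbytes)
print("int8 mean abs error:", np.abs(y_fp32 - y_int8).mean())
print("1-bit mean abs error:", np.abs(y_fp32 - y_bin).mean())
```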
Pandey: Standards are slowly emerging. You have the problem of creating a model in one place and then using it somewhere else, and formats such as ONNX address that. (ONNX is an open format to represent deep-learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them.) Not everyone supports it yet, but such standards for model interchange are emerging. It is not clear what the final standard will be, but if we talk about data models being the currency of machine learning, then this is an area that needs to be standardized. It is very much a software-driven culture, but there are two aspects to it: the development of the model, and its deployment for inferencing, which gets a lot closer to the hardware because there is almost an equivalent of microcode.
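A minimal sketch of that interchange step, using PyTorch's built-in ONNX exporter on a throwaway model; the model, file name, and tensor names here are placeholders, and the point is only that the trained graph leaves the training framework in a vendor-neutral format another runtime can consume.

```python
import torch
import torch.nn as nn

# A placeholder model standing in for whatever was actually trained.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
model.eval()

# Export to ONNX so a different runtime or a hardware vendor's toolchain can consume it.
dummy_input = torch.randn(1, 784)
torch.onnx.export(model, dummy_input, "classifier.onnx",
                  input_names=["input"], output_names=["logits"])
print("wrote classifier.onnx")
```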
Hansson: And maybe that will get standardized in the future. There isn’t room for 40 different chips.
Pandey: The standard for the models will help with this. It shows how to represent your values and the data. From my observations, the frameworks are open source. What is being closely guarded is the dataset. That is a very important component, and when you think about a verification framework, it is the owner of the dataset who has the power. So even when we know the API, we do not know the dataset or potentially the application.
Melling: That is where EDA comes in—the automation, expansion, and improvement of the datasets. Applying some of the lessons learned from verification to that problem is going to be necessary to accelerate improvements in the quality of the algorithms.
Brinkmann: But the question remains—who is going to set the standards or expectations on this? So far, people in regulatory committees are still far away from perceiving it as a problem for verification. The dataset—that is where verification needs to go and that is where the real problems are.
Hansson: In academia, in some areas, they upload datasets to GitHub so they can do research on them and measure their algorithms, but that is not used for verification.
Foster: There is a lot of concern about bias in the datasets, and there is also aging of the dataset where we constantly have to drop data because the world evolves. This also has to be factored into the verification of the dataset.