The Role Of EDA In AI

Experts at the Table, part 3: Which aspects of AI implementation should EDA create tools for?


Semiconductor Engineering sat down to discuss the role that EDA has in automating artificial intelligence and machine learning with Doug Letcher, president and CEO of Metrics; Daniel Hansson, CEO of Verifyter; Harry Foster, chief scientist verification for Mentor, a Siemens Business; Larry Melling, product management director for Cadence; Manish Pandey, Synopsys fellow; and Raik Brinkmann, CEO of OneSpin Solutions. What follows are excerpts of that conversation. Part one can be found here. Part two is here.

SE: What is the role of EDA in the economics of AI?

Melling: It is around the dataset. That is where value will be developed. We believe in data-driven processes. We have to get to data-driven approaches in verification first and expand our way of thinking to go beyond the coverage details and start to bring in dispersed datasets and be able to utilize those to accomplish the end product goal. Verification algorithms will change because we will have more data that we can bring to bear on it, and that data will give us new approaches.

Foster: We are not throwing out what we have because we still have hardware. I agree that we need a lot more insight into what is going on in the training process. We don’t have solutions in that space, but there are opportunities.

Brinkmann: The question is, ‘Is it a general opportunity?’

Melling: It is an opportunity space.

Foster: It is an automation opportunity. I agree with that. But EDA is being redefined. If you see what is happening in EDA, it is no longer just transistors, like it was when I started. It is moving to systems, and software is becoming a big part of that. We are redefining what software automation means.

Brinkmann: One key piece will be to consider all aspects of how an algorithm goes into an application. That means making sure the platform is trustworthy in multiple ways. You have to trust that the function can be mapped from the software stack, through the various abstraction levels down to the hardware, and that the platform is working as you want it. This is a huge verification problem. Also, is it secure? Can someone tamper with the data on the way there? Can anyone insert malicious code into the platform or into the software?

Foster: Or in the dataset.

Brinkmann: It is part of creating trust in the solution. EDA has to tackle the areas of uncertainty so that you can focus on the dataset. You have to be able to trust the process and the platform. There are many problems to solve here. Previously, we had equivalency checking. It is the same kind of thing—RTL to gates, then gates to place-and-route. Now we have TensorFlow to whatever abstraction, to an implementation—multiple layers that have to be considered so that you know that what you put in is what you get out.
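As an illustration of the layered check Brinkmann describes, the following is a minimal sketch that compares a high-level floating-point layer with a lowered int8 version of the same layer. Every name here (`reference_layer`, `lowered_layer`, the scales, the tolerance) is hypothetical and not taken from TensorFlow or any EDA tool. It also uses random sampling, which is much weaker than formal equivalence checking; it only shows the shape of the question "is what you put in what you get out?"

```python
import random

def reference_layer(weights, x):
    """High-level 'golden' model: a dense layer in floating point."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def quantize(value, scale):
    """Round a float to a signed 8-bit integer (hypothetical scheme)."""
    q = round(value / scale)
    return max(-128, min(127, q))

def lowered_layer(weights, x, w_scale, x_scale):
    """Stand-in for a hardware mapping: int8 operands, integer accumulate."""
    qw = [[quantize(w, w_scale) for w in row] for row in weights]
    qx = [quantize(v, x_scale) for v in x]
    acc = [sum(a * b for a, b in zip(row, qx)) for row in qw]
    # Rescale the integer accumulator back to the real-valued domain.
    return [a * w_scale * x_scale for a in acc]

def equivalent(weights, w_scale, x_scale, tol, trials=200, seed=0):
    """Sampled check: do both levels agree within tol on random inputs?"""
    rng = random.Random(seed)
    n = len(weights[0])
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        ref = reference_layer(weights, x)
        low = lowered_layer(weights, x, w_scale, x_scale)
        if any(abs(r - l) > tol for r, l in zip(ref, low)):
            return False
    return True

WEIGHTS = [[0.3, -0.7, 0.25, 0.9], [-0.45, 0.6, 0.1, -0.8]]
SCALE = 1.0 / 127.0
```

Calling `equivalent(WEIGHTS, SCALE, SCALE, tol=0.05)` asks whether the lowered layer stays within an assumed error budget of the reference. Unlike RTL-to-gates equivalence, the two levels here are not bit-identical, so the tolerance itself becomes part of the specification.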

Foster: It is a huge verification problem, but the point is that we need to lose the E in EDA.

Pandey: It is more about systems.

Letcher: We have two discussions here. One is about applying machine learning techniques to EDA, even if the final device was not a machine-learning chip. That is an interesting part of the discussion. That is how we collect relevant verification data, how we make better use of that data, and software has this trend of becoming more machine learning-oriented in the future. That will apply to EDA software, as well. So the question still remains as to what the killer applications for it in the EDA space will be, but I am sure we are all exploring some.

Foster: A lot of people tend to use the ‘in and around.’ It is fairly fundamental that a lot of people think of the killer application at the end of machine learning. But equally important is the ‘around,’ where I have multiple applications, and mining that, making sense and optimizing the overall flow, that is important, too.

Hansson: That is a huge space of improving the EDA tools themselves using ML. Coming back to whether the final goal for the customer at the top level is EDA, I don’t think so. There is a lot that DA can do, but the very top is running too fast and there is plenty to do before that. We will not be unemployed.

Brinkmann: I like the idea of raising the notion of DA in some ways. It could mean a lot of things, but it could mean providing a trustworthy system that can be used to implement algorithms.

Pandey: Even if you look at traditional SoC verification, it is no longer just about the chip verification. You have to get the RTL into a reasonable shape and then start booting an operating system. If you want to build an AI system for a self-driving car, you must be verifying the whole system. The question for DA is, ‘What kinds of platforms and tools can we provide that can help to automate or speed up the development?’ We may not know why an algorithm works, but we need to provide the right support and the right datasets integrated into it. You may then do some tuning or find a way to do it with less power.

Hansson: One aspect is that it is all about data when we talk about machine learning. At the top level, it will be driven by domain experts. The different domains will be more different than they used to be. That filters down – the higher up, the less general purpose.

Brinkmann: I am not sure. The data does not change the algorithms do. The question is what will people work with. Will TensorFlow change because of the application?

Hansson: Not at that level, but if you work on predicting where bugs will be found, or driving a car, those domains are very different. When it comes to verifying those systems, it is so different that they have very little in common even though they both use TensorFlow.

Brinkmann: What happens when you come to a hardware platform?

Hansson: Then it looks the same.

Brinkmann: And you have the same concepts.

Hansson: The closer you get to hardware, the more deterministic it becomes. Hardware has an interface that should work. You suffer more problems the higher up you go. Verification-wise, they look very different.

Letcher: That is no different than any traditional application. If you look at image processing, the algorithm is a given. There may be artifacts in the output images, and some DSP expert has decided what looks good and what doesn’t. This is never considered at the RTL verification layer. Questions about the algorithm are at the top level, and EDA would never work at that level. There are just too many levels of abstraction in between. We could move up a couple as suggested, but not all the way up.

Melling: It was mentioned that people want to get their operating system booted as soon as possible, and this is all part of Shift Left. That is true, but it also creates problems. When you are running at the operating system or application level and debugging a problem, if it is truly a hardware problem down in the interconnect or coherency, it becomes a nightmare because you cannot trace through all of the levels of the complex software. That is one of the reasons why Portable Stimulus was put together. We have to have ways to describe complex use cases but not bury them inside a complete operating system and application. You need to be able to say, ‘This is what I expect the hardware to do,’ and to create use cases and very directed tests so that when something fails you can figure out what is going on.

Pandey: You can run all kinds of complex applications, but without engineering the right hooks and mechanisms to trace back – then hopefully they have less overhead so that you can trace it. But there are huge productivity gains by saying, ‘I have this possibility.’

Melling: Hit it with a hammer – absolutely. You will still do that.

Pandey: It is a question of how we learn from use cases and follow good engineering practices. Today, if you want to verify a complex piece of software, it doesn’t make sense to take an algorithm and map it down into hardware to make the problem manageable. Instead, we create an abstraction level. We should not try and take every AI problem or every system-level problem and see how to verify it at the hardware level. Some things will make sense, others don’t, and the boundaries may shift over time. But we have to do things like integrated hardware security. It should be guaranteed. That we do the operations right should be guaranteed. Algorithm development will proceed on a different track. We will figure out the right levels of abstraction and not try and hit everything with that one hammer.

SE: It would appear that EDA is trying to move up in abstraction. We talked about the hardware and the mapping from the tensors to the hardware. Are we saying that is in the realm of EDA and we need to be managing and potentially proving that what came out of TensorFlow is actually what is implemented in hardware?

Letcher: It is a little different. The output of the TensorFlow compiler is what the hardware has to deal with.

SE: There is a mapping from that to the hardware architecture – just like with an FPGA.

Pandey: If you have TensorFlow going to a GPU, it is within TensorFlow to create the CUDA instruction sequences that represent the graph in TensorFlow. Verifying that is more of a compiler correctness problem, but that is not to say that you could have a tool that enables you to synthesize that into a new piece of hardware.

Brinkmann: Why does it make a difference if you map that to an FPGA or some other programmable logic or TPU?

Foster: Or even a standard CPU, even though it may be inefficient.

Pandey: There is a very big difference. If you have a GPU in every machine, you don’t map the TensorFlow onto the same thing when you have an FPGA system. You will have one set of procedure calls for CUDA versus whatever your library is for FPGA.

Brinkmann: As a user, I don’t care. I have a software view and I want to map it to some architecture.

Pandey: Maybe as a user, but from an EDA perspective, we are engineers and we want to build something.

Brinkmann: If you look at the space of the platforms that exist, FPGA companies are not considered EDA companies, but they are providing those synthesis tools and compilers the same way that Nvidia provides their CUDA compiler. These are very similar things. In one case we can provide some solutions to them, so why could we not do the same thing for CUDA code generation? There may be a boundary where things are possible and some that we cannot do, but I don’t see why they are that different. If you synthesize to a specific architecture, it should not make a huge difference.

Pandey: Maybe as an end-user application developer you don’t care, but it vitally matters if we are in the system design automation industry. You have to look under the hood and you have this application that gets mapped onto something that is hardware. I can’t conceive of a reasonable verification system where the awareness of that separation does not exist. You have to know. If you are mapping TensorFlow onto an FPGA, what are the primitives that it can run on? What are you passing to the FPGA? Even if you implement the system on an FPGA you will have the notion of instructions, such as passing matrices, what the layout is, what the mapping is, and so on. When you map to CUDA you have to see what is inside and map it to the CUDA calls. For the CPU you have to know what it gets mapped onto.

Letcher: To use an analogy, you can take C++ tasks and use them to verify a traditional processor, but you still have to know the architecture of the processor. You may not have to write your instructions at an x86 level if you are verifying an x86 processor, but you have to know it is an x86 processor when you are writing a C++ testcase. The equivalent question for machine learning is, ‘Do you write the testcase at a TensorFlow level or do you write it at CUDA or something else level?’ You have to be aware of the ultimate target hardware architecture, but you still probably want to write the testcase at an abstract level.
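To make the point about abstract testcases and target awareness concrete, here is a small sketch in which one testcase is written once at the graph level and then run against two targets that differ only in how they lay out matrices in memory. The target classes and function names are hypothetical stand-ins for illustration, not real CUDA, FPGA, or TPU interfaces.

```python
def matmul_relu_reference(a, b):
    """Abstract-level expectation: relu(a @ b) on nested lists."""
    cols = list(zip(*b))
    return [[max(0, sum(x * y for x, y in zip(row, col))) for col in cols]
            for row in a]

class RowMajorTarget:
    """Hypothetical target that stores matrices flat, row by row."""
    name = "row_major"

    def load(self, m):
        return [v for row in m for v in row], len(m), len(m[0])

    def run(self, a, b):
        fa, n, k = self.load(a)
        fb, _, p = self.load(b)
        return [[max(0, sum(fa[i * k + t] * fb[t * p + j] for t in range(k)))
                 for j in range(p)] for i in range(n)]

class ColMajorTarget:
    """Hypothetical target that stores matrices flat, column by column."""
    name = "col_major"

    def load(self, m):
        rows, cols = len(m), len(m[0])
        return [m[r][c] for c in range(cols) for r in range(rows)], rows, cols

    def run(self, a, b):
        fa, n, k = self.load(a)
        fb, _, p = self.load(b)
        return [[max(0, sum(fa[t * n + i] * fb[j * k + t] for t in range(k)))
                 for j in range(p)] for i in range(n)]

def run_abstract_testcase(targets, a, b):
    """One abstract testcase, checked against every target's own mapping."""
    expected = matmul_relu_reference(a, b)
    return {t.name: t.run(a, b) == expected for t in targets}

A = [[1, -2, 3], [4, 0, -1]]
B = [[2, 1], [0, -3], [-1, 5]]
RESULTS = run_abstract_testcase([RowMajorTarget(), ColMajorTarget()], A, B)
```

The testcase itself never mentions layout, but each target's `load` and `run` must. That mirrors the split in the discussion: the test is authored at the abstract level, while the unit-level details of the mapping differ per target and need their own checks.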

Pandey: High-level testcases would be there, but you would also have unit tests that run on that level. If you have TPUs running for machine learning, the programming interface which enables you to transfer data from the TPU and get it back, those will certainly be different.

Foster: I don’t see it as being any different from the way we verify systems today. We start at a unit level, and it is totally different from what we do at an application level. We still have to do both.

