Experts at the Table, part 1: Will machine learning and AI improve chip manufacturing?
Semiconductor Engineering sat down to discuss artificial intelligence (AI), machine learning, and chip and photomask manufacturing technologies with Aki Fujimura, chief executive of D2S; Jerry Chen, business and ecosystem development manager at Nvidia; Noriaki Nakayamada, senior technologist at NuFlare; and Mikael Wahlsten, director and product area manager at Mycronic. What follows are excerpts of that conversation.
(L-R) Noriaki Nakayamada, senior technologist at NuFlare; Aki Fujimura, chief executive of D2S; Mikael Wahlsten, director and product area manager at Mycronic; and Jerry Chen, business and ecosystem development manager at Nvidia.
SE: What are the general challenges in computing from your perspective and why all the interest in machine and deep learning?
Fujimura: The tradeoff between accuracy and speed of execution for computing tasks has always been a challenge in Moore’s Law, and it will be for as long as Moore’s Law lasts. It’s always better, of course, to be faster, and always better to be more accurate, but as it turns out you can’t have both. You have to make tradeoffs. Deep learning is great because it gives you an opportunity to do things that are both more accurate and faster. So it’s one of those breakthrough opportunities for people who are trying to supply software for the industry.
Chen: Traditionally, a lot of the decision making for any type of operation has been based on some sort of physics-based understanding. You have a physics model, and you use it to tell you what you should do, to make some kind of prediction about what’s going to happen, or to optimize your process. That’s still a very powerful thing, and there’s a lot of compute capability and a lot of HPC (high-performance computing) that’s been thrown at that problem. The big sea change we’ve seen more recently is that you can augment the physics-based models that have always worked with machine learning, or in particular deep learning. Basically, it’s a data-driven model that complements the physics-based models. The evidence across multiple use cases in many industries shows that this complementary integration of a physics-based model and a data-driven model can achieve amazing results.
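As a rough illustration of the hybrid approach Chen describes, the sketch below pairs a simple physics model with a data-driven correction learned from the residuals. The data, the physics term and every name in it are hypothetical, chosen only to show the shape of the technique rather than any specific fab process.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical process: the "true" response is a known physics term plus
# an unmodeled second-order effect plus measurement noise.
x = rng.uniform(0.0, 1.0, size=(500, 1))
y_meas = 2.0 * x[:, 0] + 0.3 * np.sin(6.0 * x[:, 0]) + rng.normal(0, 0.02, 500)

def physics_model(x):
    # The part of the behavior we understand analytically.
    return 2.0 * x[:, 0]

# Data-driven model: learn only the residual the physics misses.
residual = y_meas - physics_model(x)
correction = GradientBoostingRegressor().fit(x, residual)

# Hybrid prediction = physics baseline + learned correction.
x_new = np.linspace(0, 1, 11).reshape(-1, 1)
y_hybrid = physics_model(x_new) + correction.predict(x_new)
print(y_hybrid)
```

In this arrangement the physics model supplies the baseline and the learned model supplies only the part the physics misses, which keeps the data-driven piece small and easier to validate.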
SE: Why can’t we just continue with the traditional architectures or the HPC path?
Chen: My own background is in structural mechanics, so that’s HPC for physics-based structural simulation. And if you look at Nvidia’s history in computing, and put the graphics portion aside, we started in the HPC space for supercomputing, essentially for physics-based simulation. There is a lot of value you get from physics-based simulation. But the timing is such that, because breakthroughs have happened in several different areas, data-driven models like deep learning have now become so powerful, effective and useful that they actually improve on the results. And sometimes you have problems with the physics. Your understanding of the physics starts to break down. Or you understand it, but the computation required to run the simulation is so expensive. You can shortcut some of that by using these data-driven models to supplement it. Deep learning is this hugely useful tool, a big hammer that people are successfully using.
SE: Machine learning has been around for years. From what I understand, machine learning makes use of a neural network in a system. In a neural network, the system crunches data, identifies patterns, matches certain patterns and learns which attributes are important. Is that correct? And what’s different about machine learning today compared with the past?
Chen: It’s actually not magical. People tend to throw out a lot of terms. Machine learning continues to be very useful for lots of things. Many of what I call the traditional machine learning techniques typically require some PhD thesis to design or engineer a bunch of features that you pull out of the data. Based on those features, you have some kind of representation of the situation, of the physics, and once you have that representation with these engineered features, you can make some kind of prediction or some sort of decision. Today, in my way of thinking, the reason deep learning has proven to be so powerful is that it’s using a neural network, a type of deep neural network that automatically learns the features. And it doesn’t just learn any features. It learns features that are more effective, and more of them, than a thousand PhDs could ever discover in their academic research lifetimes. It’s learning the representations, as opposed to engineering the representations.
Fig. 1: A simple three-layer neural network, comprising an input layer, a hidden layer and an output layer. Source: Aberystwyth University, Lancaster University
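To connect Fig. 1 and Chen’s point about learned representations to something concrete, here is a minimal NumPy sketch of a three-layer network: an input layer, one hidden layer whose weights are learned from the data rather than hand-engineered, and an output layer. The toy XOR data and all names are illustrative assumptions, not anything from the discussion.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: XOR, a pattern no single hand-engineered linear feature separates.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Three layers, as in Fig. 1: input (2 units) -> hidden (8 units) -> output (1 unit).
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)
lr = 0.5

for _ in range(10_000):
    # Forward pass: the hidden layer learns its own features from the data.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: plain gradient descent on a squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```

Nothing in the loop tells the hidden layer what its features should be; they emerge from the training data, which is the "learning the representations" idea in miniature.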
Wahlsten: That’s part of why we are interested in it. We think this could be a technology to enhance our understanding of our tools. For example, in our pattern generation systems, classical physics gives you the baseline. In the tool, you can understand most things by applying pure physics. But there are some small things going on that you don’t understand, and we think we can use this kind of technology to figure out how those things are connected. Another thing we see is that you can feed in a big set of data and see what comes out. Compared to a human, the machine could be more creative in this case. Typically, if you try something on your own, you are more biased, you need prior knowledge, or you are trying to find something that supports what you already know. Here, you can just throw in a lot of big data sets and try to figure out how things are really connected. We see big potential both to make our tools better and to catch early signals in the results that something is going wrong. Also, we are developing systems in very different environments. This could be a way to create more dynamic software, compared to what we have today, which is very static. A tool could be tuned to be more dynamic for customers without putting all of that complexity into the code.
SE: In semiconductor manufacturing, leading-edge chipmakers are moving to the next nodes, which are more complex and expensive. With that in mind, photomask manufacturing has become more complex and challenging as well, right?
Fujimura: The recent eBeam Initiative survey showed once again that mask turnaround times are only getting longer. It makes sense. The job of the mask is to make wafer lithography, or for that matter flat-panel display lithography, work well, so the mask must contain features that make that lithography happen. But as you go down to finer and finer geometries, you need more tricks, and many of those tricks are about making the mask shapes more complex. That makes the mask-making task more and more complex.
SE: What are the implications for machine learning or AI in photomask manufacturing? Will AI and/or machine learning enable an e-beam mask writer to pattern the photomask by itself some day?
Nakayamada: That’s a future goal. But there are many, many things that need to be done to reach that. In my mind, there are applications in the short term. We see them in certain areas of the whole mask writing process, in the optimization of process control.
Wahlsten: Primarily, this technology will improve quality. We should be able to make better mask writers, serve our customers with better control, and help them understand what’s going on in the tool. Then you have preventative maintenance. The tool by itself can’t tell when something is heading in the wrong direction, but with this technology you can be there before something breaks down. So that’s also valuable for the customer.
Fujimura: That’s true in all industries. When service is called out to a site, they can’t carry every part. But if there is a deep-learning-enabled mechanism, the probability of effective preventative maintenance improves.
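As a hedged sketch of the preventative-maintenance idea raised here, the example below trains an anomaly detector on telemetry logged while a hypothetical tool is known to be healthy, then flags new readings that drift away from that behavior so service can be planned before a breakdown. The sensor names, values and thresholds are invented for illustration and are not tied to any particular writer.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Hypothetical tool telemetry, logged while the machine is known to be healthy:
# a stage temperature and a beam current.
healthy = np.column_stack([
    rng.normal(23.0, 0.1, 2000),   # temperature (deg C)
    rng.normal(50.0, 0.5, 2000),   # beam current (arbitrary units)
])

# Fit an anomaly detector on healthy operation only.
detector = IsolationForest(contamination=0.01, random_state=0).fit(healthy)

# New readings: the later rows show a slow temperature drift.
new = np.array([[23.0, 50.1], [23.1, 49.8], [23.6, 50.2], [24.2, 50.0]])
flags = detector.predict(new)    # +1 = looks normal, -1 = anomalous
print(flags)                     # drifting rows get flagged before a failure
```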
SE: Where is machine and deep learning heading? What are the trends here?
Fujimura: Deep learning scales. That’s the secret. It’s not magic. Its only limitation for scaling is how many GPUs you have and how much computing power you have. Deep learning has a computing model that’s inherently scalable. Once you teach a deep learning engine something, then as the hardware scales up, the capability is going to expand, maybe exponentially. This is why we are seeing so much progress, so quickly, in applications like autonomous driving. GPU acceleration is the key enabler of deep learning. Of course, the idea of deep learning has been around for 40 years. The mechanism was already there, but the computing capability wasn’t. Then GPUs came along, and they are inherently scalable. This is why GPU computing power continues to expand: it’s about bit width, about how many bits you compute at the same time, not about how many gigahertz. In general, machine learning involves iterative optimization, which is a class of computing. Whenever somebody in computer science says they have an iterative optimization problem, that problem can be sped up with deep learning. It doesn’t matter what it is; I don’t even need to know what the application is.
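A small sketch may make the iterative-optimization point concrete: the loop below runs the same gradient-descent step regardless of problem size, and each step is a few large matrix products, exactly the kind of data-parallel work that benefits from wider, more parallel hardware rather than a faster clock. The problem and all names are illustrative assumptions, not any specific mask or wafer computation.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative iterative optimization: a least-squares fit by gradient descent.
n_samples, n_features = 20_000, 64
A = rng.normal(size=(n_samples, n_features))
x_true = rng.normal(size=n_features)
b = A @ x_true + rng.normal(scale=0.01, size=n_samples)

x = np.zeros(n_features)
step = 1.0 / np.linalg.norm(A, ord=2) ** 2    # safe step size: 1 / Lipschitz constant

for _ in range(200):
    grad = A.T @ (A @ x - b)                  # one big, data-parallel operation
    x -= step * grad

print(np.linalg.norm(x - x_true))             # error shrinks toward zero
```

On a GPU the same loop runs essentially unchanged with the arrays resident on the device; the speedup comes from doing the large products in parallel, not from changing the algorithm.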