Fabs Meet Machine Learning

D2S’ CEO sounds off on the impact of deep learning, EUV and other manufacturing advancements.


Aki Fujimura, chief executive of D2S, sat down with Semiconductor Engineering to discuss Moore’s Law and photomask technology. Fujimura also explained how artificial intelligence and machine learning are impacting the IC industry. What follows are excerpts of that conversation.

SE: For some time, you’ve said we need more compute power. So we need faster chips at advanced nodes, but cost and complexity are skyrocketing. What’s the state of Moore’s Law?

Fujimura: Moore’s Law is definitely slowing down, but I’m confident there will be continued innovation everywhere to keep it going for a while. Every discipline in the ecosystem is working on incremental and breakthrough improvements. For example, I know DFM has been around forever, but I can see that there’s a lot more that can be done. Regardless, there’s no question there is a need for more compute power. Perhaps 10 years ago, there was some talk that we could no longer use more compute power. There’s no more talk about that now.

SE: Clearly, the industry is changing, right?

Fujimura: Five or ten years ago, the industry bifurcated into IoT or consumer devices versus high-performance computing. Low power and similar concerns dominated one side of the equation. Continuing to scale computing power became the other side, for high-performance computing. So there is a clear trend that GPUs and these massively parallel architectures are the way high-performance computing is going. It’s not a sudden shift, of course. These things shift over time, but the trend is undeniable.

SE: What about escalating IC design costs at each node?

Fujimura: Design cost is, of course, an important part of the equation. The design community figured out many years ago how to use derivative designs. You have one big design project, and then you leverage the design that comes out of that in many derivative designs. So you can have multiple tape-outs of the same design. That’s the basic underlying design cost. That will continue to happen. It’s definitely a much more limited market than it used to be. You don’t need to have an IoT device that is run in a leading-edge fab. But the trend of using more leading-edge nodes for high-performance computing will continue.

SE: Artificial intelligence, machine learning and deep learning are the latest crazes in the industry. How much of this is hype?

Fujimura: It’s not hype. It’s not like the Lisp machine craze in the 1980s and early 1990s. (Lisp machines are computers that run Lisp, a computer programming language.) I used the Lisp machine in school and thought it was the greatest thing in the world. But I never thought it would be for the general programming population. What’s happening now is not hype at all. It’s driven by deep learning. Deep learning is a subset of machine learning. And machine learning is a subset of AI. This is real. Deep learning is a major discontinuity, a disruptive technology and a major opportunity.

SE: How is this different?

Fujimura: Deep learning flips programming. Instead of conventional programming, where the programmer writes code to transform a set of inputs to a set of outputs, deep learning takes a bunch of example input/output pairs and learns to pattern-match. The output of deep learning is, in essence, a program that transforms like inputs to like outputs, mimicking the training data set. Unlike even machine learning before it, deep learning does things no software engineer could figure out how to program before. Deep learning enables software applications that were not possible before. With deep learning, they are possible, even trivial now.
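The contrast between writing a rule and learning it from example pairs can be sketched in a few lines. This is only a toy illustration: it fits a straight line by least squares rather than training a deep network, and the function names and the temperature-conversion example are assumptions chosen for illustration, not anything from the interview.

```python
def handwritten(celsius):
    """Conventional programming: a person writes the rule explicitly."""
    return celsius * 9 / 5 + 32


def fit_line(pairs):
    """Learn a slope and intercept from (input, output) examples by least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def learned(x):
        # The "program" produced by training: it maps like inputs to like outputs.
        return slope * x + intercept

    return learned


# Training data: example input/output pairs, with no rule supplied.
examples = [(0.0, 32.0), (10.0, 50.0), (100.0, 212.0)]
learned = fit_line(examples)

# Both functions agree on an input that was not in the training set.
print(handwritten(37.0), learned(37.0))
```

A deep network replaces the straight line with a far more flexible function, which is what lets it handle cases where nobody knows how to write the rule by hand.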

Fig. 1: Convolutional Neural Networks Used For Machine Learning Source: Stanford

SE: Is deep learning being used in semiconductor manufacturing?

Fujimura: It’s happening fast, but it’s in the beginning stages. I don’t think there is any question that deep learning is already impacting the semiconductor manufacturing sector.

SE: At one event ASML’s Brion unit discussed how they could apply machine learning in photomask applications. They explained how it could be used in optical proximity correction (OPC) and inverse lithography technology (ILT). Is that correct?

Fujimura: The OPC example that Brion made famous is talking about using deep learning to accelerate the initial embedding of OPC or ILT. They are reporting that maybe the runtime can be reduced by half. Runtime is one of the most important issues in OPC, so this is significant.

SE: How is it being used?

Fujimura: The basic idea here is to use deep learning’s pattern matching ability to create an initial embedding that is much better than the alternatives available today, so that the number of optimization iterations needed to finish the mask design is substantially reduced. That results in a significant reduction of overall runtime. The Brion paper describes running their OPC/ILT code to take a bunch of input patterns (desired wafer shapes) and produce a bunch of output patterns (mask shapes needed to produce those wafer shapes). Now you give those input-output pairs to a deep learning setup, and it produces a program that transforms like-inputs (other, but similar desired wafer shapes) into like-outputs (mask shapes).
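Why a better starting point shortens an iterative optimization can be shown with a minimal sketch. The one-dimensional gradient descent below is a stand-in for an OPC/ILT optimizer, not the Brion method, and the target value, step size, tolerance and starting guesses are all hypothetical.

```python
def optimize(start, target=100.0, step=0.1, tol=1e-3, max_iters=10_000):
    """Gradient descent on (x - target)**2.

    Returns (solution, iterations_used). Toy stand-in for an iterative
    mask optimizer; every parameter value here is an assumption.
    """
    x = start
    for iteration in range(max_iters):
        error = x - target
        if abs(error) < tol:
            return x, iteration
        x -= step * 2 * error  # gradient of (x - target)**2 is 2 * error
    return x, max_iters


# Cold start: begin far from the answer.
_, cold_iters = optimize(start=0.0)

# Warm start: begin from a guess that is already close, playing the role
# of the initial embedding produced by a trained model.
_, warm_iters = optimize(start=99.0)

print("cold start iterations:", cold_iters)
print("warm start iterations:", warm_iters)
```

The optimizer and its convergence test are unchanged in both runs. Only the starting point differs, so the final accuracy is still set by the conventional program.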

SE: What are the outcomes?

Fujimura: Deep learning is a statistical method. So taking examples from the ImageNet competition and the like, you might get 95% accuracy in the result where the output mask shapes produce the desired wafer shapes, resilient to manufacturing variation. In semiconductor manufacturing, of course, 95% isn’t anywhere near good enough. We need at least 7-sigma accuracy. So that’s the wisdom of the Brion paper, where they’re using it to accelerate computing. After the deep learning inference engine produces the output mask shapes, those mask shapes are used as the initial embedding into the conventional OPC/ILT program. The conventional program runs a lot faster than starting with nothing, or starting with the wafer shapes (multiplied by the 4X magnification factor), or even with some SRAF generation.
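To put the gap between 95% and 7-sigma in numbers, the sketch below converts a sigma level into a failure probability. It assumes a normal distribution and a two-sided tail, which is one common reading of "n-sigma"; the interview does not say which convention is meant.

```python
import math


def two_sided_tail(sigma):
    """Probability that a normally distributed value falls outside +/- sigma.

    Assumes a two-sided normal tail; this convention is an assumption.
    """
    return math.erfc(sigma / math.sqrt(2))


# 95% accuracy leaves a 5% failure rate, which is roughly a 1.96-sigma level.
print("failure rate at 1.96 sigma:", two_sided_tail(1.96))

# At 7 sigma the failure rate is on the order of a few parts per trillion.
print("failure rate at 7 sigma:", two_sided_tail(7))
```

Under that assumption, the two levels differ by about ten orders of magnitude.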

SE: Where is all this heading?

Fujimura: The design community that comes before manufacturing is also going through a deep learning transformation. There are many different disciplines reporting good results using this method of deploying deep learning for generating a high quality initial embedding for accelerating software products.

SE: Where else can you use deep learning?

Fujimura: Automatic defect classification, or ADC, is an important field in inspecting masks and wafers. There is also big data. Obviously, fabs have a lot of data. Finding correlations between events in massive amounts of data is something that machine learning is good at.

SE: What’s happening in the overall photomask industry?

Fujimura: The overall mask market is finally growing. CAGR has been at 4% in the last three years, and I project we’ll continue to grow for a while now. We’ve been kind of stuck at an inflation-adjusted $3 billion for more than a decade, so this is great news for the industry.

SE: Why did the mask industry stagnate?

Fujimura: In the past, there was a reason why it was not growing. A leading-edge mask-set has more masks per design, maybe even 100 masks per design. But there are fewer leading-edge masks. Because the leading edge is so expensive, only a few companies can afford it. So that’s why the overall number has kind of stayed the same. But the mask market is dominated by the non-leading-edge masks in both dollars and numbers. But the leading-edge becomes the high volume node eventually. So what’s happening at the leading-edge is a leading indicator of the future of the mask market.

SE: What’s changed?

Fujimura: What’s finally happening now, in some ways, is the pent-up effect of all of these things. The leading-edge masks are still expensive. But finally, we are at the point where there is a lot of IoT and sophisticated chips being used. That’s one side of it. The other side is the semiconductor market in general, probably buoyed a lot by deep learning and the computing for deep learning. There is a lot more activity in the semiconductor market in general. Specifically, there are more new design starts in the leading-edge space. Then, there is finally the effect of EUV. In the overall $3.7 billion mask revenue number, it’s very difficult to see a blip from EUV—yet. But certainly EUV masks are more expensive. As we increase the number of EUV masks, we can expect to see an increase in the total mask market.

SE: EUV lithography is inching closer to production, but there are still some challenges. For example, there is a lot of talk about problematic variations—also known as stochastic effects—in EUV. Any thoughts?

Fujimura: Finally, the whole discussion about shot noise has come to fruition. People have been talking about it for a while. But basically, people feel comfortable with the immediate-term deployment of EUV. Eventually, the high-volume use of EUV will be maybe 5nm or maybe 3nm where shot noise would be an issue.
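The reason shot noise grows as features shrink follows from counting statistics. Photon arrivals follow a Poisson distribution, so a feature that receives N photons on average sees a relative fluctuation of 1/sqrt(N). The photon counts below are hypothetical round numbers, not measured values for any node or dose.

```python
import math


def relative_shot_noise(mean_photons):
    """Relative standard deviation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(mean_photons)


# Smaller features collect fewer photons at a fixed dose, so the
# relative fluctuation rises. Counts here are illustrative assumptions.
for mean_photons in (10_000, 1_000, 100):
    noise = relative_shot_noise(mean_photons)
    print(f"{mean_photons:>6} photons -> {noise:.1%} relative noise")
```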

SE: The mask industry is getting ready for EUV. For example, mask makers use single-beam e-beam tools to pattern the features on a photomask. But the write times continue to increase for all masks. Now, the industry has developed multi-beam mask writers. Where does multi-beam fit into this equation?

Fujimura: Multi-beam machines open the possibility for any shape to be drawn on the mask. Practically speaking, in the past, we’ve had rectilinear shapes. With multi-beam, we no longer have to think of it that way. This is a breakthrough for OPC and ILT. One can now just output curvilinear shapes if needed. It’s more than that. Because of the nature of how it writes, it can also be helpful for very dense and small designs like EUV masks. So EUV masks, and for that matter, nano-imprint masters, require multi-beam technology.

Fig. 2: Multi-beam uses many beamlets in parallel. Source: IMS

SE: When EUV is inserted, how will this impact mask customers?

Fujimura: From the eBeam Initiative surveys, we can see that turnaround time is a huge issue for mask shops. It’s about to get worse with EUV. Maybe not initially, as single-exposure EUV for the 7+ node will likely have fewer SRAFs or maybe even no SRAFs. But certainly soon in the future.

SE: There is also a lot of talk about mask process correction (MPC) at various conferences. What is MPC and why is it important in mask making?

Fujimura: Mask process correction is the mask version of OPC or ILT. In order to print what you want the mask to look like, you need to manipulate the shape. Let’s say I ask for a 40nm wide and 200nm tall rectangle on the mask. You don’t get that using a reasonable resist that people would use for production masks. You might end up with 36nm wide. And instead of 200nm tall, the line might be 160nm tall. The result also depends on the context, so some 40nm lines are 38nm and others are 37nm. Amazingly, that 1nm difference is important in wafer processing, so therefore important on the mask.

SE: What is the goal here?

Fujimura: What would be ideal is if what you asked for is what you actually got on the mask every time. When you need to print features that are small, like you have to for SRAFs, what you ask for in a CAD drawing input isn’t what will end up on the actual mask unless you do MPC to correct the input.
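The idea of correcting the input so that the output matches the request can be sketched with a toy model. The `printed_width` function below is invented for illustration so that a 40nm request prints at 36nm, echoing the example above. It is not a real resist or mask process model, and the correction loop is a generic fixed-point iteration, not any vendor's MPC algorithm.

```python
def printed_width(drawn_nm):
    """Hypothetical mask process model: narrow lines lose more width.

    Invented for illustration; chosen so that 40nm drawn prints as 36nm.
    """
    return drawn_nm - 160.0 / drawn_nm


def correct_width(target_nm, tol_nm=1e-6, max_iters=100):
    """Adjust the drawn width until the modeled printed width hits the target."""
    drawn = target_nm
    for _ in range(max_iters):
        error = target_nm - printed_width(drawn)
        if abs(error) < tol_nm:
            break
        drawn += error  # bias the input by the remaining miss
    return drawn


target = 40.0
uncorrected = printed_width(target)
corrected_input = correct_width(target)

print("asked for 40nm, uncorrected result:", uncorrected)
print("corrected drawn width:", round(corrected_input, 3))
print("result after correction:", round(printed_width(corrected_input), 3))
```

In this sketch the model is a single formula. In practice the model would have to account for context, since neighboring shapes change how a given line prints.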

SE: So in simple terms, MPC is a software correction mechanism that enables the desired shape on the mask. Where is this done?

Fujimura: Usually, it’s done with off-line processing of software in the mask data preparation stage. Usually, this is performed between OPC and the mask machine. The data from OPC comes out. And there is a stage called mask data preparation. In mask data preparation, what used to happen is just pure fracturing. Now, it’s fracturing and MPC. It’s not just moving data around and dividing the shape into rectangles. You have to do some processing. Some of the processing can be rule-based. Some of the processing is simulation-based. Simulation-based processing is further classified into two types. One is simulation using empirical models. The other is simulation using physical models. D2S subscribes to the physical-model philosophy. We know that empirical models can over-fit to the test data. Physical models are much better about accuracy in the face of real designs. The other way to do MPC is with in-line correction inside the mask writer as is done in the NuFlare MBM-1000 using its PLDC (Pixel-Level Dose Correction) capability. With that, there is no turnaround time expended to perform MPC because the correction is done as the machine is writing the mask.

SE: There are several ways to perform this function, right?

Fujimura: The typical EDA model is that customers would have a farm of CPUs. Sometimes, it’s GPUs, but mostly CPUs. And then, the software is supplied from an independent vendor. We have a different model. We are committed to GPU acceleration. We supply our own computational design platform.

Fig. 3: eBeam Initiative Survey: MPC Becomes a Requirement Below 16nm

SE: Finally, the mask industry appears to be stepping up and meeting the challenges for the next nodes, right?

Fujimura: The mask industry is finally getting recognized for all of the technology advancements in keeping up with the ever-increasing demand in precision and accuracy with turnaround time requirements.



Chuck says:

“7-sigma is pessimistically possible, but not pragmatically probable” and we’ll use every bit of 5.

Shuhai Fan says:

In terms of OPC modeling (traditional or ML) and direct CAD2Mask (ML from CAD and wafer data), they need a wafer image (or CD) to close the loop. But SEM images suffer from e-beam damage to the resist (resist shrinkage, heating effects, etc.). Metrology error (including shrinkage and process variation) can be a term in OPC modeling and seems solvable, but I believe pre-shrinkage contour/CD plus CD normalization will benefit OPC model accuracy greatly.
If we get enough computing power, an ideal mask writing pattern could be achievable directly from a large data set of CAD and the corresponding post-full-process pattern (after all patterning steps: litho, etch, CMP, CVD/PVD…). But I think that is for the far future. After all, we still need real CD/contour to monitor the litho process.
