The convergence of AI/ML and GPU advancements is creating new opportunities for faster processing.
Experts at the Table: Semiconductor Engineering sat down to discuss the impact of GPU acceleration on mask design and production and other process technologies, with Aki Fujimura, CEO of D2S; Youping Zhang, head of ASML Brion; Yalin Xiong, senior vice president and general manager of the BBP and reticle products division at KLA; and Kostas Adam, vice president of engineering at Synopsys. What follows are excerpts of that conversation. To view part one of the discussion, click here.
L-R: D2S’ Fujimura; ASML’s Zhang; KLA’s Xiong; Synopsys’ Adam
SE: There seems to be a parallel growth between the adoption of machine learning and GPUs. Is the need for machine learning driving GPU adoption, or are GPUs creating the opportunity to embrace machine learning?
Adam: The AI/ML revolution is a significant driving force for the development of GPUs, and major advances in AI/ML are enabled by more powerful GPUs. This influence is now widely recognized, given the substantial momentum it has created. As a consequence, there’s been an uptick in activity within our industry, including the customization of hardware, an increase in the number of chips being taped out, and a rise in the complexity of the nodes. We are currently dealing with a situation where the complexity of processing a single mask has notably increased. We measure this computational complexity in CPU hours required to process a square millimeter of a mask. The greater the number of CPU hours needed, the higher the associated costs. As the computational challenges grow, this metric continues to rise. Moreover, these complex solutions are now required for multiple layers, not just a single mask, adding further to the intricacy of our tasks.
In response, many of us are leveraging machine learning to effectively manage and navigate this complexity. By incorporating ML, we can accelerate the physics simulations we rely on. Essentially, we use machine learning to learn from these physical simulations and then apply faster ML models that have been trained on these simulators. This allows us to expedite the entire simulation process and more quickly arrive at final solutions. We run intensive simulations on sampled results, learn from these through deep learning models, and subsequently apply this knowledge to expedite processing across multiple masks. Thus, while we are not the primary drivers of AI and ML growth, we are certainly benefitting from these technologies, smartly adopting and adapting them to meet our specific needs in the semiconductor industry.
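As a minimal sketch of the surrogate-modeling approach Adam describes, the snippet below runs an expensive "simulator" on a small sample of inputs, fits a fast model to the results, and then uses that model in place of the simulator. The simulator function, the polynomial fit, and all numbers are illustrative placeholders, not any vendor's tooling; in production the surrogate would typically be a deep network trained on GPUs.

```python
# Illustrative sketch: train a cheap surrogate model on sampled outputs of an
# expensive physics simulation, then query the surrogate instead of the
# simulator. All names and numbers are hypothetical placeholders.
import numpy as np

def slow_physics_simulation(x):
    """Stand-in for an expensive physics simulation (hours per sample in reality)."""
    return np.sin(3.0 * x) + 0.5 * x**2

rng = np.random.default_rng(0)

# 1. Run the expensive simulator on a limited sample of inputs.
x_train = rng.uniform(-2.0, 2.0, size=200)
y_train = slow_physics_simulation(x_train)

# 2. Fit a fast surrogate to those results (here a simple polynomial
#    least-squares fit; a deep network would play this role in practice).
coeffs = np.polyfit(x_train, y_train, deg=8)
surrogate = np.poly1d(coeffs)

# 3. Use the fast surrogate for the bulk of the work.
x_query = np.linspace(-2.0, 2.0, 5)
print("simulator :", slow_physics_simulation(x_query))
print("surrogate :", surrogate(x_query))
```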
Zhang: GPUs have made the wider use of machine learning and AI much more practical. When you consider CPU-based training for something like complex neural networks, it has always been a painfully slow process, which limited it to rare uses. However, with GPUs, these tasks can be completed in just a couple of hours, which dramatically changes the feasibility of different applications. It’s much more feasible to engage in a variety of projects, thanks to the reduced cost and increased efficiency of GPUs. This efficiency is enabling us to be more exploratory in our endeavors. It’s now practical to apply machine learning to a broader range of applications because the cost barrier is significantly lower. All vendors have embraced this shift, incorporating machine learning models and algorithms across various applications. The rising costs and complexity would have made this nearly impossible without the aid of GPUs. As for handling complex calculations, the approach has shifted. Previously, one might not consider doing certain types of computation-intensive tasks due to their prohibitive costs. Now, with GPUs, we can either offload these tasks to speed them up significantly, or train sophisticated neural models for better results. This makes machine learning applications not just practical, but also cost-effective and widely accessible.
Xiong: From the perspective of inspection equipment, the primary reason for leveraging GPUs is centered around deep learning applications. In the past, the tradeoff between flexibility and cost for GPUs wasn’t as clear-cut. However, as deep learning has increasingly been integrated into inspection algorithms, the decision to utilize GPUs has become an obvious one. Deep learning requires a lot of computational power, which GPUs are well-equipped to provide. This allows us to process complex data and perform intricate analyses much more efficiently.
Fujimura: In general, GPUs are utilized across a broad spectrum of applications, including simulation of nature and image processing. But you’re right that the rise of deep learning has significantly accelerated the adoption of GPUs. Xiong’s inspection example is a good one, particularly for automatic categorization, among other things. Humans can tire, but machines do not, which boosts efficiency and reduces errors. In recognition tasks, like identifying defects or categorizing objects, deep learning has been pivotal. Six years ago, for instance, distinguishing images of cats and dogs served as a classic example of deep learning applications. Similarly, recognizing different types of defects in materials or products has been greatly enhanced by deep learning, illustrating its practical value. I agree with Adam and Zhang, too, that training of deep learning networks is particularly enhanced by GPU-based computing. Deep learning has proven to be particularly useful in software that requires iterative optimization. For example, in processes that might normally take 50 iterations, deep learning can expedite this by allowing us to skip the first 30 iterations through inferencing, making the process much faster.
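As a toy sketch of that warm-start idea, the snippet below runs a simple iterative solver twice: once from a conventional starting point and once from a guess we pretend was inferred by a trained model. The solver, the target, and the guesses are purely illustrative, not any production OPC or ILT flow.

```python
# Toy illustration of "skip the first N iterations": an iterative optimizer
# converges in far fewer steps when started from a learned prediction of the
# answer instead of a default guess. Purely illustrative numbers.
def solve(start, target, lr=0.1, tol=1e-6):
    """Gradient descent on (x - target)^2; returns the iteration count."""
    x, iters = start, 0
    while (x - target) ** 2 > tol:
        x -= lr * 2.0 * (x - target)   # gradient step
        iters += 1
    return iters

target = 7.3
cold_start = 0.0    # conventional starting point
warm_start = 7.0    # pretend a trained model inferred this initial guess

print("iterations from cold start:", solve(cold_start, target))
print("iterations from warm start:", solve(warm_start, target))
```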
Addressing the question of whether the need for machine learning drove GPU adoption or vice versa — the chicken-or-egg scenario — deep learning was enabled by the availability of economically viable GPUs. Affordable GPUs have democratized computing, making powerful processing accessible to a wide range of users. For example, where one Cray-2 supercomputer, rated at 1.9 GFLOPS, might have cost $1.5 million 40 years ago, today researchers can purchase a GPU rated at 42,600 single-precision GFLOPS for a personal computer for around $2,000. This accessibility has been a game-changer, empowering more individuals and organizations to engage in advanced computational tasks, including simulations of natural processes. Neural networks are natural processes in our brains, too, so it makes sense that GPUs are also good at deep learning. This democratization of computing has not only propelled the advancement of deep learning, but has also transformed the landscape of technological innovation.
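The rough arithmetic behind that comparison, using the figures as quoted above:

```python
# Cost per GFLOPS, Cray-2 vs. a modern consumer GPU (figures as quoted above).
cray2_cost, cray2_gflops = 1_500_000, 1.9      # ~$1.5M, 1.9 GFLOPS
gpu_cost, gpu_gflops = 2_000, 42_600           # ~$2K, 42,600 SP GFLOPS

cray2_dollars_per_gflops = cray2_cost / cray2_gflops   # ~$789,000 per GFLOPS
gpu_dollars_per_gflops = gpu_cost / gpu_gflops         # ~$0.05 per GFLOPS

print(f"Cray-2: ${cray2_dollars_per_gflops:,.0f} per GFLOPS")
print(f"GPU:    ${gpu_dollars_per_gflops:,.2f} per GFLOPS")
print(f"Improvement: ~{cray2_dollars_per_gflops / gpu_dollars_per_gflops:,.0f}x "
      "(ignoring 40 years of inflation)")
```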
SE: What’s necessary for manufacturers to incorporate GPUs into the process line, and what are the challenges?
Fujimura: From my perspective, incorporating GPUs into manufacturing lines presents minimal challenges, especially when it comes to software vendors. We all are expected to provide support for GPUs, and already use them extensively in our processes. Equipment vendors are also on board. They’ve integrated GPUs seamlessly into their setups, as evidenced by companies like TSMC. In terms of hardware integration, manufacturers simply need to add servers or other hardware capable of supporting GPU technology into their processes. For example, at a recent Synopsys Tech Forum, TSMC discussed their substantial GPU infrastructure, highlighting their ‘large farm of GPUs’ and their plans to expand this capability further due to the benefits GPUs offer, especially for processing that involves curvilinear mask shapes. Overall, the main requirement for manufacturers looking to efficiently handle curvilinear mask shapes is to use GPU acceleration, which many have already begun to do effectively.
Zhang: GPUs are already widely utilized across various applications, from deep learning model training to inspection applications. However, the decision to integrate GPUs into a production environment hinges on several factors. The integration must make economic sense. If adopting GPUs doesn’t reduce costs, there’s often hesitation. There are also broader considerations across applications, such as obsolescence concerns. GPU vendors typically update their hardware every couple of years, phasing out older generations. This rapid cycle can disrupt long-term usability and compatibility, as older models become unavailable and new models might not provide consistent outputs, especially in applications where consistency is critical, like in optical proximity correction (OPC), where the established methods are considered golden. Additionally, there’s the technical aspect of integration. Software initially written for CPUs may need substantial modifications to run effectively on GPUs. This adaptation can require significant effort from both the software vendor and the customer, who may need to rewrite existing code specifically for GPU use. Deciding what processes to offload to the GPU and what to keep on the CPU is a crucial part of this integration, as not all tasks are cost-effective to transition. Moreover, integrating GPUs requires efficient handling of the additional overhead of data communication with CPUs, which was not a concern with CPU-only setups. Before making such transitions, customers need to assess whether the investment will pay off in terms of cost reduction and improved efficiency. While the benefits of GPU integration can be significant, the decision to adopt this technology involves careful consideration of its economic justification, the potential for obsolescence, the need for software rewrites, and the overall return on investment. These factors collectively contribute to the cautious approach of some organizations regarding the full adoption of GPUs.
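A back-of-the-envelope sketch of the offload decision Zhang describes: moving work to the GPU pays off only when the speedup outweighs the cost of shipping data between host and device. Every throughput number below is a hypothetical placeholder, not a measurement of any real system.

```python
# Hypothetical offload heuristic: compare CPU compute time against GPU compute
# time plus data-transfer time. All rates are illustrative assumptions.
def cpu_time_s(work_gflop, cpu_gflops=100.0):
    return work_gflop / cpu_gflops

def gpu_time_s(work_gflop, data_gb, gpu_gflops=20_000.0, pcie_gb_per_s=16.0):
    transfer = 2 * data_gb / pcie_gb_per_s   # copy inputs over, results back
    compute = work_gflop / gpu_gflops
    return transfer + compute

for work, data in [(50, 4.0), (50_000, 4.0)]:   # (GFLOPs of work, GB of data)
    cpu, gpu = cpu_time_s(work), gpu_time_s(work, data)
    choice = "offload to GPU" if gpu < cpu else "keep on CPU"
    print(f"work={work:>7} GFLOP, data={data} GB -> CPU {cpu:.2f}s, "
          f"GPU {gpu:.2f}s -> {choice}")
```

With the small workload, transfer overhead dominates and the task stays on the CPU; with the large workload, the GPU wins despite the same transfer cost, which is the trade-off behind deciding what to offload.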
Adam: Customers are well-informed and attentive regarding the integration of GPUs in production environments. They are definitely excited, but also cautious. The decision to adopt GPUs hinges significantly on the cost-effectiveness of such a move. Each customer calculates their own cost of ownership and the point at which investing in GPUs becomes financially viable, considering various factors beyond the mere cost comparison of CPUs versus GPUs. One critical aspect they consider is the long-term stability of their production lines. For instance, if a production line is expected to run stably for 10 years, customers need to ensure they can source the required GPUs throughout that period. Additionally, the transition involves significant software adaptations. Customers need to assess the investment required to adapt their software frameworks for GPU compatibility. This includes considering the programming layers offered by different electronic design automation (EDA) vendors and determining which one aligns best with their needs. It’s important also to consider applications that require high stability over long periods, such as optical proximity correction (OPC) and inverse lithography technology (ILT). GPUs are already widely accepted for tasks like training models and image-to-image comparisons due to their unmatched computational capabilities, but the decision to fully commit to GPUs for more critical applications like OPC or ILT is still under deliberation.
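A simple way to frame the cost-of-ownership question Adam raises is a break-even calculation: how long it takes the CPU hours saved per mask to repay the capital cost of the GPU hardware. Every number in the sketch below is an assumed placeholder, not vendor or fab data.

```python
# Hypothetical break-even sketch for GPU adoption. All figures are assumptions.
gpu_server_cost = 250_000.0        # assumed capital cost of a GPU server farm
cpu_hours_saved_per_mask = 5_000   # assumed reduction in CPU hours per mask
cost_per_cpu_hour = 0.05           # assumed fully loaded cost of one CPU hour
masks_per_year = 400               # assumed annual volume through the farm

savings_per_year = cpu_hours_saved_per_mask * cost_per_cpu_hour * masks_per_year
payback_years = gpu_server_cost / savings_per_year
print(f"Annual savings: ${savings_per_year:,.0f}; payback: {payback_years:.1f} years")
```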
SE: Are there any areas where you see an opportunity for GPU acceleration but where it’s not being used yet?
Adam: There are some innovative ideas we’re considering for future GPU acceleration applications. However, the realization of these ideas is contingent upon several factors that still need to unfold. Our teams, which consist of exceptionally talented individuals, are not only planning for the immediate future but are also looking five or more years ahead. As GPU architecture continues to evolve and adoption increases, there will undoubtedly be new and intriguing opportunities that emerge. These developments are promising, and we are keenly observing the landscape to identify areas where GPU acceleration can be further leveraged.
Zhang: The overarching promise of AI involves significant shifts in operational processes. Specifically, we’re looking at how AI, powered by the computational capabilities of GPUs, could potentially replace tasks currently performed by humans. This isn’t just about substituting CPUs with GPUs. It’s about leveraging the enhanced processing power of GPUs in conjunction with advances in machine learning to automate complex tasks. While this transition may still be some time away, the pace at which technology evolves suggests it could happen sooner than anticipated. We’ve seen rapid transformations before, such as when mobile technology evolved quickly and significantly altered many aspects of daily life and business operations. Some industries have already started to adopt such AI-driven applications. Therefore, there is a real opportunity here to consider how we can further harness GPU-accelerated AI to replace or augment human work in certain areas, potentially leading to greater efficiency and innovation in various fields.
Xiong: The popularity of machine learning in recent years underscores the presence of numerous opportunities. It’s not so much a question of whether opportunities exist, but rather a matter of prioritizing them. While some applications are at the forefront of development due to their immediate relevance or potential impact, this does not mean that other areas are unknown. They are simply deemed secondary or tertiary in priority. The ongoing interest and daily contemplation by professionals across various fields reflect a widespread acknowledgment of machine learning’s potential. Everyone is thinking about how to integrate these technologies into their operations, which suggests that as priorities shift and evolve, so too will the applications of machine learning in less explored areas.
Fujimura: The prevailing method in semiconductor mask making is largely based on the traditional Manhattan style, but there’s a consensus that a shift toward curvilinear mask making is on the horizon. This transition is likely to necessitate the use of GPUs, given their superior processing power and capability to handle complex calculations. This move to curvilinear design involves a shift not only in the technical approach, but also in the computational mathematics that support these designs, which is what I refer to as ‘useful waste.’ Moreover, the increasing applications of deep learning across various fields further amplify the importance of GPUs. Deep learning inherently demands substantial computational resources, which GPUs are well-equipped to provide. Thus, as we make this shift in the mask making world and continue to expand the use of deep learning, the value and necessity of GPUs in electronic design automation (EDA) become even more pronounced. This is a critical transition, one that underscores the evolving needs of our industry and highlights the growing role of GPUs in meeting these demands.
Read part one of the discussion:
Navigating The GPU Revolution
Potential cost and time benefits are driving GPU adoption, despite challenges.