Why the chip industry is so focused on large language models for designing and manufacturing chips, and what problems need to be solved to realize those plans.
AI holds the potential to change how companies interact throughout the global semiconductor ecosystem, gluing together different data types and processes that can be shared between companies that in the past had little or no direct connections.
Chipmakers always have used abstraction layers to see the bigger picture of how the various components of a chip go together, allowing them to pinpoint potential problems and fixes much faster than with a divide-and-conquer engineering approach. In fact, most issues are identified at the earliest stage of the design flow, where they require the least amount of time and effort to fix.
But not every problem is caught, even with the best of tools. That’s why so much of the design process is spent on verification and debug, and why there are an increasing number of test and inspection insertion points during manufacturing and packaging. Despite these efforts, some devices still fail in the field. At best, they can be patched with software, plugging security holes or mitigating excessive battery drain. At worst, tracing problems such as silent data corruption can take months. And some problems may be process-specific, such as pitting or warpage in a substrate, unexpected material breakdown due to heat in one or more processes in the fab, or unexpected interactions between chiplets in a domain-specific design.
On the positive side, there is no shortage of data to sort through when problems do occur. Gigabytes to terabytes of data are compiled at nearly every process step. The problem is that data is often unique to specific processes, and not necessarily useful for other process steps. For example, inspection data is very different from electrical test or vibration data, or thermal measurements taken inside of chambers, and data analytics may focus only on one or two measurements to identify problems. Typically, much of that data is abstracted, making it even more difficult to identify and assess problems.
This is where large language models fit in. LLMs allow any data or data type to be shared, from architectural exploration and floor-planning all the way to final test in complex systems. It’s also why there is so much buzz around LLMs across the chip industry. Unlike typical abstraction layers, which are one or more steps removed from whatever is happening in a particular process, LLM abstractions can be horizontal and all-encompassing, identifying patterns that are not obvious between different process steps and workflows. That makes it possible for engineers working at any step in the design-through-manufacturing flow, or even those tracing problems in the field, to tap into relevant data from one end of the ecosystem to the other.
Fig. 1: Where LLMs fit into the AI landscape. Source: Tignis
“Generative AI is allowing us to dream big for a number of problems,” said Stelios Diamantidis, head of the Generative AI Center of Excellence at Synopsys. “If you think about chip design flows, we’ve been bringing AI technology that has been helping us with optimization using reinforcement learning, being able to get to solutions faster and with less effort. Generative AI augments that. First, it helps with bringing up new talent. It helps us get to information faster. It helps us train people and share across the team through turning user interfaces into natural language interfaces. That’s the first impact point. Further down the road, the entire design flow can come together through generative AI, essentially becoming the glue for all these capabilities through its ability to do some planning and some reasoning and solve problems at a high level. That’s where the industry ultimately will move.”
And this is where things can get especially interesting for the chip industry. “We use language in all design abstractions, and large language models already help us connect a lot of these knowledge points, but going forward those models will evolve,” Diamantidis said. “Maybe we’re going to talk about large language models for other kinds of data — abstractions that will help us connect the dots even more intrinsically across the design cycle. So you’ll be able to pull information from manufacturing and correlate it with data from architecture, for example, because you’ve built semantic spaces around those data abstractions.”
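As a rough illustration of that idea, the sketch below embeds short text summaries of records from different parts of the flow into one vector space and scores their similarity. The embed() function is a stand-in for any real embedding model, and the record contents are invented; it is only meant to show how a shared semantic space could let architecture, fab, and test data be compared at all.

```python
# Minimal sketch of the "shared semantic space" idea described above:
# heterogeneous records (architecture notes, fab metrology, test logs) are
# summarized as text, embedded into one vector space, and correlated by
# proximity. embed() is a placeholder, and the records are invented examples.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: in practice this would call a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=384)
    return v / np.linalg.norm(v)

records = [
    ("architecture", "clock domain crossing between NoC and DDR PHY"),
    ("fab",          "via resistance drift on metal 6, lot 1123"),
    ("final_test",   "intermittent DDR training failures at high temperature"),
]

vectors = {f"{src}: {txt}": embed(f"{src}: {txt}") for src, txt in records}

# Cosine similarity across domains: high scores flag records worth
# correlating, even though they come from different process steps.
keys = list(vectors)
for i in range(len(keys)):
    for j in range(i + 1, len(keys)):
        sim = float(vectors[keys[i]] @ vectors[keys[j]])
        print(f"{sim:+.2f}  {keys[i]}  <->  {keys[j]}")
```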
Fig. 2: Second wave of AI using generative AI models. Source: Synopsys/Hot Chips 2024
That view is being echoed across other parts of the IC ecosystem. “The whole idea is to expand our focus on wafer sort and package test and move back into areas like post-silicon validation, with strong linkages into EDA — pushing out into system-level test and system test, and then being able to provide solutions across the entire spectrum,” said Ira Leventhal, vice president of applied research and technology at Advantest. “It’s being able to test at wafer, wafer sort, wafer acceptance, parametric test, and then even back into post-silicon validation, and then moving forward from the wafer. And before you get into the final package, there can be a singulated die test, or you can build up some sort of interim structure that is going to then go into the final substrate. Each of these is a test insertion for which our customers need to have those capabilities and put them together in various ways, depending on the particular application.”
This becomes particularly attractive with advanced packaging. “It becomes a lot more interesting when you have the kind of disaggregated, heterogeneous integration with chiplets from everybody,” said Nitza Basoco, technology and marketing director at Teradyne. “And so now we’re seeing groups of companies getting together and working on this. If they all agree that you can run these kinds of things, then you can come back and say, ‘This is where I might be having an issue.’ Before this, a lot of people were worried that if they share too much information, it might be used against them. But at the same time, if you really need to figure out where a problem is coming from — because this is not just one vendor — then we need to be proactive about how we’re going to find these things.”
It’s a revolutionary idea, but it only works if three main challenges can be solved: establishing enough trust between companies to share data, connecting that data end-to-end across the design-through-manufacturing flow, and processing it efficiently enough to be sustainable.
Key to all of this is developing sufficient rules and standards to establish trust between different companies, which in a highly competitive global industry is non-trivial.
“All of these things are data-driven,” said Kunle Olukotun, co-founder and chief architect at SambaNova Systems. “If I’m Company A, and I understand something about my designs, do I really want the information to go to Company B through my EDA vendor? If you don’t have a repository of design data, you can’t train a good model. The EDA companies can’t train good models without the design data of their customers. That’s the fundamental wrinkle that needs to be solved specifically for chip design.”
End-to-end data
Assuming the data sharing and security issues can be solved — and currently there is much discussion underway across the semiconductor industry about how to achieve that — then the impact on every segment, from design through manufacturing, could be significant.
“The orchestration of data is one of our first targets, because in the test world you have three to eight wafer sort test steps, then packaging, and then a number of final tests and system-level and early system-level tests,” said John Kibarian, CEO of PDF Solutions. “In the end you want to take parameters extracted from one insertion point, extract figures of merit, and pass that down to the next to make a control decision at that next point that is informed by both upstream and downstream data. And you want to do that across the production flow, because often there’s one facility where the wafer sort is done, and another where the final test and potentially system-level tests are done. And customers that are now in the business of building boards or cards or entire systems will have a third system-level test capability at a company like a Foxconn.”
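A minimal sketch of that hand-off, under assumed field names and a made-up guard-band rule rather than any real PDF Solutions schema, might look like this:

```python
# Figures of merit extracted at one test insertion travel with the unit so a
# later insertion can make an informed control decision. The metrics and the
# guard-band rule below are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class UnitRecord:
    unit_id: str
    figures_of_merit: dict = field(default_factory=dict)  # insertion -> {metric: value}

    def record(self, insertion: str, metrics: dict) -> None:
        self.figures_of_merit[insertion] = metrics

def final_test_limits(unit: UnitRecord) -> dict:
    """Tighten final-test limits for units that looked marginal at wafer sort."""
    sort = unit.figures_of_merit.get("wafer_sort", {})
    marginal = sort.get("idd_leakage_ua", 0.0) > 40.0
    return {"vmin_guardband_mv": 50 if marginal else 25}

unit = UnitRecord("lot7-wafer3-die118")
unit.record("wafer_sort", {"idd_leakage_ua": 43.2, "ring_osc_mhz": 912.0})
print(final_test_limits(unit))   # -> {'vmin_guardband_mv': 50}
```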
The next step is to be able to tap into LLMs in a controlled way. So rather than sharing all data, existing tools need to be able to partition that data so that only specific data gets shared. The goal is to tie existing tools and processes into the LLMs only where it makes sense, and to prevent proprietary data from leaking out. This is a significant challenge, but it’s one that needs to be solved to make all this work.
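One simple way to express that kind of partitioning is an explicit allow-list per partner, applied before anything leaves the building or reaches a shared LLM. The field names and partner policies below are hypothetical:

```python
# Only the fields a given partner is entitled to see cross the company
# boundary; everything else is stripped before sharing.
SHARE_POLICY = {
    "osat_partner": {"lot_id", "bin_code", "test_insertion", "yield_pct"},
    "eda_vendor":   {"bin_code", "failure_signature"},
}

def redact(record: dict, partner: str) -> dict:
    """Return only the fields this partner is allowed to see."""
    allowed = SHARE_POLICY.get(partner, set())
    return {k: v for k, v in record.items() if k in allowed}

full_record = {
    "lot_id": "L4471", "bin_code": 7, "test_insertion": "final_test",
    "yield_pct": 93.4, "failure_signature": "scan_chain_3_stuck",
    "mask_revision": "B2",               # proprietary, never shared
    "recipe_settings": {"temp_c": 385},  # proprietary, never shared
}

print(redact(full_record, "eda_vendor"))
# -> {'bin_code': 7, 'failure_signature': 'scan_chain_3_stuck'}
```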
“Like a big arc, the trend we’re seeing for customers is more and more things are moving to where the data is,” Kibarian said. “That could be a partnership with K&S on ML for wire-bonding, where you move the algorithms to the bonder. Otherwise, to get the data from the wire bonder up to a central computer on a floor that has 100 wire bonders creates an outrageous amount of data traffic on the network. It’s a lot easier to go the other way.”
To achieve this, what’s needed is a model that is flexible and can glue together a lot of different IPs and relate them in certain ways, as opposed to creating a model that attempts to do everything at once. “But this is going to take some thinking,” said Chris Mueth, new opportunities business manager at Keysight. “There are some model types that try to do some of that, but they’re not mainstream. They’re unique to different vendors. Anytime you have something unique to a vendor, it never propagates. However, even before we get there, we must tie the domains together a little closer because you can invent the model, but if you don’t have the domains tied together, no one is going to use it. It’s a tricky little problem to try to get the two domains to talk together, to exchange IP. As a vendor that spans both worlds, it’s not easy, and it wouldn’t be easy among different vendors in different domains.”
And while much of the buzz is around LLMs, there are other pieces of the data puzzle that need to be sorted out to connect everything using LLMs and other AI technologies.
“The blocker to this vision is less about technology and more just about process and data silos and organizational issues,” said Jon Herlocker, president and CEO of Tignis. “As an industry we need to squeeze more and more blood out of the lemon, and the only way to do that is to look deeper at the data. But some of the challenges are just about data infrastructure. The hype around LLMs is getting everyone excited about the idea of doing smarter data analysis, but today it’s not the LLMs that will make a difference. The real challenge is overcoming the logistical, internal process barriers to start handing over data to each other. Those data schemas are often different, and getting them aligned is painful.”
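The schema problem is easy to see in miniature. In the sketch below, two testers report the same die temperature under different names and units, and a small mapping layer normalizes both into one canonical record before any joint analysis; all names and conversions are invented for illustration.

```python
# Map vendor-specific field names and units onto one canonical schema so
# data from different tools can be analyzed together.
CANONICAL_MAP = {
    "tester_a": {"DIE_TEMP_F": ("die_temp_c", lambda f: (f - 32) * 5 / 9)},
    "tester_b": {"temp_degC":  ("die_temp_c", lambda c: c)},
}

def normalize(source: str, raw: dict) -> dict:
    out = {}
    for key, value in raw.items():
        if key in CANONICAL_MAP.get(source, {}):
            canon_key, convert = CANONICAL_MAP[source][key]
            out[canon_key] = round(convert(value), 2)
    return out

print(normalize("tester_a", {"DIE_TEMP_F": 221.0}))   # {'die_temp_c': 105.0}
print(normalize("tester_b", {"temp_degC": 104.6}))    # {'die_temp_c': 104.6}
```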
More efficiency needed
Two other big challenges need to be addressed, as well. The first is how to efficiently deal with a flood of data, and that comes down to the architecture of the systems used to process that data, from memory and processors to the PHYs used to connect them.
The second challenge is how to reduce the number of generative AI queries, which consume an enormous amount of energy. Trillions of LLM queries are unsustainable from an energy-grid and usage standpoint, so the results of each query need to be more accurate and more tightly defined, and more of the data processing needs to happen locally. This is essentially a partitioning of work between the edge and the cloud, where the LLMs are created, maintained, and updated in hyperscaler data centers, while more narrowly focused machine learning algorithms run at the edge.
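A sketch of that split might look like the following, where a cheap local statistical check screens every reading and only the rare outlier triggers a costly hosted-LLM query. The threshold, data, and query_cloud_llm() stub are assumptions for illustration.

```python
# Screen measurements locally; escalate only unexplained outliers to the cloud LLM.
import statistics

RECENT_WINDOW = [3.01, 2.98, 3.02, 2.99, 3.00]   # recent leakage readings (uA)

def needs_escalation(reading_ua: float, window=RECENT_WINDOW, k: float = 4.0) -> bool:
    """Cheap local check: escalate only readings far outside recent behavior."""
    mu = statistics.mean(window)
    sigma = statistics.stdev(window) or 1e-9
    return abs(reading_ua - mu) > k * sigma

def query_cloud_llm(context: str) -> str:
    # Placeholder for an expensive hosted-LLM call.
    return f"[LLM analysis requested for: {context}]"

for reading in [3.01, 3.03, 7.85]:
    if needs_escalation(reading):
        print(query_cloud_llm(f"leakage outlier {reading} uA"))
    else:
        print(f"{reading} uA handled locally, no LLM query issued")
```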
The upside, at least from a cost perspective, is that AI/ML can make engineers more productive, which is essential given the talent shortage facing the chip industry. But it also means that engineers will need to understand much broader portions of the ecosystem, rather than working in a narrow slice of the design or manufacturing flow.
“Society’s dependence on us has radically shifted,” said Mike Ellow, CEO for Silicon Systems at Siemens Digital Industries Software. “They need us to be successful, and semiconductors need an infrastructure that is reliable and resilient, and a way to deliver that with fewer engineers graduating from universities. So how are you going to get a multiplying effect for early-career and veteran engineers to address the challenge in the volume of designs? This is not about the volume of chips. It’s about the number of designs that have to happen over time in anticipation of where AI can take us in the future. In the past, you could see design starts ramping up, and technology nodes were fairly predictable. You had the ups and downs, the boom-bust cycles. But we will hit a state at some point in the future where we are running all out, and we are not going to experience the same booms and busts. We’ll see companies come and go, of course, but some people will fail faster because the time interval between success and failure will be very much compressed.”
Moreover, that compression will happen across industry segments. Advantest’s Leventhal, who runs the Heterogeneous Integration Roadmap’s test data analytics working group, said there is growing momentum behind this kind of cross-segment interaction. “Several people in the working group said, ‘Let’s just start.’ It’s kind of like, ‘Can we rebuild the engine while we’re flying the plane?’ So we’re looking at what can be done with unsupervised learning techniques, or reinforcement learning, where you’re trying to optimize toward a certain goal, as opposed to relying on having a bunch of trained models or time series-based algorithms where you’re looking for unexpected behavior.”
Leventhal noted that the best approach is not to single out which one of these techniques works best, but to figure out how the best results come from combining them. “There’s actually a name for this,” he said. “It’s called deep reinforcement learning. You’ve got a supervised learning model, some sort of trained deep learning model that’s working down at the reward level, and then you’ve got the reinforcement learning algorithm that’s making the decisions about what step to take next. And you can apply unsupervised learning to tie together the multiple steps.”
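A toy version of that division of labor, with a stubbed-in “trained model” supplying rewards and a single-state Q-learning policy choosing the next test step, could look like this; the actions and numbers are illustrative only.

```python
# Deep-RL pattern in miniature: a trained model scores outcomes (reward level),
# an RL policy decides the next step. The reward model here is a stub.
import random

ACTIONS = ["add_burn_in", "skip_burn_in", "extra_scan_patterns"]
q_table = {a: 0.0 for a in ACTIONS}          # single-state Q-learning for brevity

def trained_reward_model(action: str) -> float:
    """Stand-in for a supervised model predicting an escape-rate/cost tradeoff."""
    base = {"add_burn_in": 0.6, "skip_burn_in": 0.2, "extra_scan_patterns": 0.8}
    return base[action] + random.gauss(0, 0.05)

alpha, epsilon = 0.1, 0.2
for _ in range(500):
    if random.random() < epsilon:                       # explore
        action = random.choice(ACTIONS)
    else:                                               # exploit
        action = max(q_table, key=q_table.get)
    reward = trained_reward_model(action)               # reward comes from the model
    q_table[action] += alpha * (reward - q_table[action])

print(max(q_table, key=q_table.get))   # learned preferred test strategy
```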
Impact on chip architectures and technologies
In theory, a combination of LLMs and machine learning should allow for higher yield, improved reliability and performance, and much simpler integration of heterogeneous compute elements in a design.
“There’s a layer of the data and the data ownership, and the ability to share that data when you want it with as little or as much visibility as you want to give to a certain company,” said Teradyne’s Basoco. “That’s one aspect of it. Another aspect is that this starts at the very beginning of the design stages, at the architecture. We’re designing it, we’re running simulations. We’re throwing simulations back and forth and looking at it from a much bigger, broader level than just me and my device. It’s not just me and my chip anymore. It’s me, my chip, the substrates, the interconnects, the other chips. Is it going to be affecting Chip A on the other end of the data path, and how is it going to affect Chip C, which may be nearby but not really interacting with Chiplet A or Chiplet B?”
That requires a lot of coordination across the ecosystem. Just expecting an LLM to solve that is unrealistic. The result, at least in the short term, likely will be a bifurcation of the market, with systems companies developing the most advanced chips for training LLMs on one side, and on the other, designs built from more generalized, flexible, and reusable components that are narrowly focused.
This is one of the reasons FPGAs and DSPs have been so successful in the past. They provide a cushion for change. And for companies that are racing to get to market first, NVIDIA’s CUDA-based re-usable models have proven very attractive. But as more heterogeneity creeps into designs, these worlds will begin melding together in new ways, and flexibility in the form of chiplets and more standardized components will be required.
“When you think about chiplets, it’s going against the slowdown in Moore’s Law,” said Patrick Soheili, chief strategy and business officer at Eliyan. “If I want to build [an NVIDIA] Blackwell chip, I need the highest possible bandwidth, the lowest possible power, the largest possible bandwidth per millimeter of die edge, and the smallest PHY area in my ASIC. Those are things that are really important. All of these types of companies are working at between 5 and 20 terabits per second per millimeter. If you don’t have that, then two GPUs that are connected are not going to act as one chip. You’re going to miss out on latency, power, and performance.”
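A quick back-of-envelope check shows why that per-millimeter figure matters; the die-edge lengths below are illustrative, not taken from any specific product.

```python
# Aggregate die-to-die bandwidth scales with the die-edge length devoted to
# the interface times the per-millimeter rate quoted above.
for edge_mm in (5, 10, 20):
    for tbps_per_mm in (5, 20):
        print(f"{edge_mm} mm of die edge @ {tbps_per_mm} Tbps/mm "
              f"= {edge_mm * tbps_per_mm} Tbps aggregate")
```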
Those systems of chips will be used to create LLMs, and the LLMs will be used to ensure the individual chiplets, the chips, and entire systems of both work as expected and yield properly, from architecture through final test.
Reality check
This is all still in the future, and while work on LLMs is progressing at astounding speed, not all of the pieces are moving at the same speed. It likely will take years before this end-to-end vision becomes reality, and how complete the data chain ultimately becomes is anyone’s guess at this point.
“It’s important when we look at a topic like this to pump the brakes a little bit and make sure we’re intellectually honest about the problem we’re trying to solve, and the strengths and limitations of the solutions that we’re proposing for those problems,” said Rob Knoth, group director for strategy and new ventures at Cadence. “We’re talking about gluing together different tools, different data formats, all in the spirit of discovering unexpected connections between things to help explain when something went bump in the night and prevent it from happening in the first place. That is a fantastic thing when you’re talking about, say, the debug of a problem, but that’s not all we do in EDA, and that’s not going to magically solve every problem in EDA.”
The amount of data required to design, manufacture, and test complex chips and systems is enormous. “Will LLMs play an important role today and in the future of interfacing between disparate data sets, dramatically changing how we do debug insights? The answer is 100% yes, and we are on that course with our partners, doing real production work in that area,” Knoth said. “Is it going to take the place of engineering software as a whole? Are we no longer going to have to think about data structures in the engineering software itself? Not at all. We can’t just fill up our coffee mugs and let AI just crank on it a little more. If you think about NVIDIA’s Grace Blackwell with its 200 billion transistors — and then you think about the number of interconnects between all those transistors, as well as all the functional waveforms, the states of ones and zeros on those 200 billion transistors — that is a problem that can’t just be left to the latest chatbot to solve.”
And then there is the problem of the accuracy of LLMs (see related story). “For certain, LLMs will wipe out a whole class of bad writing jobs (cranking out romance novels, writing game summaries of baseball games for websites, creating clickbait copy, writing product descriptions on a shopping website, etc.) and they can be useful in quickly summarizing content to give the introductory reader a gist of a dense subject,” said Steve Roddy, chief marketing officer at Quadric. “But by definition, LLMs make stuff up, so at best they are an adjunct to real people with real knowledge doing the real work. I personally fail to see how an error-prone tool could help a test engineer refine test patterns on the assembly and test floor, or guide a process engineer how to tweak the recipe on a fab tool.”
Conclusion
At the very least, the buzz around LLMs, amplified by the surprise release of ChatGPT two years ago, has prompted the chip industry to ponder the possibilities and value of connecting data. Everyone involved agrees this has a long way to go before it lives up to the hype surrounding it.
“You have to start classifying all of your information and tagging it to say, ‘This document is useful in these contexts,’” Herlocker explained. “So you start tagging all your information to be used in these contexts, and then the query system takes your query and translates it to sort through the tags, which you then feed into your large language model to answer the question. That’s the best-practice pattern for using large language models today, but it takes a lot of work. It’s not as if I can just throw all my data into a large language model and ship it to you.”
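That tag-then-retrieve pattern can be sketched in a few lines; the tags, documents, and llm() stub below are illustrative assumptions, not Tignis’ implementation.

```python
# Documents are tagged with the contexts they are valid for, a query is mapped
# to tags, and only matching documents are passed to the LLM as context.
DOCS = [
    {"text": "Burn-in recipe rev C for automotive grade parts", "tags": {"test", "automotive"}},
    {"text": "Known scan-chain issue on metal-stack rev B2",    "tags": {"debug", "design"}},
    {"text": "Q3 marketing plan",                               "tags": {"business"}},
]

def query_to_tags(query: str) -> set:
    """Naive tag extraction; a production system would use a classifier."""
    vocab = {"test", "automotive", "debug", "design", "business"}
    return {w for w in query.lower().split() if w in vocab}

def llm(query: str, context: list) -> str:
    # Placeholder for the actual LLM call.
    return f"LLM called with {len(context)} tagged document(s) as context"

def answer(query: str) -> str:
    tags = query_to_tags(query)
    context = [d["text"] for d in DOCS if d["tags"] & tags]
    return llm(query, context)

print(answer("Why do automotive test escapes spike after burn-in?"))
```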
Even figuring out which is the best path forward remains a challenge given the flood of possibilities and rapid changes in AI. “I don’t think that the people who built the transformer envisioned that we’d still be using transformers at this point,” said Patrick Donnelly, solutions architect at Expedera. “And I don’t know if the people who developed convolutional nets thought that would be what we would use for vision. Maybe we are going to use transformers going forward, but a lot of people who look at it critically would say this is not the most efficient way to do these large language models. Or, if we’re going to develop these more specific language models that can do these more nuanced tasks, maybe we need a different approach. Or if we’re wedded to this approach, we need to tailor this approach to a bunch of different applications. So whether it’s changing the basic architecture of LLMs or extending that architecture, there’s a lot that needs to be done on the algorithmic development front before we solve these trickier problems.”
Rob Aitken, currently program manager for the National Advanced Packaging Manufacturing Program in the U.S. Commerce Department, said his personal observation is that these large language models need to be of sufficient size to do something useful. “Once you get to 70 billion or 100 billion parameters, or somewhere in that neighborhood size, then you have a model that does pretty cool things. And then you can start saying, ‘Well, let’s have multiple models and let’s have them talk to each other and learn things and do stuff.’ It’s still early days. There are people deep in the weeds who say we’re progressing to something defined. But to me, it looks like there’s a lot of experimentation around this, and at some point some set of things will pop out and we’ll say, ‘Oh yeah, this is the way to go.'”
Fig. 3: Advantages of AI-assisted digital implementation in EDA. Source: Synopsys
At this point the possibilities seem limitless, but with lots of caveats. “We truly believe we’re on the cusp of something amazing right now,” said Ravi Subramanian, general manager of the Systems Design Group at Synopsys. “It’s going to be a lot of hard work to kind of bend the universe, but the value of solving it becomes greater and greater. What we’re seeing, as you look at the evolution of large language models — multi-modal models — is that it’s not just about language. It could be about image and speech. You could have biological models. But one of the key problems is how long it takes to train these models, how much energy will be required. That starts becoming an economic question.”
Related Reading
RAG-Enabled AI Stops Hallucinations, Adds Sources
New GenAI method enables better answers and performs more functions.
AI/ML’s Role In Design And Test Expands
But it’s not always clear where it works best or how it will impact design-to-test time.
Dealing With AI/ML Uncertainty
How neural network-based AI systems perform under the hood is currently unknown, but the industry is finding ways to live with a black box.