The different classes of AI implementations in the EDA market, where they should be used, and what sort of results to expect from them.
Chip design projects are notorious for generating huge amounts of design data. The design process calls for a dozen or more electronic design automation (EDA) software tools to be run in sequence. Together, they write out hundreds of gigabytes of intermediate data on the way to creating a final layout for manufacturing. Traditionally, this has been seen as a problem. But this richness of data is now increasingly seen as a great opportunity for applying AI/ML (artificial intelligence/machine learning) techniques and launching EDA on its next great leap forward in productivity.
Indeed, all major EDA suppliers are investing heavily to create AI-enhanced design tools. While the ‘AI/ML’ umbrella terminology is useful, it obscures some fundamentally different approaches and choices on how AI techniques can be applied to EDA. This paper seeks to clarify some of these categories, describe trade-offs, and weigh the pros and cons of different AI implementations that are being used today.
I have noticed at least three distinct classes of AI implementation being offered in the EDA market:
- System optimization AI, which searches across architectural and design-parameter choices spanning many tools
- AI applied to individual tools in the design flow, implemented either outside the tool or inside its core algorithms
- Non-physical AI versus physics-based AI, which understands the physical relationships in the data it analyzes
Each of these approaches has advantages and weaknesses, and they are not mutually exclusive, so it is often reasonable to use several of them at the same time. Nevertheless, I believe it is important to gain a clearer understanding of what questions they can address, where they should be used, and what sort of results you can expect from them.
The purpose of system optimization AI is to optimize architectural choices in a true system with many different tools (system: “a group of interacting, interrelated, or interdependent elements forming a complex whole”). System designers face combinatorial explosions in their architectural choices, which leave them adrift in a sea of possibilities and make it exceedingly difficult to find an optimal combination of design parameter values. This is made even more difficult because parameter interactions are usually highly non-linear. System-oriented tools are often only weakly tied to the precise nature of the system and are designed to be broadly applicable to a whole category of problems. The goal for this class of AI tool is to find the best engineering trade-off between many disparate design choices to optimize target results. An example application could be choosing through-silicon via (TSV) spacing and power bump pitch to minimize mechanical warpage, maximum temperature, and worst-case IR drop for a 2.5D assembly with 12 chiplets and an interposer.
Ansys optiSLang system AI builds meta-models to predict the full range of system behavior based on a small number of full-accuracy simulation runs.
System optimization AI is the oldest type of AI implementation in EDA, with mature tools like Ansys optiSLang on the market since before 2019 and more recent entries from Synopsys (3DSO.ai) and Cadence (Optimality).
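To make the meta-model idea concrete, here is a minimal sketch of surrogate-based design-space exploration in the spirit described above – not optiSLang’s actual implementation. The run_full_simulation() function, parameter ranges, and metric weights are hypothetical stand-ins.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def run_full_simulation(tsv_pitch_um, bump_pitch_um):
    # Toy stand-in for an expensive multiphysics run; returns (warpage, max_temp, ir_drop).
    warpage = 0.02 * tsv_pitch_um + 0.01 * bump_pitch_um
    max_temp = 85.0 + 2000.0 / bump_pitch_um
    ir_drop = 50.0 + 0.3 * tsv_pitch_um
    return warpage, max_temp, ir_drop

rng = np.random.default_rng(0)
lower, upper = np.array([40.0, 100.0]), np.array([120.0, 300.0])  # [tsv_pitch_um, bump_pitch_um]

# 1. A small space-filling sample: ~20 expensive, full-accuracy runs.
X_train = rng.uniform(lower, upper, size=(20, 2))
Y_train = np.array([run_full_simulation(*x) for x in X_train])

# 2. Fit one cheap surrogate ("meta-model") per output metric.
surrogates = [GaussianProcessRegressor(normalize_y=True).fit(X_train, Y_train[:, k])
              for k in range(Y_train.shape[1])]

# 3. Evaluate thousands of candidate designs on the surrogates instead of the solver.
X_cand = rng.uniform(lower, upper, size=(5000, 2))
preds = np.column_stack([gp.predict(X_cand) for gp in surrogates])

# 4. Choose the best weighted trade-off, then confirm that single point with a real run.
weights = np.array([1.0, 0.5, 2.0])   # hypothetical importance of warpage / temp / IR-drop
best = X_cand[np.argmin(preds @ weights)]
confirmed = run_full_simulation(*best)
```

The key point is step 3: once the meta-model exists, the combinatorial design space can be swept cheaply, reserving full-accuracy simulation for confirming the most promising candidates.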
At a level below the system perspective, AI can also be applied to individual tools or processes in the design flow with the goal of improving their run-time, memory usage, or quality of results. When looking at individual AI-enhanced EDA tools, one of the most important distinctions is between AI implemented outside the tool and AI implemented inside the tool. The AI-outside approach is essentially an auxiliary AI wrapper that seeks to drive the EDA tool to better results by sweeping input parameter values and through guided exploration of the result space – but the core algorithms in the tool remain unchanged. A clear indication of this style of AI is that any of the AI-derived results can also be obtained through a traditional non-AI execution of the tool with the same input parameter settings.
AI-outside tries to find better values for input parameters that will drive the tool to better results. AI-inside upgrades the actual core algorithms to operate with AI insights and techniques.
Both Cadence (Cerebrus, InsightAI, etc.) and Synopsys (DSO.ai, VSO.ai, etc.) have opted mainly for AI outside their core tools, sometimes called top-down AI. An advantage of this approach is that it can quickly be replicated across a full range of tools because it is fairly generic and does not require any changes to the sensitive core algorithms inside a tool, like the router or the placer. Its value lies in quickly finding the best possible result that the tool can deliver. The tool itself is unchanged and was always capable of these results. AI-outside simply resolves the practical problem of driving these extremely complex EDA tools to a good result but, on the downside, it doesn’t actually make the tool itself any faster or any better. This severely limits the scope of benefits that AI-outside can provide. You quickly hit a ceiling where the results match those that an experienced engineer could already achieve. And that’s all the benefit you can ever get from an AI driving an essentially unchanged tool.
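As a rough illustration of the AI-outside pattern, the sketch below wraps a hypothetical place-and-route run in a simple guided-search loop. The run_pnr_tool() function, parameter names, and scoring are invented for illustration; this is not how Cerebrus or DSO.ai actually work.

```python
import random

# Hypothetical tunable inputs for a place-and-route run.
PARAM_SPACE = {
    "target_utilization": [0.60, 0.65, 0.70, 0.75],
    "clock_uncertainty_ps": [20, 40, 60],
    "congestion_effort": ["low", "medium", "high"],
}

def run_pnr_tool(params):
    # Toy stand-in for a full tool run; returns a quality-of-results score (lower is better).
    score = 100.0 * (0.75 - params["target_utilization"])
    score += 0.2 * params["clock_uncertainty_ps"]
    score += {"low": 30.0, "medium": 10.0, "high": 0.0}[params["congestion_effort"]]
    return score

def ai_outside_search(n_trials=25, seed=1):
    rng = random.Random(seed)
    best_params, best_qor = None, float("inf")
    for _ in range(n_trials):
        # Sample a candidate setting; a smarter wrapper would use the history of earlier
        # trials (e.g. a learned model) to bias sampling toward promising regions.
        candidate = {name: rng.choice(values) for name, values in PARAM_SPACE.items()}
        qor = run_pnr_tool(candidate)
        if qor < best_qor:
            best_params, best_qor = candidate, qor
    # Every result found this way was always achievable by running the unchanged tool
    # with the same settings by hand; the wrapper only automates finding them.
    return best_params, best_qor

best_settings, best_score = ai_outside_search()
```

Note that the tool invocation itself is a black box: the wrapper never changes what the tool does with a given set of inputs, which is exactly why its ceiling is the tool’s own best achievable result.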
AI on the inside is a fundamentally different approach favored by Ansys, where the core simulation and analysis algorithms are modified to operate with new AI understanding and new AI guidance. This serves to make the core engines run faster and give better results in terms of speed, capacity, and accuracy-vs-time efficiency. AI-inside achieves these benefits for all users and without any changes to the use model for the customer. A good example of AI-inside is the thermal simulation engine in RedHawk-SC Electrothermal for 3DIC analysis. Thermal simulation requires the creation of a finite element mesh as a first step. The finer the mesh, the more accurate the result, but the simulation also takes longer. Ansys’ thermal engine is able to build an adaptive mesh that is fine only where it needs to be, around thermal hotspots, and is coarser elsewhere where a fine mesh is unnecessary. The challenge with this approach is knowing ahead of time where these hotspots are located. AI offers a perfect solution here because it can very quickly estimate a rough temperature distribution that is good enough to guide the adaptive mesh builder. The benefit is that it makes thermal simulation much faster without sacrificing any appreciable accuracy. This sort of under-the-hood enhancement is sometimes called bottom-up AI, and it improves the fundamental operation of the tool whenever thermal simulation is run, in any context.
Ansys’ ML-based solver for thermal maps is an example of AI that is both inside the EDA tool and also physics-based. This approach speeds up thermal map generation by 100X. Notice how the AI understands that the total heat flux into each sub-domain must be conserved. It is not just blindly running numbers through a generic neural network – it understands what the data means and also the physics that constrains it.
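A minimal sketch of the adaptive-meshing idea described above, assuming a simple grid-based power map. The rough_temperature_estimate() predictor here is a trivial smoothing stand-in for a trained ML model; it is not Ansys’ actual engine, but it shows how a fast, approximate prediction can steer where the expensive work is spent.

```python
import numpy as np

def rough_temperature_estimate(power_map):
    # Stand-in for an ML predictor: blur the power map as a crude proxy for heat spreading.
    t = power_map.copy()
    for _ in range(5):
        t = 0.5 * t + 0.125 * (np.roll(t, 1, 0) + np.roll(t, -1, 0) +
                               np.roll(t, 1, 1) + np.roll(t, -1, 1))
    return t

def mesh_density_map(power_map, fine_cells=4, coarse_cells=1):
    # Per-tile mesh density: fine only near predicted hotspots, coarse everywhere else.
    t_est = rough_temperature_estimate(power_map)
    hotspot = t_est > np.percentile(t_est, 90)   # top 10% of predicted temperatures
    return np.where(hotspot, fine_cells, coarse_cells)

power = np.zeros((64, 64))
power[20:24, 40:44] = 5.0                        # a hypothetical high-power block
density = mesh_density_map(power)                # fine mesh only around the predicted hotspot
```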
Another example of Ansys’ AI-inside approach is in RedHawk-SC’s reporting of dynamic voltage drop (DVD) analytics. A DVD violation may be attributed to 25 or 30 aggressor cells nearby, but the probability that they all switch at the same time is very small – so what is the best subset of aggressors to select for ECO? This aggressor selection problem suffers from a combinatorial explosion of possibilities (2^25). But by applying physics-based AI analysis, RedHawk-SC is able to quickly identify the critical set of aggressors for ECO optimization without the need for extensive simulation runs. Once again, this speeds up the tool in a fundamental way for all users.
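To illustrate why this is a guided-selection problem rather than an enumeration problem, here is a toy sketch of a physics-guided ranking. The contribution model, probabilities, and coverage threshold are hypothetical and are not RedHawk-SC’s actual method.

```python
def critical_aggressors(aggressors, victim_drop_mv, coverage=0.8):
    # Rank by expected contribution = physical drop contribution x switching probability,
    # then greedily take aggressors until `coverage` of the victim's drop is explained.
    ranked = sorted(aggressors,
                    key=lambda a: a["drop_mv"] * a["switch_prob"],
                    reverse=True)
    chosen, covered = [], 0.0
    for a in ranked:
        chosen.append(a)
        covered += a["drop_mv"] * a["switch_prob"]
        if covered >= coverage * victim_drop_mv:
            break
    return chosen

# Hypothetical data: 25 nearby aggressors would mean 2**25 possible subsets to enumerate;
# the ranking above avoids that enumeration entirely.
aggressors = [
    {"name": "U1", "drop_mv": 40.0, "switch_prob": 0.9},
    {"name": "U2", "drop_mv": 35.0, "switch_prob": 0.7},
    {"name": "U3", "drop_mv": 30.0, "switch_prob": 0.1},
    {"name": "U4", "drop_mv": 5.0,  "switch_prob": 0.9},
]
eco_set = critical_aggressors(aggressors, victim_drop_mv=80.0)
```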
It is clear from the above that AI-outside is a useful approach, but its benefits are limited to what the tool can already achieve. AI-inside is a way to make the tool run faster, use less memory, and achieve higher accuracy results. The two techniques can, of course, coexist as they are tackling different questions.
Most AI algorithms are very generic and have no understanding of the data they are working on: an AI algorithm trained to recognize cat pictures has no understanding of cats. It could equally well be trained to recognize a squid. This is what I call a non-physical AI – it has no understanding of the underlying physics. Many of the more generic implementations of AI in EDA rely on this type of non-physical AI.
Ansys’ core strength is its incredibly wide range of physics simulation engines. That is why it has sought to leverage this know-how by creating physics-based AI solutions. When I say ‘physics-based’ I mean that the AI engine is imbued with an understanding of some fundamental causal relationships between different data. This opens the door to extracting much deeper and richer conclusions. If, for argument’s sake, the cat picture AI understands that cats are mammals and hence must have fur, then it will never choose a picture of a squid, no matter how much it looks like a cat.
Take as an example two logic gates on a chip that both see the same 100mV voltage drop. A non-physical AI would treat them similarly in trying to solve what looks like essentially the same violation. But Ansys’ physics-based AI considers multiple physically related parameters and understands that 70mV of the first gate’s voltage drop is due to non-local causes, which is best solved by strengthening the power distribution network. For the second gate, however, 80mV of drop is linked to local aggressor cells switching at the same time, which calls for a totally different fix that spreads the placement of these cells further apart.
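A toy sketch of how separating the physical contributions leads to different fixes; the thresholds and fix categories below are invented purely to make the idea concrete.

```python
def suggest_fix(total_drop_mv, grid_drop_mv, local_aggressor_drop_mv):
    # Classify a DVD violation by its dominant physical cause.
    if grid_drop_mv / total_drop_mv > 0.6:
        return "strengthen the power distribution network (add straps/vias)"
    if local_aggressor_drop_mv / total_drop_mv > 0.6:
        return "spread or stagger the local aggressor cells in placement"
    return "mixed cause: consider both PDN and placement changes"

suggest_fix(100, 70, 20)   # first gate in the example -> PDN fix
suggest_fix(100, 15, 80)   # second gate -> placement fix
```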
Ansys RedHawk-SC uses ML-based clustering in its SigmaDVD technology for dynamic voltage drop (DVD) analysis. The clustering algorithm identifies interacting cell patterns and excludes non-interacting aggressors. This information is used to drive early placement optimization that reduces IR-drop violations by over 90%.
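For illustration only, here is a minimal clustering sketch in the spirit of that description – not the actual SigmaDVD algorithm. The feature choices, DBSCAN parameters, and data are hypothetical; the point is that cells which tend to switch together fall into the same cluster, while isolated, non-interacting cells are excluded.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical features per cell: [x_um, y_um, mean_switch_time_ps, toggle_rate]
features = np.array([
    [12.0, 30.0, 110.0, 0.45],
    [12.5, 30.5, 112.0, 0.50],
    [13.0, 29.5, 108.0, 0.48],
    [80.0, 95.0, 400.0, 0.05],   # far away, rarely toggles -> likely non-interacting
])

labels = DBSCAN(eps=5.0, min_samples=2).fit_predict(features)
interacting = features[labels != -1]   # DBSCAN labels outliers as -1 and they are dropped
```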
Physics-based AIs understand the relationships and causal connections between the data and so can come up with solutions tailored for each case rather than a one-size-fits-all conclusion. This approach to AI is grounded in unchanging physics equations, which guide and constrain the algorithms and make them more robust and reliable. Hallucinations become much less of a problem because the AI understands the limits of the possible. It understands, for example, Kirchhoff’s Current Law, which requires that the current flowing out of a node must equal the current flowing in – that relationship can never be broken. Even when presented with a novel situation that it has never seen before, a physics-based AI can still take direction from these fundamental rules and arrive at a reasonable conclusion. This is much more difficult for non-physical AIs that rely completely on their training on historical data, which may not be a fully reliable guide to unexpected new situations. I believe that when dealing with physical phenomena like electronics, thermal, and fluid flows, there are huge benefits for AI algorithms that understand the basic laws that govern the data they are analyzing. It will help make predictions more reliable for signoff, avoid hallucinations, and lead to more insightful solutions that will give the best results.
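To show what such a physics constraint can look like in code, here is a small sketch of a Kirchhoff’s Current Law check applied to hypothetical ML-predicted branch currents in a power grid; the incidence matrix and values are invented for illustration.

```python
import numpy as np

# incidence[n, b] = +1 if branch b flows into node n, -1 if it flows out, 0 otherwise.
incidence = np.array([
    [+1, -1,  0],
    [ 0, +1, -1],
])

predicted_branch_currents = np.array([1.00, 0.99, 0.97])   # amps, from some ML predictor

# KCL residual per node; exactly zero for a physically consistent solution.
kcl_residual = incidence @ predicted_branch_currents

# Used as a training penalty, this pulls the model toward physically possible answers
# even in situations it never saw in its training data.
kcl_penalty = float(np.sum(kcl_residual ** 2))
```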
The EDA industry has taken an exciting journey as it looks to realize the promise of AI for chip designers. As we have all travelled down this road, it has become clearer what range of choices lie ahead and what trade-offs can be made. I have outlined three axes for evaluating the usability of various AI implementations, with the goal of choosing the correct type for the benefit you are seeking. I believe that all approaches have shown value in the right context. It is not an either/or choice, and they can, and should, be used together to maximize the overall benefit of AI in EDA design flows.