AI cannot optimize unless it can measure progress towards goals, but defining those goals is not easy, especially when looking at the entire development flow.
Experts At The Table: Definitions and goals matter when it comes to using AI, and it has to be tightly reined in to be effective. Semiconductor Engineering sat down with a panel of experts to discuss these issues and others, including Johannes Stahl, senior director of product line management for the Systems Design Group at Synopsys; Michael Young, director of product marketing for Cadence; William Wang, founder and CEO of ChipAgents and professor at the University of California, Santa Barbara; Theodore Wilson, a verification expert pushing for this development; and Michael Munsey, vice president of semiconductor industry for Siemens Digital Industries Software. What follows are excerpts from that conversation. Part one of this discussion can be found here.
L-R: Cadence’s Young; Synopsys’ Stahl; Siemens’ Munsey; ChipAgents’ Wang; Theodore Wilson.
SE: AI needs to have optimization goals. Semiconductor Engineering has written about the difficulties associated with defining the optimization goals for an entire system. How important is power? How does that relate to size? How does that relate to time-to-market? All of these are goals in the development and verification process that humans have difficulty weighing against each other. If we can't codify them in some way, then AI can't target them.
Wilson: One of the things I like about an AI system is that it doesn't get bored. We can build some good dashboards. Individual project teams can do this. Central engineering teams can do this. But AI will not get bored looking at these dashboards, and it can help focus attention. The AI can be looking at a statistical analysis that otherwise would not be funded, because we didn't have to pay someone to do that statistical analysis. There's an opportunity for an AI to look at all this boring stuff. I don't know if you want to keep paying your engineers to do this, but it looks like they're doing it quite a bit. A solution in this space that answers these very mundane questions, or assists in a very mundane way, would be widely adopted. That goes directly to securing schedule and quality. It also goes directly to improving the nature of work for individual contributors. With that statistical analysis, you're gradually removing distractions and annoyances from them, because when you make an update to the architectural spec of the design, you see all kinds of compilation failures at the system level and subsystem level, because you haven't been able to bring the interface changes through cleanly. This seems to happen fairly often, and it's quite disruptive. Spend some calories on that and the disruption will go away. Or find a better way to stage architectural changes or base changes. There are many kinds of mundane things. What do new engineers struggle with? Are they spending time just trying to compile, or are they spending time resolving issues with updated firmware that's being run together with the hardware? At a very mundane level, EDA tools provide amazing reports. I feel privileged that I have all these things. There are amazing X-propagation reports, reports on race conditions, reports on debug access to the design, all of these things. There's a very rich data set to mine to directly impact people on a day-to-day basis and help the programs that are executing today.
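The kind of unglamorous statistical watch Wilson describes can be quite simple in practice. The sketch below is illustrative only; the record format, the failure counts, and the 3-sigma threshold are invented assumptions, not the output of any particular EDA tool.

```python
# Minimal sketch: flag a night where compile failures spike far above the recent baseline,
# e.g. after an architectural spec update broke interfaces downstream.
from statistics import mean, stdev

# (date, compile failures in the nightly regression) -- hypothetical data
nightly_failures = [
    ("2025-05-01", 3), ("2025-05-02", 4), ("2025-05-03", 2),
    ("2025-05-04", 5), ("2025-05-05", 3), ("2025-05-06", 27),  # spec update landed
]

counts = [n for _, n in nightly_failures]
baseline = counts[:-1]                       # everything before the latest night
mu, sigma = mean(baseline), stdev(baseline)

latest_date, latest = nightly_failures[-1]
if sigma > 0 and (latest - mu) / sigma > 3:  # crude 3-sigma rule, an assumption
    print(f"{latest_date}: {latest} compile failures vs. baseline {mu:.1f} "
          "-- flag for triage (possible interface/spec mismatch)")
```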
Stahl: A rich data set is probably the exact right description, because of the very diverse aspects of verification and of the properties of the design you're looking at. Every small team, or more importantly every large design organization, has its own methods for harvesting the data that comes out of those EDA tools. These environments are so diverse that the chances of harmonizing them are very slim. These companies have developed their views and their methods over many years. Every company is doing it in some way, and there could be improvements, but harmonizing that across the industry is very difficult because it's a very broad, rich data set.
Munsey: There are two factors at play. First, it is very easy to work toward a single goal. For example, if somebody is given a power spec and a power target to meet, that becomes the goal. But what more could have been done to make it even better? That’s a very difficult question to answer because we have to worry about multi-objective optimization. This is where AI starts to play a very important role. You have to look broader than just the single target. This goes back to the importance of the digital twin. When you’re doing multi-objective optimizations, you need to look past that target and look across all the objectives you’re trying to meet, and even these do not exist in a vacuum.
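Multi-objective optimization of the kind Munsey describes usually means keeping a set of non-dominated candidates rather than chasing one number. The sketch below is a generic Pareto-front filter; the objective names and candidate values are invented for illustration and do not come from any tool or flow mentioned here.

```python
# Minimal sketch: keep the Pareto-optimal implementations across several objectives
# instead of optimizing a single power target in isolation.
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    power_mw: float        # lower is better
    area_mm2: float        # lower is better
    days_to_tapeout: int   # lower is better

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    pairs = list(zip(a[1:], b[1:]))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)

def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    return [c for c in cands if not any(dominates(o, c) for o in cands if o is not c)]

candidates = [
    Candidate("baseline",       power_mw=520, area_mm2=12.1, days_to_tapeout=90),
    Candidate("clock_gated",    power_mw=430, area_mm2=12.4, days_to_tapeout=105),
    Candidate("smaller_cache",  power_mw=460, area_mm2=10.8, days_to_tapeout=120),
    Candidate("strictly_worse", power_mw=560, area_mm2=12.6, days_to_tapeout=130),
]

for c in pareto_front(candidates):
    print(c)   # the last candidate is dominated and drops out
```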
Wang: Maybe the first step, using power as an example, is to agree on a good definition of power. What is a good metric for power, because we could be talking about energy, peak power, average power, glitch power, and so many other things related to it? Right now, there's disagreement about how you should measure power across this entire vertical. Once we have a clear definition of the things we want to optimize, then AI has a target. That's the metric. You are getting feedback from certain tools, from engineers, from users, and from customers. They give you feedback, and you can optimize using AI. For some of the metrics, especially power, there is no good agreement. What is the status of PPA, given that there are different ways, and different stages in the development flow, at which you can measure power? What exactly is the thing that matters? Is it at the system level? Is it at the RTL level? Is it at the gate level? How do you integrate all this information? You have a lot of measurements, but what exactly is the metric that matters to your customer?
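Wang's point about definitions is easy to see with a toy example: the same trace yields different numbers depending on whether "power" means energy, average power, or peak power. The samples and the 1 ns timestep below are invented assumptions, not measurements from any real design.

```python
# Minimal sketch: one sampled power trace, three different "power" metrics.
dt_ns = 1.0
power_mw = [12.0, 13.5, 80.0, 14.0, 12.5, 11.8, 75.0, 13.1]  # per-cycle samples

energy_pj = sum(p * dt_ns for p in power_mw)   # mW * ns = pJ
avg_mw    = sum(power_mw) / len(power_mw)
peak_mw   = max(power_mw)

print(f"energy  = {energy_pj:.1f} pJ")
print(f"average = {avg_mw:.1f} mW")
print(f"peak    = {peak_mw:.1f} mW")
# An optimizer told to "minimize power" will make different tradeoffs depending on which
# of these (or a glitch-power estimate from gate-level simulation) it is actually given.
```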
SE: It is important that we do not think about defining a single methodology or a single flow. You want every company to be able to develop and optimize their own flow, their own iteration loops, based on their designs and their engineering knowledge.
Munsey: It's not about one methodology. You want the flexibility to pull in the right tools, or the right design team, or the right verification team, depending on the problem at hand. This goes back to the digital twin. What you have in this virtual environment is the flexibility to plug and play the best tool, the best methodology, the best people throughout the development flow. How do you make those decisions? You have to make those decisions in the broader context of what you're trying to do. Everybody wants to boil it down to, 'This is the way you do functional verification,' or, 'This is the way you do timing closure.' But that's not the way the real world works. The real world works in the context of what I'm trying to do. What is the best way of solving that problem? Doing it virtually allows you to try different solutions to the problems, optimize the outputs, and come up with the right set of methods and tools that enable people to optimize the outcome.
Wang: There's definitely a use case for understanding more about the company-specific workflow. Some companies have Perl scripts from 10 years ago. They keep using those scripts, but when a new engineer joins the team, it takes them a long time to understand how to generate RTL code using these scripts. How do you put things together? What is the collective corporate knowledge? Some companies have very good documentation. They have an extensive knowledge base. There's a variety of things that people do. Some companies have their own systems and store things in PDF, HTML, whatever. But how do you keep this corporate knowledge in a context for digital twins, especially for helping new engineers to understand the code? There are a lot of these use cases. How do we help new engineers understand the existing design or verification flow at these companies? Replacing engineers is difficult. When we give demos of ChipAgents, we can watch the faces of the design and verification engineers. They're worried about AI taking their jobs, because there are some specific GDS loops that people do, which AI can do really well. But there are also cases, like difficult system-level bugs, that are very hard to find. There are opportunities and challenges at the same time.
Young: In regard to the previous comment about human-in-the-loop verification, as much as we want a digital twin that can help with all the mundane stuff, maybe the first time engineers see a problem they are fascinated. But after doing the same thing 100 times, they want some automation, some AI agent handling it. Still, human-in-the-loop, at least in our business segment, is not something I think will ever go away. There are a lot of famous people talking about how AI is going to take over all the engineering jobs, and how all the programmers will one day disappear. But they will still need engineers to do the design, maybe at a different abstraction level, maybe at the AI level. The engineer will not go away.
Wilson: I don't see this as a job destruction tool. It secures the productivity of teams and will only improve their value. That means more projects can be funded, because teams can realistically expect to do architectural lifts in the time available to serve the market. These tools do nothing but increase employment. It's difficult to argue why you should get a raise or a bonus when you've been doing a lot of mundane tasks, keeping the lights on, for a very long time. When those tasks go away, engineers' experience of their work improves, and their value proposition improves. I don't see this as job destruction. I see this entirely as a human-in-the-loop tool, providing focused attention to someone who actually can make a decision. The AI has been trained on many projects, and it says, 'This is something that would be impactful.' Well, it's Friday and I'm tired, so I'll go clean this up. That's the first version of this tool, or this digital twin. If you can do that, and if you can provide hard evidence about what's really dictating schedule, whether it's difficult debug sessions by the engineers, or that we aren't really optimizing how we represent the design, or we're running tests at the wrong level of hierarchy, or we're not really optimizing for execution speed, then when a failure happens an engineer doesn't have to reproduce the failure in their own workspace, which is very expensive. Seeing these kinds of time constants show up in the digital twin will be incredibly valuable. And I suspect you'll find that teams will start to find not just small improvements in total productivity and in their ability to hit market windows, but very large ones. If these tools can help allocate a limited compute cluster across all the various needs the teams have, in a consistent and understood way, that's enormously valuable. If I am on a team, and over 20 years we have collected 40,000 directed tests that have been amazing at finding bugs, but with a huge compute footprint, should the regression be giving me pass/fail for all 40,000 tests every three days? Or is there something else we should do? If the digital twin can start to answer those questions, and let project management, and people with budget and authority, weigh in on what each individual team would like to have and accommodate those tensions, it's going to be incredibly powerful.
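One way to frame the regression question Wilson raises is as a budgeted selection problem: rank tests by how many bugs they have historically exposed per unit of compute, then fill the available budget. The sketch below uses that simple heuristic; the test names, costs, and failure counts are invented, and a real flow would also weigh coverage and recency.

```python
# Minimal sketch: pick which directed tests to run under a limited CPU-hour budget,
# ranked by historical failures found per CPU-hour (all numbers are hypothetical).
from typing import NamedTuple

class Test(NamedTuple):
    name: str
    cpu_hours: float
    historical_failures: int  # bugs this test has exposed over past regressions

def select(tests: list[Test], budget_cpu_hours: float) -> list[Test]:
    ranked = sorted(tests, key=lambda t: t.historical_failures / t.cpu_hours, reverse=True)
    chosen, used = [], 0.0
    for t in ranked:
        if used + t.cpu_hours <= budget_cpu_hours:
            chosen.append(t)
            used += t.cpu_hours
    return chosen

suite = [
    Test("axi_corner_cases", 2.0, 14),
    Test("full_soc_boot",    40.0, 9),
    Test("legacy_dma_sweep", 12.0, 1),
    Test("cache_coherency",  6.0, 11),
]

for t in select(suite, budget_cpu_hours=20.0):
    print(t.name)
```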