AI In Data Management Has Limits

Knowing which data to trust and what can be deleted still requires human oversight; startups are at a disadvantage.


AI algorithms are being integrated into a growing number of EDA tools to automate different aspects of data management, but they also are forcing discussions about just how much decision-making should be turned over to machines and when that should happen.

The ability of AI to sort through enormous amounts of design data to find patterns, both good and bad, is well recognized at this point. There is so much data being generated by the design process that it is well beyond the capability of any engineer to keep track of, and too time-consuming and resource-intensive to manage without some type of AI/ML. That makes the economics attractive enough to use it for an increasing number of tasks.

“AI is making choices for you,” said Jim Schultz, product marketing manager at Synopsys. “For instance, AI design space optimization (DSO) is an exploration of how to improve the quality of results for the design. It’s trying to do things that a user wouldn’t think to do. What you’re doing is burning compute resources, and that can turn into real dollars. In DSO, you’re trying a bunch of things that a human wouldn’t try. The only danger there is that you’re wasting resources, and it may not find a better solution in a fraction of the time. However, in a competitive marketplace, people are willing to take that risk.”
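To make that compute tradeoff concrete, here is a minimal sketch of what a design space exploration loop looks like in principle. It is an illustration only, not Synopsys' DSO technology: the tool knobs and the evaluate_qor() placeholder are invented, and in a real flow each trial would be a full synthesis or place-and-route run, which is where the real dollars go.

# Hypothetical sketch of AI-style design space exploration, not any vendor's tool.
# Each trial stands in for a full synthesis/place-and-route run, so every sample
# here would burn real compute time and license hours in practice.
import random

# Assumed flow knobs; the names and values are illustrative only.
SEARCH_SPACE = {
    "target_clock_ns": [0.8, 0.9, 1.0, 1.1],
    "placement_effort": ["medium", "high", "extreme"],
    "max_fanout": [16, 32, 64],
}

def evaluate_qor(config):
    """Placeholder for an expensive tool run returning a quality-of-results score."""
    return random.random()  # stand-in; a real flow would report timing, power, area

def explore(n_trials=20):
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample combinations a human might never bother to try by hand.
        cfg = {knob: random.choice(values) for knob, values in SEARCH_SPACE.items()}
        score = evaluate_qor(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

print(explore())

In practice the sampling would be guided by a learned model rather than pure random choice, but the cost structure is the same: every point explored consumes compute, and there is no guarantee of beating the human baseline.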

The degree of risk-taking, however, is still an evolving question. Good data management can make design workflows simpler and faster, with more efficient verification, and AI plays an important and growing role there. But its application so far has been restricted largely to exploratory or co-pilot types of roles. As of today, the technology cannot be trusted to make large-scale decisions about what data to keep and what to delete. And for startups, which frequently are at the leading edge of innovation in chip design, the sheer volume of data needed for training has proven to be a hurdle.

The value of data
Nevertheless, AI’s footprint continues to widen. “Its use previously was largely focused on a certain discipline of design, whether it be an aspect of visual design or analog design, with AI and data management trying to help enable that from almost a generative AI type of application,” said Simon Rance, general manager and business unit leader for process and data management at Keysight. “But as those designs cross boundaries and go from design into verification or even formal verification, if it’s digital — like doing power, performance and area tradeoffs when you go across those disciplines or domains from the design to the verification or test — there’s more of a handoff of the data. There’s more collaboration of the data, and then the insights, and then an almost iterative loop.”

Using this data to generate verification test cases is more efficient than fuzzing or random stimulus, but today multiple approaches are still being used to ensure everything will work as expected.

Mike Borza, a Synopsys scientist, said fuzzing allows program users to “predict what the answer should be, and you can measure the answer the device you’re building produced, and compare the two answers. If you get the same answer, that’s a good test case. If they disagree, then you’ve got a failed test. That’s a thorough way to test something, but it takes a long time. If you’ve got something that has a billion states, even if they’re only binary, that’s two to a billion tests you need to perform if you’re going to cover the entire space. That’s not an efficient way to test. It’s just a thorough way to test. To make it more efficient, you try to limit the test cases to the edges where you might trigger erroneous results. AI has turned out to be a very good way to try to find those kinds of test cases.”
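Here is a minimal sketch of the comparison Borza describes, assuming a toy 8-bit adder as the design under test. The deliberate overflow bug and the edge-value heuristic are invented for illustration; in practice an ML model, not a hand-written list, would decide which vectors are worth running.

# Golden-model comparison on targeted 'edge' vectors rather than the full space.
# The design, the bug, and the selection heuristic are all illustrative assumptions.
WIDTH = 8
MASK = (1 << WIDTH) - 1

def reference_add(a, b):
    """Golden model: what the answer should be."""
    return (a + b) & MASK

def dut_add(a, b):
    """Stand-in for the device under test, with a deliberate wrap-around bug."""
    s = a + b
    return s if s <= MASK else (s - MASK - 2) & MASK  # wrong overflow handling

def edge_vectors():
    """Heuristic stand-in for learned test selection: boundary values where
    carry and overflow bugs tend to hide."""
    interesting = [0, 1, MASK // 2, MASK - 1, MASK]
    return [(a, b) for a in interesting for b in interesting]

vectors = edge_vectors()
failures = [(a, b) for a, b in vectors if dut_add(a, b) != reference_add(a, b)]
print(f"{len(failures)} failures found with {len(vectors)} targeted tests "
      f"instead of {2 ** (2 * WIDTH)} exhaustive ones")

Even this toy case would need 65,536 vectors to cover exhaustively; a real design's state space is astronomically larger, which is why narrowing the tests to likely trouble spots matters.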

Using the data to speed up the testing and verification stage can result in “faster engines, faster engineers, and fewer workloads, and achieving step function gains in productivity across creation, regression, analysis, coverage, and debug domains,” said Mark Olen, product management director for design verification at Siemens EDA. “AI/ML can be particularly helpful in managing data generated by IC development tools from requirements and implementation through verification and validation. [You can] capture data from engines throughout the IC development process, and then utilize multiple AI/ML engines to analyze the data to automate generation, exploration, acceleration, prediction, prescription, and optimization.”

For the data to be useful, however, it can’t just be mashed together. Here too, AI can be instrumental in organization, where some models are used to analyze existing designs and identify which data is the most useful.

“The data size and data set explode so quickly that it’s almost like, ‘How do I clean the house?’ Which room do I start on first?” said Keysight’s Rance. “What would give me the biggest bang for the buck? That’s where AI in the data management tools is helping identify really good data sets and what data sets to keep — but almost more importantly, which data sets to delete, because everyone’s afraid of deleting useful data. AI is helping the existing data management solutions do that in a much more granular, almost real-time fashion.”

Relying on AI to make vital decisions

But if AI is making the critical call on what data is vital to keep, that raises important questions about how the programs are making those decisions. ML algorithms can be opaque by design, which renders the decision-making process about what data to keep and what to delete a matter of trust. As a result, many companies have opted to at least partly take that decision away from AI. According to Schultz, what ultimately gets deleted is up to each individual company’s retention policy, which usually specifies when data should be tossed out.

“Saving everything for a year isn’t really useful,” Schultz said. “And certainly not two years, because by then you’ve moved on to another chip. The way AI comes in is that it can actually look at the usage of data. It can predict and say, ‘Never in the history of you keeping this data have you ever mined this type of data, so we’re going to make a decision and just delete that data.’ Yes, there is an inherent danger in that you’re deleting data you would otherwise keep. But the kind of data we’re talking about is part of the intermediate steps in the design process. Nobody will ever throw out the final tape-out data, the actual design, because you will have to return to it. It’s usually the mountains of data that get produced with the many experiments, the many runs, that were done to improve it.”

Who ultimately makes the decision about data retention can vary. “Our implementation tool does not automatically delete data,” noted Schultz. “That’s something the design teams come up with a methodology for doing. There are commercial tools which do data management, data analytics, that can take that data, mine that data, then save it in a much smaller format. It’s still large, but we’re picking out the metrics and storing the metrics so that you can get rid of the design data. That type of tool has a user policy that says, ‘After this much time, delete these types of data.’ Where AI comes into play is that it can start to analyze and see how the user is using the data, then it can make decisions based on that.”
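As a rough illustration of how a time-based policy and a usage signal might combine, here is a sketch that assumes a simple catalog of intermediate run artifacts with creation dates and read counts. The artifact types, fields, and thresholds are hypothetical, and in keeping with the points above the code only flags candidates: tape-out data is never considered, and a human still reviews the list before anything is deleted.

# Hypothetical retention sketch: a static policy ("after this much time, delete
# these types of data") combined with a usage signal (data never read back).
from datetime import datetime, timedelta

POLICY_MAX_AGE = {
    "trial_run_logs": timedelta(days=90),
    "intermediate_netlists": timedelta(days=180),
    "tapeout": None,  # final tape-out data never expires
}

artifacts = [
    {"path": "run42/trial.log", "type": "trial_run_logs",
     "created": datetime(2024, 1, 10), "read_count": 0},
    {"path": "run42/mid.v", "type": "intermediate_netlists",
     "created": datetime(2024, 9, 1), "read_count": 7},
    {"path": "final/top.gds", "type": "tapeout",
     "created": datetime(2024, 5, 1), "read_count": 3},
]

def deletion_candidates(items, now):
    """Flag data that is past its retention window or has never been mined."""
    flagged = []
    for item in items:
        max_age = POLICY_MAX_AGE.get(item["type"])
        if max_age is None:
            continue  # protected data, e.g. the final tape-out
        expired = now - item["created"] > max_age
        never_mined = item["read_count"] == 0
        if expired or never_mined:
            flagged.append(item["path"])  # flag only; a human makes the final call
    return flagged

print(deletion_candidates(artifacts, datetime(2025, 1, 1)))

The harder part in practice is producing the "never mined" signal itself: a model watching access patterns across many projects can generalize it to something like "this class of data is never looked at again," which is the kind of call the quotes above describe.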

The goal is for more of the decision-making to be made by AI in the future, but the technology will need to evolve to make that happen. AI will play more of a supportive role until it reaches the point where it can be trusted to make those sorts of calls.

“That’s a transformation we haven’t seen enough of yet,” Rance said. “We typically have the design team, engineers, architects, and verification engineers. What we’re not seeing enough of are data engineers embedded in the design team who are not solely doing the design or the verification, but are enabling almost a new expansion of CAD teams. What needs to happen here with these data engineers, where they’re working with these systems, is to make sure that the data is the right data, and the old data is managed accordingly. Design engineers are just trying to get the design done, or they’re trying to get it verified. They’re not in the mindset of asking, ‘Hey, what data do I keep? What do I delete? What’s the consequence if I delete the wrong thing?’”

Unexpected challenges
While AI can help organize data, there still can be hiccups in workflow. One of the major challenges many designers face is that different teams may be using different data management systems even within the same company, leaving data stored in different formats. That kind of siloing is often necessary, and it keeps too many hands from having access, but it also can be frustrating, and moving or sharing data across those systems can lead to problems such as data corruption.

“Teams don’t have access to other successive data,” Rance said. “We’ve been looking at going beyond just the data management portion of it to how to manage that data across that lifecycle of engineering so that these teams, no matter where the data and the data management systems are, sit underneath it. That way you have a holistic view across the entire flow, from design, verification, and test, where that data can be shared but it’s still secure. Other parts of the teams can’t corrupt or manipulate data they’re not supposed to. That’s where we’re seeing that bigger picture. It’s gone from the level of siloed data management teams and data management tools platforms to a more holistic engineering lifecycle management for the data.”

Schultz agreed that AI’s ability to dictate data management permissions needs to be limited. “Let’s say you have one team that’s working on a crown jewel design, and you don’t want anybody else in the company who is not part of this elite team to have access to their data,” he said. “You don’t want that to be common knowledge across the company. You try to minimize access so you don’t have data leaks. That’s an area that can be a problem.”

An even bigger issue has arisen with the tools themselves. Many tools are free and open-source, which can impact back-end support. While there are higher-end, premium data management systems, “most of them are geared toward software and gaming infrastructure designs,” Rance said. “There are very few that are specifically tied to engineering lifecycles.”

In addition, AI in data management is iterative, which means the tools are often only as good as the data used to train them. While Schultz noted that many of the programs come pre-trained to some degree, that paradigm heavily favors larger companies with existing product lines over startups with few or no existing libraries of data. As a result, those just entering the marketplace may find it necessary to use even more compute and retain larger amounts of data. Even bad data can be valuable for them, because it helps bring their AI data management models up to speed. But that situation is only temporary.

“When the AI does data management, you end up using more and more data, meaning you end up saving more and more data and deleting less and less,” said Suhail Saif, principal product manager at Ansys. “To train these AI engines, you need a lot of data. And you need a lot of different data — not only the good optimization points. You also need some bad ones to tell the AI engine what is not good. This scales up very well. When you move on to doing 50 designs instead of 5, the AI engine is trained so well that eventually, by using this data that was not deleted and stored, your data retention abilities become more efficient as you move on to a larger number of designs.”
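A toy example of why the bad points matter: the sketch below fits a trivial nearest-centroid model to past runs labeled good or bad. The features (normalized slack and power) and all values are invented; the point is only that without retaining the bad runs there would be no second centroid, and every new candidate would be judged good.

# Illustrative only: why a training set needs bad optimization points as well as good ones.
# Each point is (normalized timing slack, normalized power); label 1 = good QoR, 0 = bad.
history = [
    ((0.90, 0.20), 1), ((0.80, 0.30), 1), ((0.70, 0.25), 1),  # good runs
    ((0.10, 0.90), 0), ((0.20, 0.80), 0), ((0.15, 0.95), 0),  # bad runs kept for training
]

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

good_center = centroid([x for x, label in history if label == 1])
bad_center = centroid([x for x, label in history if label == 0])

def predict(point):
    """Nearest-centroid rule; deleting the bad runs would remove bad_center entirely."""
    d_good = sum((a - b) ** 2 for a, b in zip(point, good_center))
    d_bad = sum((a - b) ** 2 for a, b in zip(point, bad_center))
    return "good" if d_good < d_bad else "bad"

print(predict((0.85, 0.30)))  # resembles past good runs -> "good"
print(predict((0.20, 0.85)))  # resembles past bad runs -> "bad"

As the pool of past designs grows from 5 to 50, a model like this has more of both kinds of examples to draw on, which is the scaling effect Saif describes.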

Conclusion
AI has become a valuable resource in data management, enabling more efficient testing and verification, but human guidance is still necessary for big-picture decision-making. This is especially true for decisions about which data needs to be kept and which can be discarded.

Implementing AI in data management comes with challenges, notably the need for large amounts of training material. Ironically, this can mean some companies need to retain even bad data, because it is valuable for training purposes. As the AI learns, however, it promises to grow more efficient, requiring less data once the algorithm is trained.


