In the past, simulation was the only tool available for verification, but today there are many. Balancing the costs and rewards is not always easy.
First-time chip success rates are dropping, primarily due to increased complexity and attempts to cut costs. That means management must take a close look at their verification strategies to determine if they are maximizing the potential of their tools and staff.
Using simulation to demonstrate that a design exhibits a required behavior has been the cornerstone of functional verification since the dawn of the semiconductor age. When designs were small, this was a simple and effective method. As design size grew, it became impossible to write enough tests manually to cover all required behaviors. Testbench technology evolved to automate aspects of this. Again, this was effective for a time, but today it has resulted in a highly complex and inefficient means of verification, as shown by dropping first-time silicon success rates.
“It is well known that as chip complexity has grown over the years, doubling every 18 months, the verification state space has grown exponentially,” says Pete Hardee, group director for product management at Cadence. “Simulation has been the main verification method, but verification of a relatively simple processor core from just five years ago took 10^13 simulation cycles, and for GPUs, the number of verification cycles was 10^15.”
Today, those numbers are orders of magnitude higher. “Simulation is a dumb approach,” says Marc Swinnen, director of product marketing at Ansys. “You apply a vector and see what it does. Apply another vector, see what that does. There’s little understanding as to what any vector is supposed to do, what you’re shooting for, and hence you’re basically doing a Monte Carlo, hoping to hit all the corners, or trying to steer into those corners.”
Simulation started running out of steam a long time ago. “In the short term, faster simulation speeds were the answer, and this is why we have seen an explosion in the emulation market,” says Prakash Narain, CEO of Real Intent. “More horsepower was needed. At the same time, formal started to find its place, not as something that was marching alongside simulation, but it filled some functional verification holes where simulation struggled to get complete coverage of critical, targeted functional behavior.”
Several verification technologies have been developed that complement simulation, but many verification teams have struggled to incorporate them successfully into their overall methodology. This is due to a number of factors: the tools require a range of specialists to use them properly, they cannot show a combined notion of progress or completion, and some teams continue to point to the empirical success of their existing methodologies.
There are several classes of static and formal tools, each with its own strengths. A new class of tools that leverages AI technology also is starting to emerge, such as copilots that help ensure the soundness of code as it is being developed, or tools that perform a form of equivalence checking. It is important for teams to continually assess the value each tool provides, and to understand both the source and the type of bugs threatening their success.
A full assessment of a functional verification flow starts with an understanding of why your chip may fail. “The number of operating modes for some products is exploding,” says Chris Mueth, new opportunities business manager at Keysight. “Your product fails if just one of those is not properly verified. How do you bookkeep all of that? How do you put a verification process in place that accounts for all those different combinations of operating modes? How do you bookkeep the performance element of it?”
The way to start is to understand the nature of bugs. That will define the best tools to find them. “Bugs that escape into silicon are either due to structural defects in the design, such as overflow/underflow, Xs, FSM issues, dead code, and redundant code, or because the design has semantic discrepancies, such as a bug in the functionality of the design,” says Ashish Darbari, CEO of Axiomise. “Formal tools can analyze and find most structural defects automatically through app-driven flows at the time of design bring-up. This saves an enormous amount of functional simulation overhead.”
In theory, formal methods have a major advantage over simulation. “Instead of writing tests to stimulate the design-under-test (DUT) for all the behaviors, and then verify and check for any adverse behaviors, we get the stimulus for free,” says Cadence’s Hardee. “Unless constrained otherwise, the formal tool considers every possible combination of inputs. This is a huge advantage as we move out of the application-specific era. We define the intended DUT behavior as a set of properties. We either prove the property holds in all cases, or get a counter-example for a potential bug.”
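As a rough sketch of what Hardee describes, using invented signal names rather than any real design, the environment is constrained with assumptions and the intended behavior is captured as assertions. The tool then either proves each assertion for every remaining input combination or returns a counter-example:

    module dut_props (input logic clk, rst_n, req, gnt);

      // Environment constraint: once asserted, a request stays high until granted.
      asm_req_stable: assume property (@(posedge clk) disable iff (!rst_n)
                                       req && !gnt |=> req);

      // Intended behavior: every request is granted within four cycles.
      ast_grant: assert property (@(posedge clk) disable iff (!rst_n)
                                  req |-> ##[1:4] gnt);

    endmodule

In practice a checker like this would be bound to the DUT, and the same properties can be reused as checks during simulation.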
No one technology can solve the problem on its own. It requires finding ways to best combine them. “All the intelligence that is needed to contain this complexity lies in the methodologies and the engineers,” says Real Intent’s Narain. “We’ve been throwing more horsepower behind these engines, and this is the problem. An alternative way is to fundamentally ask whether a minimal, Boolean technology-based approach can deal with this complexity explosion. This is where static approaches enter into the picture, because they are more abstract. The issue you run into with static approaches is they rely on abstraction, and that abstraction is dependent upon customized technology for a customized problem.”
There is no one solution that fits everyone’s problem. “Historical knowledge needs to find a way into the verification environment,” says Benjamin Prautsch, group manager for advanced mixed-signal automation in Fraunhofer IIS’ Engineering of Adaptive Systems Division. “This is hard for management to grasp, because a lot of details have to be addressed. Information held by the experts for the test IP, or verification IP, becomes a significant asset. It’s crucial to combine both, and if there’s any error in the field or any fault in the design, it must find a way into design guidelines or additional assertions to improve the overall tests and verification environment.”
The cost of avoidance is huge. “Things are happening so fast these days that if you don’t take action today, three months down the road you may find yourself in a much worse situation,” says Jin Zhang, senior product director for formal verification at Synopsys. “Look at the fast pace of LLMs, which are being updated on a weekly basis. You have to step up, you have to retool, and you have to embrace new technologies. There’s just no alternative. Otherwise, you’re going to be left behind.”
The cost of failure is enormous. “Hence the fear factor in making a change in something that appears to work is high,” says Narain. “That is why there continues to be reliance on verification sign-off methods. The opportunity lies with shift left, where you deploy techniques earlier. These will find bugs when they are cheap and quick to correct. While it may not impact the simulation you do today, over a period of time, based upon fewer errors being detected in simulation, the level of reliance on shift left will continue to grow.”
Small errors lead to big problems
It starts with understanding the types of bugs that can threaten success. In many cases, they are found close to tape-out and appear to be highly complex in nature. The reality is that many of them are caused by small errors buried deep in the logic. “A functional bug in a counter is easily caught by running lint and adopting formal property verification,” says Axiomise’s Darbari. “It could cause a DDR performance counter in an SoC to overflow, and in one real example it took a DV team three weeks of emulation debug only to discover that the bug was in the counter.”
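Darbari’s counter example maps onto a single property. A minimal sketch, with hypothetical names cnt and incr and an arbitrary width, flags the overflow condition at its source rather than leaving its symptoms to be chased through weeks of emulation debug:

    module counter_check #(parameter int W = 16)
      (input logic clk, rst_n, incr,
       input logic [W-1:0] cnt);

      // The counter must never be incremented when it is already saturated,
      // otherwise it silently wraps back to zero.
      ast_no_overflow: assert property (@(posedge clk) disable iff (!rst_n)
                                        (cnt == '1) |-> !incr);

    endmodule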
Formal’s traditional drawback has been that it is compute-intensive. “That meant it wasn’t used on anything other than small, often control-path dominated, blocks and subsystems,” says Hardee. “Ironically, the same explosion of compute power that’s behind the verification explosion has made formal much more effective. Modern formal tools can verify medium-sized processor cores, and large processors if split into subsystems first, then end-to-end properties created at the top level using divide/conquer approaches. Formal also can fully verify complex math blocks in minutes or hours. For example, 32-bit integer multipliers — a typical building block for many ALUs or math coprocessors for GPU or AI accelerators — have 2^64 possible input combinations. Simulation for every combination would be infeasible.”
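For the multiplier case Hardee cites, the end-to-end property itself is short even though the input space is 2^64. A hedged sketch, with invented names and assuming an unsigned, purely combinational multiplier, simply compares the design’s output against the language’s own multiply operator, which the formal tool then proves for all operand values:

    module mult_check (input logic clk,
                       input logic [31:0] a, b,
                       input logic [63:0] prod);

      // Exhaustive by proof rather than by enumeration: prod must equal the
      // full 64-bit product for every one of the 2^64 operand combinations.
      ast_mult_correct: assert property (@(posedge clk) prod == 64'(a) * 64'(b));

    endmodule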
And it is not just functionality that can be targeted. “We deployed a novel area analysis app to detect redundant flops and gates in silicon that were burning power,” says Darbari. “By using formal property verification, the user was able to perform a fast analysis of the entire silicon SoC without needing any testbench or tests. The reported results were staggering in some cases, especially when considering that these designs have been previously verified by simulation.”
The role of formal has been changing. “People had been using property verification to verify simple local assertions,” says Synopsys’ Zhang. “If you have a finite state machine, you can verify state transitions. Those are what we call local properties. But in more recent years, the focus has shifted to verifying end-to-end properties, where you’re writing a property for the output and reasoning from the inputs. They are more complex properties, and as a result, they are a lot harder to verify. It requires a lot of techniques, but it can catch real design bugs, more corner-case bugs, than if you were just to sprinkle your design with local properties.”
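The distinction Zhang draws can be made concrete with two sketches, both using invented names that would normally live in a checker bound to the block. The first is a local property covering one FSM transition; the second is an end-to-end property that reasons from the block’s inputs to its outputs:

    module level_props #(parameter int MAX_LAT = 16)
      (input logic clk, rst_n, start, in_valid, in_ready, out_valid,
       input logic [1:0] state);

      localparam logic [1:0] IDLE = 2'd0, BUSY = 2'd1;

      // Local property: one legal transition of the block's FSM.
      ast_idle_to_busy: assert property (@(posedge clk) disable iff (!rst_n)
                                         (state == IDLE && start) |=> (state == BUSY));

      // End-to-end property: every accepted input transaction produces an
      // output within MAX_LAT cycles, reasoning across the whole block.
      ast_in_to_out: assert property (@(posedge clk) disable iff (!rst_n)
                                      (in_valid && in_ready) |-> ##[1:MAX_LAT] out_valid);

    endmodule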
Management needs to measure the cost of verification precisely. “Without measurement, the whole argument depends on empirical success,” says Darbari. “This can be hit-or-miss, and very much depends on factors such as skill, mentorship, management, and experience. If management measures the ROI by measuring verification costs, including tools, human costs, costs of TB bring-up, costs of TB run and debug, and the cost of missed bugs — the results would be a good indicator to suggest what is working and what is not. Formal methods are not a panacea, and simulation must be exercised to validate formal assumptions, but the right combination of these two techniques provides amazing results.”
Management is listening. “A few years ago, the game used to be to sell formal to management by finding bugs that the simulation team had missed,” says Hardee. “This is still sometimes necessary in organizations yet to buy into the value of formal, but those are dwindling in number. The game has moved to management putting trust in small teams of formal engineers, led by a formal champion, who take on the complete verification of highly complex blocks and show more conclusive verification results in a fraction of the time taken by simulation teams for comparable blocks.”
Looking toward AI
AI technology is moving fast. If you see it published at a conference, it is probably already a generation or two out of date. “AI-assisted engineering will potentially generate a lot of design and verification IP for you,” says Keysight’s Mueth. “There’s going to be a period of time where you have to get comfortable with that and learn to trust the results. That will take a while. But in a paradigm like that, the effort shifts to the front end of the process, where you are specifying the requirements and the parameters for the AI engine to go do work. The design process is a little bit different, interactive with the machine learning engines. The important work will be up front, allowing the engines to do what they are supposed to do.”
Verification has always been about comparing two models and identifying the differences between them. “Questions are being raised about how LLMs and other AI methods can be incorporated into the verification methodology,” says Fraunhofer’s Prautsch. “One approach is to look for differences between your requirements document and the verification testbench to identify gaps. This is just one aspect of it, and we would be excited to see the development of a tool like this that could be introduced to support verification. This is not trying to automate everything, but to support the verification engineer and keep track of everything.”
Progress is happening already. “We’ve been using generative AI to create SystemVerilog assertions,” says Zhang. “A first-level barrier for formal adoption is you have to create those properties. If GenAI can help you, that reduces one barrier of entry. And this technology is in production usage by some of our customers. Based on your design, based on the test plan, we can create assertions for you, and that’s a big help and productivity gain.”
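The output Zhang describes is ordinary SystemVerilog assertion code. As a purely hypothetical illustration, not the output of any specific tool and with invented signal names, a test-plan item such as “the FIFO must never be written while it is full” could come back as:

    module fifo_props (input logic clk, rst_n, wr_en, full);

      // Derived from the plan item: the FIFO must never be written while full.
      ast_no_write_when_full: assert property (@(posedge clk) disable iff (!rst_n)
                                               full |-> !wr_en);

    endmodule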
Copilots already are being used by some companies. “They make the initial coding of the design — the human typing effort and things like that — easier,” says Narain. “They also bring variability and their own issues with the accuracy of the work that they produce. What becomes even more critical is verification, to make sure that whatever methods are used to create the design are fully verified. There are opportunities to incorporate AI-based technologies, and again, it’s about shift left. If there is a possibility that copilots and AI-based technologies can enable some use models that can generate shift left in verification, that will be a valuable step forward.”
As LLM generation quality improves, everyone will see a productivity gain. “I see GenAI as a game changer to democratize formal verification,” says Zhang. “Before, it was formal engineers doing formal verification. But slowly we’re seeing companies realize that they must do this early in the flow. It needs to be done by designers. In the past, designers typically extracted a few formal properties, but they didn’t write a lot of assertions. This is a way they can add more assertions to the design and really get the benefits from it. This is definitely a technology that can help formal become even more widely adopted within the industry.”
Adopting formal
The adoption of formal methods is not about buying a tool and inserting it into a flow. There are many ways in which formal can help with specific types of problems, and Hardee divides its use into three broad categories.
It is important to understand the division among block level, subsystem, and full system, where the role of formal changes. “At the IP level, or subsystem level, teams should be using static methods, such as lint and formal, rather than simulation,” says Zhang. “When you get to the system, or SoC, level, do your verification using dynamic methods, because that’s where formal doesn’t scale.”
While simulation may be required at the SoC level, that doesn’t mean formal is excluded. “Where formal has been used in a planned way, it has yielded amazing ROI,” says Darbari. “Engineers and managers are all sold on investing in formal, as it finds bugs quicker, avoids respins, and helps optimize simulation. The problem largely is still that engineering teams do not have adequate training in writing assertions for formal. When used by seasoned experts, formal is capable of signing off on system- and subsystem-level designs, and on full processors, for functional, safety, and some kinds of security verification.”
Designs also are changing, requiring a change in methodology. “Designs are not application-specific anymore, and at the time of design, it’s just not possible to define all of the workloads these chips will be called upon to process,” says Hardee. “The upshot of this is that we have to verify for every eventuality. It was hard enough to define when we were done in the application-specific era — when, in theory, it was possible to define a verification plan consisting of tests for every specified eventuality — and measure coverage against that plan. Now, even with constrained-random methods, we never know when we have enough simulation tests, so we cannot know when we are done.”
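Hardee’s point about constrained-random is visible in even a minimal sketch, with invented transaction fields and constraints. The constraints keep the generated stimulus legal and the covergroup records what has been exercised, but nothing in the methodology itself says when the sampling is enough:

    class bus_txn;
      rand bit [31:0] addr;
      rand bit [1:0]  burst;

      // Constraints keep the stimulus legal; they do not bound how much is needed.
      constraint legal_c { addr[1:0] == 2'b00;      // word-aligned addresses only
                           burst inside {[0:2]}; }  // legal burst encodings only

      // Coverage records what has been hit, not what remains meaningful to hit.
      covergroup cg;
        cp_burst  : coverpoint burst;
        cp_addr_hi: coverpoint addr[31:28];
      endgroup

      function new();
        cg = new();
      endfunction
    endclass

A testbench would randomize and sample this in a loop, and the open question is how many iterations, measured against which coverage model, count as done.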
Change takes time. “There are many techniques that have not been adopted by design and verification teams to improve their efficiency and productivity,” says Zhang. “It takes time to retool a verification methodology and get some of the latest pieces into the flow, coupled with the complexity and the shorter time available to address this. Unfortunately, they have to evaluate their current methodology, see what’s available today, and see how they can leverage those new technologies to improve their flow. With GenAI, I wouldn’t be surprised if in three to five years we find the flow to be quite different. There will be a lot more automation and generation, and we will see agentic flows. Maybe we will see a lot more success with silicon pass rates once all the automation is built in.”
Related Reading
Analog Creates Ripples in Digital Verification
While analog and digital verification efforts have been essentially separated, closer integration is resulting in a rethinking of standards.
Improving Verification Methodologies
The verification problem space is outpacing the speed of the tools, placing an increasing burden on verification methodologies and automation improvements.