Is it possible to make a design change and not have to rerun the entire regression suite?
Verification consumes more time and resources than design, and yet little headway is being made to optimize it.
The reasons are complex, and there are more questions than there are answers. For example, what is the minimum verification required to gain confidence in a design change? How can you minimize the cost of finding out that the change was bad, or that it had unintended consequences?
In the design flow, tools and methodologies have been created to minimize the chance of problems, particularly as you approach tape-out, by making safe, non-optimal corrections. But there are no such tools or methodologies for verification.
“You have tons of resources focused on verification,” says Simon Davidmann, founder and CEO for Imperas Software. “The goal is to minimize what design changes you make because with the technologies that are available today, when you change something in the design, you need to rerun everything.”
There are many moving parts to the problem, but they all involve difficult questions. For example, it may sound simple to say that you should run the tests that produce the highest coverage first, but is all coverage equal, and how do you know that the coverage produced by a test in the past will be the same after the design change is made?
There are many questions that play a role in this optimization process. Should dynamic verification be the front line, or will formal technologies find problems faster? As the role of emulation in verification grows, how do you get back to being able to emulate quickly? Will machine learning (ML) be able to help identify a connection between RTL and testcases? Do aspects of the design or verification methodology need to change?
While it is easy to concentrate on functional aspects of verification, this is only the tip of the iceberg. What about performance, power, safety and security? How about other impacts such as signal and power integrity? These are real issues every company must face.
Continuous integration
Many companies have methodologies in place to deal with change, and a primary objective is to find out if there is a problem that can be fixed quickly. “The purpose of incremental verification is not to fully verify a design,” says Mike Thompson, director of engineering for the OpenHW Group. “Rather, the purpose is to ensure stability of the design such that proposed changes to the project are vetted before they are integrated, and the maturity of the design continues to make forward progress.”
As the design matures, the extensiveness of the test run may change. “You don’t want to be running all of the checks at every single stage of your design,” says Kiran Vittal, product marketing director for verification at Synopsys. “Over the years, we have developed methodologies such that if you’re in the early stages of RTL, you can run one set of checks, and if you’re in a later sign-off or latest stage of RTL, where you really want to hand it off, you can run a different set of checks. This is applicable both at the IP or subsystem level and at the SoC level, because at all stages you’re making changes and you need some incremental verification without doing an exhaustive verification.”
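As a rough sketch of this staged approach, a flow script might map design maturity to the set of checks run at each stage. The stage names and check lists below are purely illustrative and not tied to any particular tool.

```python
# Hypothetical mapping of design maturity to the checks run at that stage.
# Stage names and check lists are illustrative only.
STAGE_CHECKS = {
    "early_rtl":   ["lint", "smoke_sim"],
    "stable_rtl":  ["lint", "cdc", "block_regression"],
    "signoff_rtl": ["lint", "cdc", "sdc_validation", "upf_static", "full_regression"],
}

def checks_for(stage: str) -> list[str]:
    """Return the set of checks appropriate for a design stage."""
    return STAGE_CHECKS.get(stage, STAGE_CHECKS["signoff_rtl"])

print(checks_for("early_rtl"))  # ['lint', 'smoke_sim']
```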
This is especially true for new designs. “Verification needs to be elastic,” says Imperas’ Davidmann. “In the beginning you don’t have many tests, but almost every week the number of tests is going to double, so you need to have more and more verification resources. At some point you have to ship your product, and some people then start testing silicon rather than doing it in simulation. But there are no silver bullets. You have to keep throwing more and more resources at it.”
That involves a lot of value judgements. “The team is challenged to ensure that the continuous integration (CI) regression, which will eventually entail multiple static checks, as well as tens of simulation runs, is comprehensive enough to adequately exercise the design and yet small enough to complete in a reasonable time,” says OpenHW’s Thompson. “Today, this problem is managed in an ad hoc fashion by looking at test plans and coverage reports to select the ‘best’ set of tests from a full regression to add to the CI regression. This can be a laborious task, which is ripe for automation. Data-mining the nightly regressions to extract information about code changes, coverage results, and recent bug locality can be used to automate the selection of tests to run in CI.”
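What that automation could look like, in very rough form: score each test in the full regression by how much of the changed code it has historically covered and how often it has failed recently, then fill the CI budget from the top of the list. Everything here, from the field names to the weights, is a hypothetical illustration rather than any existing tool’s method.

```python
# Sketch of the test-selection heuristic Thompson describes: rank full-regression tests
# for inclusion in CI using coverage on recently changed files plus recent failure history.
# All inputs (coverage_by_file, recent_failures, changed_files) are hypothetical; real data
# would come from mining nightly regression databases.

def score_test(test, changed_files, coverage_by_file, recent_failures):
    """Higher score = more likely to catch a bug introduced by the current change."""
    # Coverage overlap: how many of the changed files does this test exercise?
    overlap = sum(1 for f in changed_files if test in coverage_by_file.get(f, set()))
    # Bug locality: tests that failed recently are weighted up.
    history = recent_failures.get(test, 0)
    return 3 * overlap + history

def select_ci_tests(all_tests, changed_files, coverage_by_file, recent_failures, budget=50):
    """Pick the top-scoring tests that fit the CI budget (count-based here for simplicity)."""
    ranked = sorted(
        all_tests,
        key=lambda t: score_test(t, changed_files, coverage_by_file, recent_failures),
        reverse=True,
    )
    return ranked[:budget]
```

In practice the weights and the budget would be tuned against how often the reduced CI regression misses bugs that the nightly full regression later catches.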
Selecting the best set of tests is not easy. “You want to identify the tests that are really going to give you the most bang for the buck,” says Paul Graykowski, senior technical marketing manager for Arteris IP. “Every time you do some change that is potentially far reaching, run these tests and see where things are at. This also may involve looking at how long those tests need to run, and then making some educated guesses about how to best balance that.”
The problem is there is a fundamental piece of technology missing. “Ideally, you want something that says, ‘If I change this bit of RTL, then you need to run those tests,'” says Davidmann. “The industry does the opposite. It has developed technology that says, ‘If we run our regressions and a test fails, we can reverse back through the check-ins and locate the three lines of code that were changed and caused that test to break.’ That’s the opposite cone.”
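A naive approximation of that missing “forward cone” could be built from per-test code-coverage data: invert it into a map from RTL source file to the tests that exercise it, and use that map to pick candidate tests when a file changes. The data structures below are hypothetical; real coverage databases are tool-specific.

```python
# Naive approximation of the "forward cone" mapping Davidmann describes: invert per-test
# coverage data into a map from RTL file to the tests that exercised it, so a change to a
# file selects a candidate test list. Data structures are hypothetical.
from collections import defaultdict

def build_file_to_tests(coverage_db: dict[str, set[str]]) -> dict[str, set[str]]:
    """coverage_db maps test name -> set of RTL files it touched during simulation."""
    index = defaultdict(set)
    for test, files in coverage_db.items():
        for f in files:
            index[f].add(test)
    return index

def tests_for_change(changed_files, file_to_tests):
    """Union of all tests that previously exercised any of the changed files."""
    impacted = set()
    for f in changed_files:
        impacted |= file_to_tests.get(f, set())
    return impacted
```

The obvious caveat is the one raised earlier in this article: coverage collected before a change may not reflect behavior after it, so such a map can only narrow the candidate list, not guarantee completeness.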
Testbench relevance
While some companies define specific tests that target particular aspects of a design, most verification relies on constrained random test pattern generation where the relevance of a particular testbench has to be determined after it has been run. “Those two sources have to play together,” says Arteris’ Graykowski. “One set of tests is likely to target the overall functionality of the component that you’re testing. More than likely, they are identified by a couple of different processes. It might be the architect saying, ‘I want to make sure this key functionality is always tested.’ And there certainly needs to be some knowledge from senior-level verification members who know what things are touching the key areas of the chip. Some of that is just knowing the architecture and knowing how the different blocks interact. The other side of it is our old faithful functional coverage. Assuming you have a good functional coverage model, you might identify which tests are giving you the most mutually exclusive coverage.”
That latter method has to be iterative. “After the first few runs, you can intelligently select the tests that have the highest impact in the design, for whatever reason,” says Synopsys’ Vittal. “It is probably exciting the highest number of nodes, and if you can intelligently reorder the tests based on a certain known behavior, you don’t have to run 1,000 tests. Instead, you may be able to pick the top 100 tests and run those. Maybe that can help you catch some of these issues quicker, instead of running exhaustively all 1,000 tests.”
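The “top 100 tests” idea amounts to a greedy selection over functional coverage: repeatedly pick the test that adds the most coverage bins not yet hit. A minimal sketch, with placeholder test and bin names:

```python
# Minimal sketch of coverage-based test ordering: greedily pick the test that adds the most
# not-yet-hit functional coverage bins. test_coverage maps test name -> set of coverage bins
# it hits (placeholder data, not a real coverage database).

def rank_by_incremental_coverage(test_coverage: dict[str, set[str]], limit: int = 100):
    covered: set[str] = set()
    order: list[str] = []
    remaining = dict(test_coverage)
    while remaining and len(order) < limit:
        # Pick the test contributing the most new bins; stop when nothing new is added.
        best = max(remaining, key=lambda t: len(remaining[t] - covered))
        gain = remaining[best] - covered
        if not gain:
            break
        covered |= gain
        order.append(best)
        del remaining[best]
    return order, covered
```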
This is a popular strategy being addressed by the EDA companies. “We have implemented analysis technology that helps our customers identify which workloads are more relevant for this application in a particular vertical market segment,” says Vijay Chobisa, director for emulation product management at Siemens EDA. “Let’s say they have 50 workloads, and out of those workloads, 5 of them are shown to be the most relevant. When making these incremental changes, you run those 5 workloads. If time permits, you run them all.”
But this approach has issues. “The ability to relate subsets of verification performed for the full design to incremental design changes, such that verification can be appropriately targeted, is difficult to address because of the possibility of unexpected consequences,” says Daniel Schostak, architect and fellow at Arm. “However, data science and ML can help, as they can be used to build up models of the types of bugs that are likely to be caused in an area of a design, and the best tests for exercising specific areas of the design and finding specific types of bugs.”
A similar problem is that no company ever retires a test once it has been placed in the regression suite. “A large processor company has so many legacy tests,” says Graykowski. “Many are written in archaic languages, but they always say these have to run. Once you have this baggage, it just keeps tagging along with you. And you’re just lugging it around and regressions go from running overnight to over a weekend. As part of the verification process, you constantly have to be looking at the quality of your results from your regression runs and making sure you aren’t wasting cycles. You really need to have someone who’s in charge of optimizing the test list.”
Design for verification
For decades, it has been shown that making small changes or restrictions to the design process can yield big results in some other process. For example, design for test was once considered to be too expensive, but today no design relies on older test methodologies. Instead, some form of scan methodology is built in. But what about design for verification? What would that entail?
“Why did we make the design change to begin with?” asks Stelios Diamantidis, senior director of artificial intelligence solutions at Synopsys. “Design for verifiability has been around as a concept for a long time, but we really don’t have a practical design for verifiability approach. If I was to introduce a change with high risk, which would require me to rerun all verification, I need to be intentional about it. If there’s a lower risk alternative, my strategy could change and I could become more aggressive in terms of reusing prior verification data, extrapolating, and moving forward. At some point, I will regress the whole system. But meanwhile, I can make swift progress with what I have, and high confidence that the solution was a low risk to begin with.”
The problem is that some designs are difficult to partition and encapsulate well enough. “You need to try and limit the effect of any changes,” says Davidmann. “But the challenge, when it comes to processors, is it is very hard to fix small bits and not touch everything. If you find an issue in the pipeline, it’s right in the middle. For RISC-V verification, it is not easy to ring fence bits that you don’t change. So the challenge is how to make your verification efficient.”
This often requires a different approach. “Although functional testing still dominates, the use of static checks in CI is growing,” says Thompson. “More and more, we see CI environments that involve static checks such as linting, clock domain crossings, synthesis, and coding style. One major driver for this shift is cost. Static checks are typically less expensive in terms of compute and licensing.”
And they can be more complete. “In simulation, it may be hard to cover the entire state space of inputs and study their impacts on the outputs to get 100% coverage for verification,” says Bipul Talukdar, director of application engineering for SmartDV. “Formal verification can help exercise the entire state space and quickly produce counter-examples for changed behaviors. The key to the formal verification approach is to encode the changed behaviors in properties.”
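Production formal tools work on RTL and properties (for example, SVA), but the underlying idea — exhaustively exploring the input space of a changed block and returning a counterexample where old and new behavior diverge — can be sketched abstractly. The two toy next-state functions below are hypothetical stand-ins for an original and a modified block.

```python
# Toy illustration of the idea Talukdar describes: exhaustively explore the input space of
# a small block and report a counterexample where a "changed" implementation diverges from
# the original. Real formal tools operate on RTL and properties; these functions are stand-ins.
from itertools import product

def original(state, a, b):
    return (state + a + b) % 4

def changed(state, a, b):
    # Intended to be equivalent, but mishandles the case where both inputs are asserted.
    return (state + a + b) % 4 if not (a and b) else (state + 1) % 4

def find_counterexample():
    for state, a, b in product(range(4), [0, 1], [0, 1]):
        if original(state, a, b) != changed(state, a, b):
            return {"state": state, "a": a, "b": b,
                    "original": original(state, a, b), "changed": changed(state, a, b)}
    return None

print(find_counterexample())
# {'state': 0, 'a': 1, 'b': 1, 'original': 2, 'changed': 1}
```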
This approach covers more than just functionality. “Design and verification is not just RTL and testbench. There is more to it,” says Vittal. “You have design constraints for implementation, and you can verify those upfront. You can even catch timing exception bugs like false paths and multi-cycle paths. If you’re writing UPF for power intent, you don’t have to wait until you do power-aware simulation. There are static tools to do low power UPF checking. Similarly, there are static tools used for SDC validation and other areas.”
Another approach to encapsulation is by using hierarchy. “Re-using building blocks will increase the verification level and design security of the whole system,” says Björn Zeugmann, group manager for the Integrated Sensor Electronics research group at Fraunhofer IIS’ Engineering of Adaptive Systems Division. “Smaller blocks, with less complexity, are easier to verify, and also better to measure. Using small incremental steps when designing a system lets the designer verify small changes instead of a new complete system. Nevertheless, it is important to keep the verification of the whole system in mind. Re-used blocks and their testbenches should be reviewed regularly to avoid failures, and updated using the latest verification methods and design experiences.”
This is, in essence, the principle behind IP reuse. “There is always a core which is being re-used,” says Frank Schirrmeister, senior group director for Solutions & Ecosystem at Cadence. “If you take that as a starting point, there is a lot of pre-verification, so you don’t want to start from scratch and do a complete redesign. On top of that, you make extensions. You may extend a processor with more instructions, and so forth. By virtue of having extended it, you also need to extend the verification that goes around it.”
This extends to buying third-party IP blocks, as well. “A lot of times they’re just going to put that in their system and trust that the vendor has done their due diligence,” says Graykowski. “They may have their own set of tests that are really important to them, especially if they’re doing some subsystem-level testing, but what they’re really testing is to make sure all the connectivity between the different boxes is good and everything’s getting where it’s supposed to get.”
Verification ECOs
There are some tools that have taken the concept of a design ECO and used that to get you back to verification as quickly as possible. “It can take a long time to recompile a design for emulation,” says Siemens’ Chobisa. “We have taken the ECO concept, where you are making small and localized changes and you’re not impacting your entire design, to just re-synthesize and regenerate the bit file for that particular portion of your chip. This enables small and localized changes and improves turnaround efficiency in two ways. You can now validate your change in tens of minutes, but also think of the compute resources. If you need to recompile these gigantic designs, they use a large number of compute resources, and that is not practical.”
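The general change-detection idea behind such an ECO flow can be illustrated generically (this is not Siemens’ actual implementation): hash each design partition’s sources and re-synthesize only the partitions whose hashes changed since the last build. The paths and the resynthesize() hook are hypothetical.

```python
# Generic illustration of ECO-style incremental compilation: hash each design partition's
# source files and only re-synthesize partitions whose hashes changed since the last build.
# Paths and the resynthesize() hook are hypothetical; real emulation flows are tool-specific.
import hashlib, json, pathlib

MANIFEST = pathlib.Path("build/partition_hashes.json")

def hash_partition(files: list[str]) -> str:
    h = hashlib.sha256()
    for f in sorted(files):
        h.update(pathlib.Path(f).read_bytes())
    return h.hexdigest()

def incremental_compile(partitions: dict[str, list[str]], resynthesize):
    """partitions maps partition name -> list of RTL files; resynthesize(name) rebuilds one."""
    old = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    new = {name: hash_partition(files) for name, files in partitions.items()}
    for name in (n for n in new if old.get(n) != new[n]):
        resynthesize(name)  # only the changed portion of the design is rebuilt
    MANIFEST.parent.mkdir(parents=True, exist_ok=True)
    MANIFEST.write_text(json.dumps(new, indent=2))
```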
There are other tools that are taking an incremental approach to verification. “If you had an IR drop problem that you want to alleviate, you risk damaging the timing,” says Marc Swinnen, product marketing director for the semiconductor business unit at Ansys. “Timing is so carefully constructed that nobody wanted to mess with the design to fix IR drop problems. Traditionally, it was seen as a difficult incremental problem because the power network is a network with many paths through it. It’s hard to predict exactly how the currents will go, and a change in the network in one place can redistribute the currents in other places. Today, we can almost instantaneously provide an incremental update of what the voltage drop picture will look like. You can make incremental changes and keep track to make sure that timing is not damaged while fixing the IR drop problem.”
Conclusion
Verification will continue to evolve. “We’re still at the point where we have to get our hands dirty,” says Graykowski. “We have to be careful that we don’t create so many tests that we can never run them all. We have to make sure we have efficiency in the test scenarios that we write, and that we’re really achieving what we want.”
A lot of people are looking toward AI and ML in the hope that they will be able to provide some insight into many of these issues. They are starting to make an impact, but they have a long way to go.
Editor’s Note: Next month, Semiconductor Engineering will look at the progress being made in applying machine learning to the verification process.