It is possible to do a much better verification job, faster and using less energy, but it means we have to throw away today’s unethical coverage metrics.
How many times have you heard statements such as, “The verification task quadruples when the design size doubles”? The implication is that every register bit added to a design doubles its state space. It gives the impression that complete verification is hopeless, and because of that, little progress has been made in coming up with real coverage metrics.
When constrained random was first conceived, it needed a way to measure the effectiveness of the tests being created. The first applications, to the best of my knowledge, were datacom designs such as switches and routers. Those designs had a number of inputs and a number of outputs, and you wanted to make sure that traffic could go from each of the inputs to each of the outputs. Creating cover points on those ports, and then crossing them, provided a reasonable goal.
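As a minimal sketch of what that cross amounts to, the following plain-Python fragment (all names hypothetical, not any particular methodology library) records which (input, output) port pairs have been exercised and reports the fraction of all possible pairs:

```python
from itertools import product

class PortCrossCoverage:
    def __init__(self, num_inputs, num_outputs):
        # every (input port, output port) pair that must be observed
        self.all_pairs = set(product(range(num_inputs), range(num_outputs)))
        self.hit = set()

    def sample(self, in_port, out_port):
        # call when a packet injected on in_port is seen leaving out_port
        self.hit.add((in_port, out_port))

    def score(self):
        return len(self.hit) / len(self.all_pairs)

cov = PortCrossCoverage(num_inputs=4, num_outputs=4)
cov.sample(0, 3)                              # one packet from input 0 to output 3
print(f"cross coverage: {cov.score():.0%}")   # 1 of 16 pairs -> 6%
```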
This was a useful abstraction of the function of the design, but as it was applied to an ever-broadening set of designs, that abstraction became more and more tenuous. It also implied that those cover points were directly associated with a checker of some description, because if a cover point fired on an output, there was at least an existence proof that the traffic got through. Scoreboards were added to identify packets that went missing, and checkers performed rudimentary verification that contents had not been corrupted along the way.
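A rough illustration of that scoreboard role, again as a hypothetical plain-Python sketch rather than any specific framework: it queues what each output should see, flags corruption when the monitor reports a packet, and reports anything that went missing at the end of the test.

```python
from collections import defaultdict, deque

class Scoreboard:
    """Partial reference model: queues the packets expected on each output."""
    def __init__(self):
        self.expected = defaultdict(deque)    # output port -> expected payloads

    def predict(self, out_port, payload):
        # called by the stimulus side when a packet is injected
        self.expected[out_port].append(payload)

    def observe(self, out_port, payload):
        # called by the output monitor; flags unexpected or corrupted packets
        assert self.expected[out_port], f"unexpected packet on port {out_port}"
        want = self.expected[out_port].popleft()
        assert payload == want, f"corrupted packet on port {out_port}"

    def drain_check(self):
        # called at end of test; flags packets that never arrived
        missing = {p: len(q) for p, q in self.expected.items() if q}
        assert not missing, f"packets never arrived: {missing}"
```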
Today, the scoreboard essentially has become a partial abstract model against which results can be checked, and cover points are proxies used to indicate that certain parts of the design have been activated. There is no connection between them, or to the stimulus generation function. It should be no wonder that verification has become as inefficient as it has, and that it is nigh impossible to ever remove a test case from a regression suite. Who knows what a test case actually does, or whether anything of use is ever driven to a monitored point?
Machine learning may help us find a subset of tests that reach the same level of ‘cover point’ coverage faster, and that is good. But we still do not know whether the tests that were removed were actually putting the design into different states, or whether they were working correctly. It does mean that those tests were incomplete in some way, because they did not provide any evidence that they were unique.
While that may sound unlikely, I worked with a startup many years ago that looked at the effectiveness of test suites. It injected faults into the design and then looked to see whether any tests failed. If none did, that showed a deficiency in the verification methodology. In its early days, many IP developers and users were surprised at how ineffective their test suites were, and at how many checkers were missing or not enabled. That tool (Certitude) was acquired by Synopsys and is still available.
But back to the original statement. It is blatantly wrong. It assumes that every register bit is dependent on every other one, and this is a very long way from the truth. Let’s start with a very simple example. Consider a 32-bit register. Assume there are no manufacturing defects, such that it works logically as it should. Take that register out of context for just a second. The bits of the register have no dependence on each other – none. They are all independent. Some would call that a trivially parallel design. It can be verified with just a few test vectors showing that each bit can store the values 1 and 0.
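To put a number on that, here is a small sketch in plain Python, where write_read() is a hypothetical stand-in for driving the register and reading it back. Because the bits are independent, a walking-ones/walking-zeros style set of a few dozen vectors is enough, rather than anything approaching 2**32 states:

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def write_read(value):
    # hypothetical stand-in for writing the DUT register and reading it back
    return value & MASK                     # an ideal register model

# Walking-ones, walking-zeros, all-zeros and all-ones show that every bit
# can independently hold both 0 and 1: 2*WIDTH + 2 vectors, not 2**WIDTH states.
vectors  = [1 << i for i in range(WIDTH)]            # walking ones
vectors += [MASK ^ (1 << i) for i in range(WIDTH)]   # walking zeros
vectors += [0, MASK]

for v in vectors:
    assert write_read(v) == v

print(f"{len(vectors)} vectors instead of 2**{WIDTH} = {1 << WIDTH} states")
```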
It is only when circuitry is attached between the bits that dependencies are formed. Consider a parity bit. It is dependent on all of the bits of the register, and verification has to be performed to show that it is calculated correctly under all circumstances. This is a fairly trivial task for a formal tool. With that done, that bit is now a faithful abstraction of the function it performs. When that parity bit is used in subsequent circuitry, it brings in no interdependence on the individual bits of the register. Any combination of bits that produces the required parity value will do.
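A toy sketch of that flow in plain Python, where exhaustive enumeration over a small width stands in for the formal proof and the function names are hypothetical:

```python
WIDTH = 8            # small enough to enumerate; a formal tool handles 32 bits

def parity_ref(value):
    # even-parity reference: 1 if an odd number of bits are set
    return bin(value).count("1") & 1

def parity_dut(value):
    # hypothetical implementation under test: XOR-reduce the bits
    p = 0
    for i in range(WIDTH):
        p ^= (value >> i) & 1
    return p

# Exhaustive check stands in here for the formal proof: the implementation
# matches the reference for every possible register value.
assert all(parity_dut(v) == parity_ref(v) for v in range(1 << WIDTH))

# From this point on, downstream verification can treat parity as a proven
# abstraction: any register value with the required parity will do.
```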
A lot of combinational dependencies within a design can and should be verified using formal methods, and the important thing is that abstractions for them are then used when they are placed into the next level of hierarchy. There is no point repeatedly simulating the details of an implementation that is known to function correctly. Abstraction and encapsulation should allow the periphery to define any sequential dependencies it has, and these can then be verified at the next level.
In many cases, sequential dependencies are also contained and can be encapsulated, while others are exposed to the next level of integration. Consider a function that has a variable latency at its interface. Its functionality should be proven, along with an exported sequential property stating that it is guaranteed to provide a correct result within two to four clock cycles. Places where it is used then need to show that they can support that sequential property.
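In an environment without formal support, an exported latency property like that could be monitored with something as simple as the following sketch, written in plain Python over an abstract per-cycle trace, with all names hypothetical:

```python
MIN_LAT, MAX_LAT = 2, 4     # the exported property: result within 2 to 4 cycles

def check_latency(trace):
    """trace: one (request_issued, result_valid) pair of booleans per cycle."""
    pending = []                          # issue cycles of outstanding requests
    for cycle, (req, rsp) in enumerate(trace):
        if rsp:
            assert pending, f"cycle {cycle}: result with no request"
            latency = cycle - pending.pop(0)
            assert MIN_LAT <= latency <= MAX_LAT, \
                f"cycle {cycle}: latency {latency} outside [{MIN_LAT},{MAX_LAT}]"
        if req:
            pending.append(cycle)
    assert not pending, "a request was never answered"

# A passing trace: request issued at cycle 0, result observed at cycle 3.
check_latency([(True, False), (False, False), (False, False), (False, True)])
```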
Because we have failed to create suitable abstractions, we essentially have forced verification to be performed as a flat function, which does not scale. Without those abstractions, we cannot understand, through cover points, what the functionality of the design is. We thus cannot encapsulate, cannot find where the actual dependencies are, and cannot determine at what levels they can be isolated and simplified.
Constrained random was a boon to the EDA industry at the time because it meant that users needed more simulation seats. They knew it didn’t scale and that it would be increasingly inefficient, but that didn’t matter because it did advance the amount of verification that could be performed. The industry lapped it up, and they were willing to pay the price. Today, simulation has become so ineffective that many are giving up on it, meaning that EDA companies now have the incentive to create better verification tools that will allow those simulators to become useful again.
I hope they are not looking at ML as the only answer, but will also tackle the real problem: the coverage metrics and verification abstractions. Once those have been properly defined, efficient automation can be added to ensure that every simulation cycle is useful and ethical.