Experts at the table, part 1: Have coverage metrics kept pace with design changes and are new capabilities being adopted?
Without adequate coverage metrics and tools, verification engineers would never be able to answer the proverbial question: Are we done yet? But a lot has changed in the design flow since the existing set of metrics was defined. Do those metrics still ensure that the right things get verified and that time is not wasted on things deemed unimportant or on duplicated effort, and can they handle today’s hierarchical development methodologies?
Semiconductor Engineering sat down with Harry Foster, chief scientist at Mentor Graphics; Frank Schirrmeister, group director, product marketing for System Development Suite at Cadence; Vernon Lee, principal R&D engineer at Synopsys; and Lu, chief verification architect at Atrenta. What follows are excerpts of that conversation.
SE: There have been significant changes in design over the past two decades while notions of coverage have remained fairly static. How adequate are the existing coverage models?
Foster: While I don’t disagree with that, the industry has matured in adopting what we have, and that has taken a long time. Code coverage was being used by less than 50% of the industry back in 2007. Now it is over 70%. That does not mean they are using it effectively.
Lee: It is surprising because 70% would appear to be very low. We take for granted that people are doing more. You have to have coverage, you have to have checkers and you have to have stimulus. At least 30% don’t agree with one leg of that tripod. Does that imply they are not using random stimulus?
Foster: Constrained random has actually saturated at around 73% adoption. You can’t do this without functional coverage or you are wandering in the dark.
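As an aside, the dependency Foster describes can be pictured with a minimal, hypothetical sketch in Python rather than a real testbench language: random stimulus is easy to generate, but only a coverage model tells you whether the interesting cases were ever reached. The bin names and ranges below are invented.

```python
# Minimal sketch (not a real testbench): constrained-random stimulus with a
# tiny functional coverage model. Without the bins, 1,000 passing runs say
# nothing about whether the corner cases were ever exercised.
import random

random.seed(2024)
bins = {"len.min": 0, "len.typical": 0, "len.max": 0}   # invented coverage model

for _ in range(1000):
    length = random.randint(1, 4096)                    # constrained-random length
    if length == 1:
        bins["len.min"] += 1
    elif length == 4096:
        bins["len.max"] += 1
    else:
        bins["len.typical"] += 1

hit = sum(1 for count in bins.values() if count)
print(f"{hit}/{len(bins)} bins hit: {bins}")            # the only visibility you get
```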
Schirrmeister: If I look at the eras of verification, there was the stone age, when chip design used manually driven testbenches with very limited assertions and checking. That was the ’90s. The next era added the hardware verification languages, and the discussion now depends on scope. At the block level, coverage has been well adopted. The challenge is that complexity has skyrocketed, so now you are looking into the next era of verification, where we are trying to figure out how to port stimulus across all engines. That leads us toward software-driven verification, and the scope changes from block to sub-system, to full chip, to the chip in a system. Are we there yet with coverage? No.
Lu: I see a similar situation and agree that this applies at the block level. The bigger question is, when we go to the sub-system or chip level, is everything broken? Ask people if they do code coverage at the SoC level and you will find that only 20% do.
Foster: I make the argument that functional coverage is the wrong metric for the system level.
Lee: Do you mean that it is not high-level enough?
Foster: The problem is that functional coverage is very static in the way you describe it. I am now dealing with distributed state machines and I have to worry about states that are dynamic over time and this means that I have to run many simulations and extract a lot of data to find out what is happening from a system perspective. You cannot express that with our existing metrics.
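What Foster describes might be approximated, purely as an illustration, by post-processing many simulation traces and asking which combinations of concurrent state-machine states were ever seen. The sketch below assumes an invented trace format and invented machine and state names.

```python
# Illustrative only: which combinations of concurrent FSM states were observed
# across multiple simulation runs. Trace format, machines, and states are invented.
from collections import Counter
from itertools import product

# Each trace is a list of time samples; each sample maps a state machine to its state.
traces = [
    [{"dma": "IDLE", "cpu": "FETCH"}, {"dma": "BURST", "cpu": "STALL"}],
    [{"dma": "BURST", "cpu": "FETCH"}, {"dma": "BURST", "cpu": "STALL"}],
]

states_of_interest = {"dma": ["IDLE", "BURST"], "cpu": ["FETCH", "STALL"]}
machines = sorted(states_of_interest)
target = set(product(*(states_of_interest[m] for m in machines)))

observed = Counter()
for trace in traces:
    for sample in trace:
        observed[tuple(sample[m] for m in machines)] += 1

missed = target - set(observed)
print(f"hit {len(observed)}/{len(target)} combinations; missed: {sorted(missed)}")
```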
SE: Wasn’t this the problem with code coverage and the reason why functional coverage was created?
Schirrmeister: There is a notion of metric-driven verification. Are customers happy? No. I recently had a customer meeting where they said, ‘I think I am doing too much verification. I don’t know when I am done, so, out of fear, I am doing more than I think I should. Help me define what is good enough.’ This includes clearly identifying what doesn’t need to be verified for it to be suitable for a defined application. Dissecting this a little more, I see three elements related to coverage: the definition, the execution from abstract concept to chip, and the visualization of the data so that I can decide if I am done.
Lee: There is a fourth element, which is when I have 100% coverage, did I define enough or will I find that I was not paying attention to something that was important?
Schirrmeister: We do have ways to identify them and we are improving the ways in which we collect the data but there are still problems with merging that data. With visualization we have tools and there is more to be done but the basics are in place. The engines in the middle are the biggest issue.
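The merging problem Schirrmeister mentions is easy to state even if it is hard at scale. A toy sketch, with invented bin names and hit counts standing in for tool-specific coverage databases:

```python
# Toy sketch of merging coverage from different engines or regressions:
# union the bins, sum the hits. Real coverage databases are tool-specific;
# the names and numbers here are invented.
from collections import Counter

def merge_coverage(*runs):
    merged = Counter()
    for run in runs:
        merged.update(run)          # adds hit counts per bin
    return merged

simulation = {"pkt_len.small": 12, "pkt_len.jumbo": 0, "resp.err": 3}
emulation  = {"pkt_len.small": 40, "pkt_len.jumbo": 2}

merged = merge_coverage(simulation, emulation)
covered = sum(1 for hits in merged.values() if hits > 0)
print(f"{covered}/{len(merged)} bins hit after merge: {dict(merged)}")
```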
SE: How do you go from a specification to functional coverage? It has often been described as being an unnatural process.
Foster: The process is not repeatable and requires a lot of skill.
Lee: How is that different from writing the spec in the first place?
Foster: Right. They are non-trivial. First you have to think. That is the beauty with code coverage — you don’t have to think.
Lee: But that extra effort pays off many-fold.
Foster: But a lot of people don’t understand those skills and they end up creating bad coverage.
Lee: We are seeing the same thing in many verification teams. They try things and see what happens. Then they just spin and spin on that to no end. If you don’t know why you are doing it, why you turned on that switch, then it is probably a waste of effort.
Schirrmeister: Putting the functional piece into the chip level is very hard and does require thinking about the use cases, the scenarios. There are tools being created that help you to figure out the important elements. But yes, you still need to think and define the scenarios that you are going to target and those you are not going to cover.
Lee: There are customers who consider this an important part of their IP. Once they have decided what they are waiving or don’t care about, maintaining that over time is a pain. When to do something, how to do it, what process do you follow, how much do you trust it? The methodology expert has to decide how much time to invest. Something may seem cheaper now, but that may just be a greedy algorithm that appears to be making more progress than it really is.
Lu: If we go back to the IP level and functional coverage, the problem is that before functional coverage, everything was contained in the test plan. The test plan says that this test covers these features. You have to think through all of the features and corner cases that you know about, and at the same time you can move them into the test plan. This is not strictly directed test; you may decide that some are to be covered with directed tests and others during random runs. Then the cover point becomes a measurement, but the significant part of it is in the test plan. Human nature is to question why it is being done twice.
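Lu’s point about duplication can be seen in a toy data structure: the plan names the feature and the test that targets it, while the coverage bin is the measurement that proves the feature was reached. Everything below is invented for illustration.

```python
# Invented example of a test plan mapped to coverage bins. The overlap between
# the plan entries and the bins is the "doing it twice" Lu refers to.
test_plan = {
    "fifo_overflow":   {"test": "directed_fifo_full", "bin": "fifo.level.max"},
    "back_to_back_wr": {"test": "random_wr_stress",   "bin": "wr.gap.zero"},
    "ecc_correction":  {"test": "random_err_inject",  "bin": "ecc.single_bit"},
}

# Hit counts as they might come back from a regression (invented values).
coverage_results = {"fifo.level.max": 7, "wr.gap.zero": 0, "ecc.single_bit": 2}

for feature, entry in test_plan.items():
    hits = coverage_results.get(entry["bin"], 0)
    status = "covered" if hits else "NOT covered"
    print(f"{feature:16s} via {entry['test']:18s} -> {status} ({hits} hits)")
```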
Foster: There is another degree of complexity in there because you have to decompose a test goal into a set of tests and that is where the skill comes in.
Schirrmeister: And the scope sometimes makes it different. It works fairly well at the block level but not at the system level. At the system level you want to make sure that video can be transmitted and decoded at the same time. Decomposing this into what I have to do for each of the blocks is difficult, and without any of the new system-level tools, teams have had no chance of doing it. A key architect of the design would take a couple of weeks to define the scenarios for one case. Automation is attempting to catch up.
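One way to picture the decomposition Schirrmeister describes, using an entirely invented SoC and scenario, is a chip-level use case expanded into per-block coverage goals:

```python
# Invented illustration: one chip-level scenario expanded into the block-level
# conditions that must hold while it runs. Block and state names are made up.
scenario = "transmit video while decoding video"

per_block_goals = {
    "camera_if":    ["streaming"],
    "video_enc":    ["encode_active"],
    "modem":        ["tx_active"],
    "video_dec":    ["decode_active"],
    "ddr_ctrl":     ["concurrent_rd_wr"],      # shared resource under both flows
    "interconnect": ["enc_and_dec_traffic"],
}

for block, goals in per_block_goals.items():
    for goal in goals:
        print(f"[{scenario}] {block}: must reach '{goal}'")
```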
In part two the panelists discuss the difficulty of adding more automation and some of the problems created by the huge data sets.