Experts at the table, part 3: Panelists discuss new types of coverage required for each level of verification and the lack of metrics for some.
Without adequate coverage metrics and tools, verification engineers would never be able to answer the proverbial question: are we done yet? But a lot has changed in the design flow since the existing set of metrics was defined. Does it still ensure that the right things get verified, that time is not wasted on things deemed unimportant or on duplicated effort, and can it handle today’s hierarchical development methodologies? Semiconductor Engineering sat down with Harry Foster, chief scientist at Mentor Graphics; Frank Schirrmeister, group director, product marketing for System Development Suite at Cadence; Vernon Lee, principal R&D engineer at Synopsys; and Yuan Lu, chief verification architect at Atrenta. In part one, the panelists talked about the suitability of metrics, adoption rates and the need to think about how each is used. Part two examined the question of whether too much verification is being done and if big data analytics could help. What follows are excerpts of that conversation.
SE: Why is the Portable Stimulus effort concentrating on stimulus when verification is what matters?
Foster: When a customer finds a bug in post silicon, how can it be abstracted in a way that it can run in another engine?
Schirrmeister: All aspects are important, but the effort initially targeted the users’ biggest pain point: having to rewrite, for each next step in the flow, what they had already created. If you can define a scenario at the system level, the tool can then take all of the constraints from the individual components and obtain 100% coverage for that use case by creating the necessary tests. This is often beyond what any single person can understand, and today the job is given to the architects to come up with the use cases.
Lee: Even 100% coverage at that level does not have the same denominator as it would at the lower level. You have to filter that up as well.
Schirrmeister: We need a new term for this type of coverage.
SE: Will tools then be able to decompose a scenario into the test necessary for the individual blocks?
Lee: Lu mentioned observing a useful test at the block level and then combining two of them in some way to get a higher-level behavior.
Lu: Emulation coverage would seem to allow the emulator to be used in different ways. What is the state of the art in this area?
Foster: Traditional coverage metrics work there, but those may not be the right metrics.
Lee: You can bend over backwards to map a line of code into an emulator but it doesn’t mean anything.
Lu: I saw data that says putting code coverage into an emulator slowed it down by a factor of two and reduced the capacity by half. Doesn’t this make it unusable?
Schirrmeister: There are customers doing this so it is not right to call it unusable. We also have customers doing coverage in simulation, both code and functional, and yes if you put more checkers and assertions into the emulator, it will go slower because they need to be executed when they trigger and some of them are not very emulator friendly. But automation can make it happen. We then have the ability to merge that coverage from simulation and emulation.
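The merge Schirrmeister describes can be pictured as a union of hit coverage bins from the two engines over a shared bin universe. A minimal sketch in Python, with entirely hypothetical bin names and data:

```python
# Hypothetical sketch: merge coverage from two engines (e.g. simulation
# and emulation) by taking the union of hit bins over one bin universe.

def merge_coverage(universe, *runs):
    """Return (coverage score, merged hit set) over the given universe."""
    hits = set().union(*runs) & set(universe)
    return len(hits) / len(universe), hits

universe = ["reset", "read", "write", "burst", "error"]
sim_hits = {"reset", "read", "write"}   # bins hit in simulation
emu_hits = {"read", "burst"}            # bins hit in emulation

score, hits = merge_coverage(universe, sim_hits, emu_hits)
# Four of five bins are covered between the two runs; "error" remains open.
assert score == 0.8 and "error" not in hits
```

The key design point is that both engines must report against the same bin universe (the same denominator) for the merged score to be meaningful, which is exactly the denominator concern Lee raises above.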
Foster: Customers know and understand things and so naturally they want this running in all of the environments. It takes time before they ask if they should be doing it differently. They do it, but I am not convinced it is the right thing.
Schirrmeister: You have something that is slow and accurate, something that is fast and hard to debug and then there is emulation. It sits in between and is reasonably fast and great for debug, but now you have several engines to deal with. So if you are trying to accelerate something, then it is the right thing to do. The faster speed enables you to deal with things that may be very difficult in simulation. You can run real software cases which was not possible in the past.
Lee: Code coverage gets saturated very quickly for complicated scenarios.
Foster: When I partition a verification problem, I do it in a way that allows me to decide what I am going to concentrate on at each level. I have clear objectives and I don’t want to repeat that at every level.
Schirrmeister: There are times when the emulator runs as a simulation accelerator, and other times when it is used in-circuit. There are different use cases and different ways in which each solution is being used.
Lee: You are concentrating on things beyond simple structure. Functional coverage makes sense even if code coverage doesn’t.
SE: Functionality is one aspect of a system. There are also performance, power and security, aspects that are becoming equally important. What progress is being made in defining coverage for those?
Lee: What does coverage for performance mean? You can define assertions that say an operation could not take longer than this.
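Lee’s latency bound can be expressed as a scoreboard-style check that timestamps each operation and flags any that exceed a budget. A rough Python sketch, with all names and the 100-cycle budget purely illustrative:

```python
# Hypothetical latency checker: record when each transaction starts,
# and flag it if completion takes longer than an assumed cycle budget.

class LatencyChecker:
    def __init__(self, max_latency):
        self.max_latency = max_latency
        self.pending = {}      # transaction id -> start cycle
        self.violations = []   # (txn_id, observed latency)

    def start(self, txn_id, cycle):
        self.pending[txn_id] = cycle

    def finish(self, txn_id, cycle):
        latency = cycle - self.pending.pop(txn_id)
        if latency > self.max_latency:
            self.violations.append((txn_id, latency))

checker = LatencyChecker(max_latency=100)
checker.start(1, cycle=0)
checker.finish(1, cycle=250)   # 250 cycles exceeds the 100-cycle budget
assert checker.violations == [(1, 250)]
```

In a real flow the same idea would typically be written as a temporal assertion in the testbench language rather than a standalone script; the sketch only shows the shape of the check.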
Schirrmeister: There are two aspects to it. Performance analysis is done at the interconnect level. These can be complex and have hundreds of parameters meaning that a human has problems finding the right parameters. They do it today using SystemC cycle accurate models, or they spawn off large numbers of simulations in parallel to collect the data. But for performance validation, you have to define the corner cases first and then put the checkers into the system to make sure that the bandwidth was never exceeded or that the queue depth never got beyond a certain figure.
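The queue-depth checker Schirrmeister mentions can be sketched as a monitor that tracks occupancy and counts any excursion past a limit. A minimal, hypothetical Python version:

```python
# Illustrative performance-validation checker: watch a queue's occupancy
# and count every push that takes the depth past an assumed limit.

class QueueDepthChecker:
    def __init__(self, limit):
        self.limit = limit
        self.depth = 0
        self.max_seen = 0
        self.violations = 0

    def push(self):
        self.depth += 1
        self.max_seen = max(self.max_seen, self.depth)
        if self.depth > self.limit:
            self.violations += 1

    def pop(self):
        self.depth -= 1

chk = QueueDepthChecker(limit=4)
for _ in range(6):   # a burst of six pushes against a limit of four
    chk.push()
assert chk.max_seen == 6 and chk.violations == 2
```

The same pattern generalizes to bandwidth: replace the depth counter with a byte counter and check it against the budget per interval.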
SE: Many metrics in this area are not instantaneous in nature. They might be averages.
Foster: Statistical coverage is becoming important, and here you need to sample over time and come to some conclusion.
Lu: We cannot fully express performance in the metrics we have today, which are based on logic. The only thing we can do is to create a program to describe the coverage. Even if we look at various signals, it takes domain knowledge to understand what it means. Existing coverage metrics do not give me this information.
Schirrmeister: It is becoming more important. When a chip is put into a data center, you need a certain uptime. Any downtime caused by incorrectly dimensioned queues costs you money. People are willing to dedicate silicon area to trace logic so that the chip can be put into a performance validation mode and they can see why certain cases do not meet performance expectations. Consider the case where the filters in a computer are clogged. This can cause thermal issues that cause the cores to be slowed down. You won’t see this in simulation. This is not something wrong with the design. These are important but very hard to find.
Lee: Coverage in the sense that you are collecting the data over many runs.
Foster: This goes back to data analytics. It takes a lot of data before you can make sense of it. Performance issues are often brought on by arbitration issues, and these can be difficult to identify.
Lu: Performance issues, in my experience, result in a functional issue somewhere.
Schirrmeister: Scenario coverage is the next frontier where software comes in. It assumes that you have done things at the lower level already. You may not be able to reach many issues from that level.
Foster: Many tests involve an OS and this prevents many things from happening.
Lu: Not only is there software involved in the coverage, but it needs to be reflected back to them.
Lee: And there is a fault simulation type of thing as well. You don’t know if checkers are turned on and issues are being detected. This is another dimension to verification.