Defining Sufficient Coverage

Experts at the table, part 3: Panelists tackle issues including system-level coverage, functional fault models, redundancy and interfaces.


Semiconductor Engineering sat down to discuss the definition of sufficient coverage as part of verification closure with Harry Foster, chief scientist at Mentor Graphics; Willard Tu, director of embedded segment marketing for ARM; Larry Vivolo, who at the time of this roundtable was senior director of product marketing for Atrenta; Simon Blake-Wilson, vice president of products and marketing for the Cryptography Research Division of Rambus; and Pranav Ashar, chief technology officer at Real Intent. In part one, panelists discussed the changing definitions of coverage. In part two, they discussed the complexity found at the system level and the techniques being used to prioritize verification effort. What follows are excerpts from that conversation.


SE: Part of the verification process is thinking and describing the functionality in an orthogonal manner. Do assertions force that orthogonality?

Foster: I am always telling my kids that there is no automation for thinking.

Ashar: Assertions give you a calculus for thinking.

Blake-Wilson: Just the process of trying to think through the structure can be a great way to find things that you have missed.

Vivolo: This is exactly what we do today at the block level. The combination of static and dynamic analysis is how you start to extract unexpected relationships between signals. But how do we take that to the higher levels?

Foster: That is the value associated with formalizing the process.

Ashar: You need a framework in which to think and you need something similar at the system level. This can become a specification for correctness.
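
For illustration only, here is a minimal sketch of the kind of formalized thinking the panelists describe, written as a SystemVerilog assertion; the request/grant handshake and the four-cycle window are hypothetical examples, not something discussed by the panel:

    // Hypothetical request/grant checker. Writing the property forces the
    // designer to state, in an orthogonal and machine-checkable way, exactly
    // when a grant must follow a request.
    module req_gnt_checker (input logic clk, rst_n, req, gnt);
      // If req is asserted, gnt must arrive within 1 to 4 cycles.
      property p_req_gets_gnt;
        @(posedge clk) disable iff (!rst_n) req |-> ##[1:4] gnt;
      endproperty
      a_req_gets_gnt: assert property (p_req_gets_gnt)
        else $error("req was not followed by gnt within 4 cycles");
      // The same property doubles as a coverage point, recording whether the
      // behavior was ever exercised.
      c_req_gets_gnt: cover property (p_req_gets_gnt);
    endmodule

The same property can be proven statically by a formal tool or checked dynamically in simulation, which is the block-level combination of static and dynamic analysis Vivolo describes.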

SE: In automotive there is ISO 26262, and so far the industry has taken the easy way out by replicating the CPU. But this is not a viable long-term option. Now we have to start thinking about coverage for single-fault failures in hardware, to know whether the issue can be detected and corrected. Is this the next problem for coverage?

Tu: Doing that at the system level is very complicated. The automotive guys are concerned with the highest level – does the system operate?

Ashar: A difference is that at the SoC level, a failure is a change in the value of something on the left-hand side of an assignment. It is hard to nail down what a failure is at the system level. It can take many different forms. To capture that formally, we need a taxonomy of failures to build on.

Foster: Functional fault models are something that academics have been looking at for a long time. It is huge.

Ashar: The approach being taken right now is to apply application-oriented standards and our understanding of how chips fail to narrow the problem. We need to go beyond that, but we are not there yet.

Foster: When we can minimize the interaction of IPs or components within systems, then the problem becomes tractable. Once we start allowing the interactions, that is when it becomes harder.

Ashar: It used to be that failures were inside functional blocks, but the boundary has moved, so failures in SoCs are now at the interfaces. The starting point has to be there.

Foster: Most companies understand they have to use an incremental approach to verification. They cannot just jump to the system level. There is no point proving use cases when basic things such as a processor talking to my IP have not been verified.

Vivolo: Even if the failures are at the interfaces, you are assuming that all of the pieces are working correctly. You have layers of verification, and depending on where you are the verification may change, but you still have to make sure the state machine buried deep inside your USB core is working.

Ashar: You have to try and avoid multiplicative complexity and that requires a step-by-step process. If you still have to worry about low-level things when verifying at the system level then it will never be tractable.

Foster: I need to architect this to minimize the problem, and that means well-defined boundaries. There are times when you cannot avoid it.

SE: Does this mean that we should constrain design in order to make verification easier?

Foster: When I used to work on supercomputers, I would never allow engineers to use more than 23 keywords from the Verilog language. I constrained them, and limiting the degrees of freedom to get the job done is reasonable.

Ashar: We have to think on the flip side as EDA companies. We have to improve the verification technologies so that those types of constraints are not necessary. Think about the reset scheme of the chip. You can have a very simple method where you send an asynchronous reset to every flop. That would make verification very easy and the problem goes away. But I hear companies saying that they could never get their designers to do that. They will never agree to that level of overhead in the layout. There has to be a balance.
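
As a concrete sketch of the reset scheme described above (illustrative only; the module and signal names are made up), every register simply receives the same asynchronous, active-low reset:

    // Hypothetical register in the "asynchronous reset to every flop" scheme:
    // reset behavior is uniform and easy to verify, but the reset net must be
    // routed to every flop, which is the layout overhead designers resist.
    module simple_reg #(parameter W = 8) (
      input  logic         clk,
      input  logic         arst_n,   // asynchronous, active-low reset
      input  logic [W-1:0] d,
      output logic [W-1:0] q
    );
      always_ff @(posedge clk or negedge arst_n)
        if (!arst_n) q <= '0;        // every flop comes out of reset in a known state
        else         q <= d;
    endmodule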

Foster: The industry is imposing some constraints in that we are not all going out and reinventing network-on-a-chip. That restricts the way we do design.

Blake-Wilson: Part of the goal of refining the tools should be a better understanding of the scope of coverage in order to decide where this makes sense. Sometimes redundancy is still the best solution.

Ashar: That may have been the best way to approach it the first time around, but as we evolve and the chip has been tried and tested, you can iteratively improve.

Tu: I think you are describing the ARM success story. It is economies of scale. You have enough designs using the same technology and it gets cleaner and cleaner over time. What we are trying to protect against is a faulty design. They (ISO 26262) are really looking for things such as soft errors in memory, where one system may be compromised; but if you have a bad SoC design, then the output will always be bad. Software guys have been constraining their software so that it doesn't use certain types of constructs, limiting the problems in verification and test.

Foster: Limiting the degrees of freedom reduces the space that I have to consider.

Tu: You just have to ensure you do not take away the ability to be creative.

SE: Why are there so few tools that help prevent bugs from existing in the first place?

Blake-Wilson: A cynic may say that a good way to make money is by having bugs that have to be fixed over time. I think you are seeing more tools becoming available, but they are still scratching the surface.

Ashar: You would need a specification language. You can also use static verification as soon as you have RTL and that finds bugs very early. It is not at the system level yet.

Vivolo: It is a scalability issue. It is the same as the question about why we don't constrain design styles. The more freedom you have, the more potential there is for errors. But designs are getting larger, schedules are getting shorter, and we are using more aggressive technology, so…

Ashar: We are getting more tools.

Foster: More tools are coming about, but you also have to get people to use them. Tools that would catch bugs exist and yet are still not being used.

Blake-Wilson: The market often rewards flexibility.

Foster: Certain markets use certain styles and creating tools that can cover all of the markets is not easy.

Ashar: Education is also important. After a DVCon panel, I spoke to people who were not even aware that certain types of tools existed.


