Defining Sufficient Coverage

Experts at the table, part 2: Panelists discuss how to contain the complexity found at the system level and the techniques being used to prioritize verification effort.


Semiconductor Engineering sat down to discuss the definition of sufficiency of coverage as part of verification closure with Harry Foster, chief scientist at Mentor Graphics; Willard Tu, director of embedded segment marketing for ARM; Larry Vivolo was, at the time of this roundtable, senior director of product marketing for Atrenta; Simon Blake-Wilson, vice president of products and marketing for the Cryptography Research Division of Rambus; and Pranav Ashar, chief technology officer at Real Intent. In part one, panelists discussed the changing definitions of coverage. What follows are excerpts from that conversation.


SE: In the past, coverage was well contained in that it talked about functionality at the block level. Today, it includes the system level, but new markets such as automotive are redefining what coverage means. How do we get our hands around the problem of defining when you can tape out a chip?

Ashar: We have discussed ways to narrow the scope of the problem, and it has to happen along multiple dimensions — the application perspective, the implementation perspective, which means the types of failures that can happen in the context of an SoC, and what can be solved analytically. At the lower level we have assertions to guide things, and these provide a coverage metric. At the system level, you have to have some sort of description of the system, which can serve as a benchmark for how much coverage you are getting in terms of testing the implementation. The analog for assertions at the system level is an abstract behavioral description of the functionality, which could be a protocol or an end-to-end system functionality. Then you can measure the RTL simulation against that.

Foster: You are describing a decomposition of the problem, where you focus on the IPs but do not refocus on that at the next level. The techniques I used to convince myself that there was low risk at that level, be they formal or something else, are fairly well understood. It becomes fuzzier from this point on.

Vivolo: You raise the levels of abstraction as you go up the chain. In order to deal with the scalability issue you focus on coverage at the lower level — how well have I tested a particular design. As you move up, everyone wants to measure coverage, but one way of approaching it is to ask, ‘Do I see anything I didn’t cover earlier?’ You put a shell around the design that says: here is the state space as defined by what I tested. Now at the SoC level, do I ever see a case where I step outside of the known good state space?

Foster: But that is not how verification is done at the SoC level. It is purely use-case driven, and that is where you run into the problem of identifying which use-cases are missing. That is all software driven. The focus is on, ‘Is this doing what it is supposed to do as a system?’

Tu: You have to abstract something, because at the application level there is an intent for the SoC. So what is the intent of the application?

Vivolo: Exactly. How do you take an infinite number of possibilities and boil that down to something simple enough so that when the application does something it shouldn’t do, you can detect it?

Foster: You can learn and do data mining for the outliers, and this is an area of research. There are many opportunities that are slowly being explored.

Tu: At the application level, there is limited stimuli that can even come in. So, I don’t have to do every scenario for the SoC. I can limit it to the input levels that are coming in.

Vivolo: At the SoC level and above, if you just look at the interfaces, that will be very difficult to do. Today, you have to narrow it down to the individual blocks. It is not possible to exercise every tree. But if you can detect that a tree has been exercised in a way that has not been seen before, then knowing that is important, because you have something you have never seen before. This will allow you to focus your attention.

Foster: It comes down to the adage that if you haven't touched it, it is probably broken.

Tu: This is the same problem that we saw with IP reuse. You have a block that has been silicon-proven three times, but it is now being used in a different way and nobody tested that.

Foster: The configuration space for IP is huge.

Vivolo: That is another major challenge.

Ashar: It can be helped using formal techniques. Formal analysis may not always complete, but it has the ability to get you into parts…

Tu: It is the combination of technologies. You can use dynamic simulation to clean up the constraints, and then when you have a narrow area that formal can target, then the scalability issue…

Ashar: What happens is that the verification scaffold is in many cases over-constrained. That causes you to miss a lot of scenarios that could happen in practice, and formal analysis has the ability to tell you if there is an over-constraint that would provide a vacuous analysis.

Vivolo: Property generation through dynamic simulation and formal work together very well. You can use the two together to clean up the constraints on both ends.

Ashar: As you go up the complexity levels, I see three things as being important to keeping the problem tractable. One is abstraction, so you can get rid of a lot of details. The second is that you should have been able to complete a lot of the verification at the implementation level so that as you go up you can focus on the system level. The third is to be able to uncover use cases that are possible and may have been missed if you tackle the problem in a piecemeal fashion.

Tu: As you raise the level of abstraction, that increases the probability of a security issue, or of allowing one to get past. At the system level you may have malicious code, and maybe thousands of registers contained in complex pieces of IP, and you are trying to abstract all of that.

Foster: Not necessarily. Some abstractions are guaranteed to be safe. That is key, and it depends on the type of abstraction. If this is the legal state space, and it is guaranteed to be secure within it, then I can provide an abstraction. But there are unsafe abstractions.

Blake-Wilson: One concept I am used to is defense in depth. In this context, coming at the problem from a number of different directions is what I thought I heard, but it is actually a little different. In the traditional testing world, it is about having different tools and trying to apply each of them where it works best. Divide and conquer.

Vivolo: If we go back to my earlier comment about defining what you have already tested and going beyond that as a potential issue, one could argue that you could extend that to security and say that I have seen these behaviors, and if I see software trying to do anything differently it could be a potential attack.

Tu: Looking at it from the perspective of the tier-one customers, traditionally they would have asked for test coverage and they would want to see a big number. How do you convey that for an SoC today? It is too complex to do it this way. You cannot cover everything, so how do you convince them?

Foster: I have seen people ask for the use-cases. But what if you forgot a use-case? And the interactions between use-cases can be complex. Use-cases provide a one-dimensional view of the problem.

Ashar: This is not a new thing. In the past, there used to be a term called feature interaction. Call forwarding and call blocking were two features associated with a telephone. They could interact in ways that were not foreseen. You will find use cases that had not been considered.

SE: If we know at the system level that it is impossible to cover everything, then how do you prioritize and make sure you have considered the most important cases?

Foster: One way I have seen is that use-cases are driven by the architectural team or the customer. Then I start listing the use-cases for the error modes or conditions. What is really missing in that is the concurrency aspect. Humans have a difficult time reasoning about concurrency, and that is where the complicated bugs are. It is easy to think about corner cases but hard to reason about the interactions.

Tu: That is why the microcontroller is much easier. There is a lack of concurrency, and concurrency is where complexity has skyrocketed.

Ashar: The complexity was there, just not in the chip. And that made it easier to fix.

Foster: But you asked how people do it. I believe that there is still a lot that can be done with machine learning techniques and data mining. We need to start looking for the outliers so we can start to uncover the things nobody thought about. We need that kind of solution.

Blake-Wilson: It is all about looking at stimuli and working out what are the likely values. This is meat and potatoes to an attacker. If we know they tested for this range, then let’s see what happens when we look at a different range.

Vivolo: This is about use-cases that nobody thought to test for.

Ashar: In the IoT and automotive world, there is some expectation about how things will interact. There are also many interactions people haven’t thought about. The RTL design processes formalized some of it through assertions, but these are not assertions to use in a formal analysis tool. A similar formalization of intended behavior and interactions is required for these markets. This is the starting point and this can lead to a more analytical method or at the least a better measure.

Foster: I have found that the value of assertions is not what you end up with, but the thought process that they went through to come up with it.