Experts at the table, part one: System-level verification plays a different role than block-level verification and requires different ways of thinking about the problem.
Wally Rhines, chairman and CEO of Mentor Graphics, gave the keynote at DVCon this year. He said that if you pull together a bunch of pre-verified IP blocks, it does not change the verification problem at the system level. That sounds like a problem.
There are assumptions that the IP blocks work to a reasonable degree, and that system-level verification focuses not on finding issues in the individual blocks but on the interactions between them and a host of new issue types. While some block-level bugs may still be found, that is not the focus.
SE: What is your interpretation of Wally’s comment?
Brunet: Wally is 100% correct. Consider power. He meant that just because the IPs have been verified independently, everything still has to be verified once they are put together. Power is a big example. Just last year several high-profile customers had problems with power. They did not fully verify the chip in the context of power. They did not verify the operating system being booted, or verify the behaviors of all of the blocks together. Each block was functionally correct, but from the power perspective they did not check that the consumption of the chip was right. You still have to verify functionality and power for the whole chip.
Melling: It is still a big problem, just a different problem. Looking at it from the state-space perspective, trying to do 'infinity minus' verification of the entire state space is impractical and not the right way to look at the problem. The problem at the system level goes beyond functionality. It is not just asking whether these two pieces of IP can communicate, but what are the bandwidth and latency. You are crossing however many clock or power domains as you go between those two pieces of IP. Even though they may look fine functionally, they may not meet their performance requirements. System issues are a little more complicated from a verification perspective, and as a consequence it is a big job. It is those problems that create issues even when the IP is working.
Anderson: It depends on how you look at the question. At the top level—chip or system—you should focus on verifying integration, use cases with multi-IP scenarios, power domains, clocking, security, that sort of thing. You should not be trying to verify all the modes of low-level blocks from the system level. Using well-verified IP doesn’t change what you should be doing. However, if you lack confidence in your block-level verification today, then you may be trying to verify them from the system level, a huge task that’s just about impossible. In that case, moving to well-verified IP blocks may enable you to focus on the right issues at the system level.
Lapides: There is an interesting term that is being used to capture some of these system-level verification issues: extra-functional properties. Things like power, latency, safety and security are all being lumped into this term. We need to start worrying about these things. While there has been work on individual pieces, mostly in academia, there has been little to bring them all together.
SE: With IP and reuse there is some degree of encapsulation, and not all of the state space is shared. Access to some states is limited. The IP reuse model is becoming more refined in that it controls how internals are exposed externally. UPF 3.0 is an example that exposes the power aspects of the internal workings. This means you should not have to care about how that power is implemented within the IP block.
Brunet: But you still have to verify each IP and its interactions within the SoC, and that it does what is intended from a power perspective. That is a challenge.
Melling: During different operating conditions as well. How does it operate in this particular power state when I power down half of the processors? It is this kind of interaction that has to be looked at.
SE: How do we deal with things like performance? This cannot be measured from a single run, but requires many runs to get average figures, fairness, etc.
Anderson: Good point. I should have mentioned performance in my list of things to focus on at the system level. People traditionally have thought about trying to measure performance using production software, which runs much too slowly in RTL simulation. So people either move up the chain to virtual platforms, or down the chain to emulation or FPGA prototypes. There is an alternative that we have found works rather well. If you can generate very dense test cases that exercise multi-IP scenarios in parallel while keeping all the processors, I/O channels, and memories really busy, then you can get a realistic approximation of performance. You can't write these sorts of tests by hand. You have to generate them for multiple platforms using portable stimulus techniques.
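The generated, dense multi-IP test cases Anderson describes can be illustrated with a toy sketch. This is not a real portable stimulus tool; the IP and memory names are invented for illustration, and the "transactions" are just tuples. The point is the shape of the idea: at every step, every master is scheduled against some memory, so no resource sits idle and the chip is stressed in parallel.

```python
import random

# Hypothetical SoC resources; names are illustrative, not from the article.
IPS = ["cpu0", "cpu1", "gpu", "dma", "usb"]
MEMORIES = ["ddr0", "ddr1", "sram"]

def generate_scenario(num_steps, seed=None):
    """Generate a dense schedule: at every step, each IP issues one
    transaction to some memory, so no master ever sits idle."""
    rng = random.Random(seed)  # seeded for reproducible test cases
    schedule = []
    for _ in range(num_steps):
        # One transaction per IP per step keeps all masters busy in parallel.
        ops = [(ip, rng.choice(MEMORIES), rng.choice(["read", "write"]))
               for ip in IPS]
        schedule.append(ops)
    return schedule

for step, ops in enumerate(generate_scenario(num_steps=4, seed=42)):
    print(step, ops)
```

A real portable-stimulus flow would add legality constraints and retarget the same abstract scenario to simulation, emulation, and silicon; the seed makes any given scenario reproducible across those platforms.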
Lapides: Performance is a real issue. People have been going down this path thinking that cycle-accurate simulation may be the way to go. Some people contend that it is too slow to run the number of scenarios needed and emulation suffers in that respect as well. It is too slow to capture all of the needed performance scenarios. On the other hand, system simulation with a virtual prototype or a model at the Matlab/Simulink level does not have enough accuracy. So there is a struggle to be able to run enough scenarios, have enough accuracy and have the appropriate models and information to make reasonable judgements about performance.
Brunet: I would agree that emulation or even FPGA prototypes are not designed to check that. They are designed to check functional behavior over a lot of cycles. The only way to check performance is to use a very high level of abstraction, or to wait until you have a physical chip connected to a target. In the middle you cannot run at full speed.
Lapides: We have seen universities work on virtual prototypes for performance estimation, and they have achieved tremendous accuracy: +/- 5% for a Cortex-M3, which has no cache and a single processor core. The next step is to go to a dual-core A9, which has a cache but really nothing that is terribly complex. That is still 7- or 8-year-old technology. Figuring out how to do performance estimation is a major challenge.
Melling: There is a chart that looks at the SoC verification problem at different layers. The one you are talking about fits into the layer labeled interconnect verification. It really starts there, and we look at the interconnect at multiple levels. The first level is, 'Does the interconnect do what I want?' You can hook up some traffic generators and ask: do I get the right operational behavior, does it match the protocol, and so on. Then there is the memory subsystem containing the controller. Now you want to look at master-to-slave transactions, quality of service, etc. Then at the SoC level you have all of the IPs interconnected. How do you inject stress traffic so that you aggravate the entire system and get those readings? It means throwing a lot of traffic into the system, collecting a lot of data, and doing huge amounts of data analysis. It is important and it is challenging, but it is something that people have to do. Once you have those layers in place, you can go down to lower layers, make changes, do sanity checks, and find out if you are still okay. But you have to invest in that problem and do the analysis.
Brunet: I am curious. Do you believe that the main problem in the market today is performance?
SE: There are many products that are performance based in their specs. A processor vendor cannot put out a product that does not meet its benchmarks. People look at performance, and while we now know that GHz does not define performance, it is still an important number.
Brunet: What limits performance is not raw processing capability. It is power consumption. The way to control performance is to control power.
Melling: We can talk about power, but many of the mobile providers got caught by AnTuTu, which was all about performance.
SE: Power and performance are a tradeoff.
Brunet: We do see more and more customers running full benchmarks. The main goal is chip performance, but in the context of power. There are other concerns, such as security. We do see that trend, and it all means more data and more attention to how the chip will be utilized within the application.
Melling: Yes, we have to do something different from functional verification. It is use cases. How will it really be used? What shows us how it will operate?
Anderson: Generated test cases are also very good for testing low-power features. You must make sure that your chip still works under all legal power scenarios and also make sure that performance targets are achieved regardless of power mode.
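Anderson's "all legal power scenarios" can be framed as a cross-product of use cases and power modes, with illegal pairs excluded. The modes, scenarios, and legality table below are invented for illustration; a real flow would derive them from the design's power intent (e.g. its UPF description) and the test plan.

```python
from itertools import product

# Illustrative power modes and use-case scenarios; not from the article.
POWER_MODES = ["all_on", "gpu_off", "half_cpus_off", "sleep"]
SCENARIOS = ["video_playback", "file_transfer", "idle_wakeup"]

# A scenario is illegal in any mode that powers down a block it needs.
ILLEGAL = {
    ("video_playback", "sleep"),     # nothing runs in sleep...
    ("file_transfer", "sleep"),
    ("video_playback", "gpu_off"),   # ...and video needs the GPU
}

def legal_combinations():
    """Enumerate every (scenario, power mode) pair the chip must support,
    so each one can be turned into a generated test case."""
    return [(s, m) for s, m in product(SCENARIOS, POWER_MODES)
            if (s, m) not in ILLEGAL]

for combo in legal_combinations():
    print(combo)
```

Each surviving pair becomes one generated test: run the scenario in that mode, check functional correctness, and check that the mode's performance target is still met.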
Brunet: Five years ago it was all about the testbench and how to write it, and about coverage that is efficient whether pulled from an emulator, a prototype, or a virtual platform. Today it is less about those things and more about how to verify the chip in the context of its end usage. That means the full operating system, the firmware, a small application sequence, and you are talking about 500 million cycles.
SE: There are billions and billions of use cases that could be used. How do you decide which ones?
Melling: You start from the customer requirements and how the device is intended to be used. You look at those first, and then customers home in on the most complex system areas. It is the cache coherence, the I/O coherence, the power management. It is in the places where the special functional elements are creating the challenges and having the maximum impact on the usage of the device.
Brunet: I think the use cases are self-defined by the vertical market. In mobile or multimedia they will look at how video streams flow, at the interfaces, and at power, which matters more in mobile than in some other markets. Networking is looking at packet generation and making sure that packets are not dropped. There you have to interface to virtual Ethernet traffic generation, etc. We can cover more use cases than was possible five years ago.
Anderson: Use cases fall naturally out of the specification, dataflow diagrams, and test plans. Customers generally know where their biggest risks are, with cache coherency being a common example, but they don’t necessarily know how to write the tests. So again, generated test cases provide an answer. You can never verify every possible use case scenario, but you can at least cover all the important categories.
Melling: The old adage that you are never done verifying will always be true.