Experts At The Table: Verification Nightmares

Second of three parts: Power islands, validating with TLM models, and the increasing need for software testing.


By Ed Sperling
Low-Power Engineering sat down with Shabtay Matalon, ESL marketing manager in Mentor Graphics’ Design Creation Division; Bill Neifert, CTO at Carbon Design Systems; Terrill Moore, CEO of MCCI Corp., and Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys. What follows are excerpts of that conversation.

LPE: Where does power fit in?
Neifert: Power is certainly involved in all of this. As you start running the software, it stresses the hardware in ways that it never was stressed when you were doing vector-based analysis based on your hardware testbenches. The interplay of hardware and software, and the need to do a power analysis based on that, is something I’m seeing increasing numbers of customers use this for. They need to know that when they’re driving down the road listening to music on their phone and a phone call comes in, it doesn’t blow the thing up because they’ve just exceeded their power budget. That’s another factor. Mentor has one approach for it. My customers do it at the cycle-accurate level.
Schirrmeister: We annotate timing and power. The interesting aspect of this is it’s almost evolutionary. There is logic synthesis starting with Manhattan metrics to estimate the distance, and then they had to bring in layout to figure out the synthesis. We get asked by users if we could talk to the silicon vendors to figure out the PPA effect—performance, power and area. We can bring it up to the level where the software sees it. The effects you just described—the device blowing up because you exceeded the power budget—are driven by the software, which drives the hardware.
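As a rough illustration of the scenario described above, the sketch below sums per-block power annotations for whatever blocks a software scenario has switched on and compares the total against a budget. The block names, the milliwatt figures and the budget are all hypothetical placeholders chosen for the example, not values from any panelist's tools.

```python
# Hypothetical per-block power annotations in milliwatts (made-up values).
BLOCK_POWER_MW = {
    "audio_dsp": 120,
    "display": 300,
    "gps": 150,
    "cellular_modem": 600,
}

POWER_BUDGET_MW = 1000  # assumed budget, for illustration only


def scenario_power(active_blocks):
    """Sum the annotated power of every block the software has switched on."""
    return sum(BLOCK_POWER_MW[name] for name in active_blocks)


def exceeds_budget(active_blocks, budget_mw=POWER_BUDGET_MW):
    """Return True when the scenario draws more than the budget."""
    return scenario_power(active_blocks) > budget_mw


# Driving with music and navigation running.
driving = ["audio_dsp", "display", "gps"]
# A call arrives: the modem wakes up on top of everything else.
driving_with_call = driving + ["cellular_modem"]
```

With these invented numbers, the driving scenario stays inside the budget and the incoming call pushes it over, which is the kind of software-driven effect a hardware-only testbench would not exercise.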
Moore: One of the practical problems with the people we’re involved with is the verification people and the system people who are using the chip are not the same people. And they’re not on the same budget. And they’re not in the same location. Very small companies can combine all these functions, but there’s a big problem with cross-functional stuff in a big, hierarchical organization. The economies of scale from an organizational point of view don’t necessarily match the project requirements and the economies of scale of the developer. You have to have all kinds of tools and approaches. RTL vs. USB is particularly difficult. There is infinite state space. The developer says, ‘We’ve got the verification IP from the vendor. We’ll just run that. We don’t have to run software.’
Schirrmeister: One thing we’re doing is taking the drivers and making sure the last check-in from the hardware and software side didn’t break anything.
Matalon: There are differences between block-level verification and system-level verification in terms of who does it and even the skill set of the people doing it. Today there is a big gap. The people doing block-level verification are primarily standardizing around SystemVerilog and methodologies like OVM, which are advanced but primarily used for block-level verification. You don’t use much software to validate those blocks. You see more validation in isolation. On the other hand, there are the methods used at the system level. In the past, what people would do at the system level is use a classic ‘V’ shape, where they would start at the bottom stitching things together until they would get all the blocks assembled. At that point they would say, ‘Okay, now it works and it’s verified.’ The problem is this is happening at the back end, and it doesn’t work anymore. Verification has to start at the beginning, even before the RTL has been designed. You need to verify designs that have not even been implemented in RTL.
Moore: This is the verification equivalent of test-driven agile methods in software.

LPE: Can the two sides really be linked?
Matalon: I believe that you can validate software against the UML that doesn’t do much, but the most realistic validation is when you have a transaction-level platform that at least abstracts your architecture. At this point, you can do system-level validation with a dimension of performance and power. What you do downstream is just comparisons. You build RTL and say, ‘Is this RTL that implements TLMs equivalent in functionality?’ The challenge is that you need to provide all the means at the transaction-level platform to verify that indeed you are meeting all your requirements before you go to the RTL. Even if the functionality is right, the performance may not be right and you might have to go in and remodel the RTL. That’s the change I see going forward, and that is the promise of ESL.
Neifert: That’s how people should be doing things. The reason ESL has really taken off in the last couple of years is that with TLM 2.0, there’s a standardized way of doing all of this. From my side, what TLM 2.0 really enables is the interchange of models at various levels of abstraction. You start a design with these really high-level blocks, but as the design gets refined you can use the TLM 2.0 interface and swap in blocks in varying levels of design based on the task at hand. You may have a high-level processor and a cycle-accurate representation of the TLM for your USB as you’re developing the drivers. TLM 2.0 has really helped enable a lot of that philosophy, which is why we’re starting to see it go beyond the EDA groups doing test projects to widespread adoption across the industry.
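The idea of swapping blocks at different levels of abstraction behind one interface can be sketched in a few lines. This is not the TLM 2.0 API, which is defined for SystemC in C++; the class names, register addresses and cycle counts below are hypothetical and exist only to show the pattern.

```python
from abc import ABC, abstractmethod


class UsbModel(ABC):
    """Hypothetical common interface that every abstraction level implements."""

    @abstractmethod
    def transport(self, address, data):
        """Handle one write transaction; return the delay in cycles."""


class FastUsbModel(UsbModel):
    """High-level stand-in: functionally correct, no timing detail."""

    def __init__(self):
        self.registers = {}

    def transport(self, address, data):
        self.registers[address] = data
        return 0  # timing is not modeled at this level


class DetailedUsbModel(UsbModel):
    """More detailed stand-in: same function plus an assumed per-access cost."""

    CYCLES_PER_WRITE = 4  # made-up number for illustration

    def __init__(self):
        self.registers = {}

    def transport(self, address, data):
        self.registers[address] = data
        return self.CYCLES_PER_WRITE


def run_driver_init(usb):
    """Toy 'driver' code that only sees the common interface."""
    cycles = 0
    cycles += usb.transport(0x00, 0x1)   # hypothetical enable register
    cycles += usb.transport(0x04, 0xFF)  # hypothetical interrupt mask register
    return cycles
```

The same `run_driver_init` runs unchanged against either model. Only the timing information it gets back differs, which is the property that lets a team pick the level of detail for the task at hand.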
Schirrmeister: Obviously there is verification of the architecture, performance and power to then build against going forward. But there are also more mundane tasks. While you have your DUT that you want to verify, which may be a USB block or a piece of IP, you may be introducing spelling mistakes with your testbench. Starting with verification and building the right scenario, like ‘I’m playing music while I’m driving and getting a call,’ and verifying those early on with the combined model of power and performance on a virtual model, is a big trend. We see that as well.

LPE: How much do the various states—on, off, sleep—impact the accuracy of the verification?
Schirrmeister: A lot. As an example, if you bring up a wireless chip set, you have the process of booting up and then you have the power management IC right next to it controlling the power and the voltage for a particular region of a chip. You can see it oscillating and wobbling while it syncs in. The problem can be larger depending on the number of states you have. If you have five states, that multiplies up very fast when you include voltage regions and cores and different things you can switch on. In the past people did spreadsheets, but it has become so complicated that simulation and virtual platforms are necessary.
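To make the "multiplies up very fast" point concrete, here is a minimal sketch that counts power-state combinations, assuming every region or core can independently be in any of five states. The on, off and sleep names come from the question; the other two state names are invented for the example, and real designs typically allow only a subset of these combinations.

```python
from itertools import product

# Five assumed states; "standby" and "retention" are illustrative names.
POWER_STATES = ["on", "off", "sleep", "standby", "retention"]


def count_configurations(num_domains, states_per_domain=len(POWER_STATES)):
    """Number of combinations if each domain picks its state independently."""
    return states_per_domain ** num_domains


def enumerate_configurations(num_domains):
    """List every combination explicitly (only practical for small counts)."""
    return list(product(POWER_STATES, repeat=num_domains))


# Growth as more independently controlled regions or cores are added.
growth = {n: count_configurations(n) for n in (1, 2, 4, 8)}
```

Under these assumptions, one domain has 5 configurations, four domains have 625 and eight have 390,625, which suggests why a spreadsheet stops being a workable way to track them.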
Matalon: When you build your RTL and try to evaluate this RTL in isolation, without the context of the software, you cannot know what the dynamic power of this RTL will be at the end. It’s the refinement process. You can’t bypass the block-level verification methodologies on a standalone basis. Software will never replace constrained-random testing of a block. But you still need a reference platform (I call it a transaction-level platform) to be there as you refine your design. One of the benefits of OVM and transactors is you can re-use those transactors as a bridge. The transactors are bridging the RTL and the TLM so you can run mixed-level verification using software. You can do that on a simulator or an emulator, and you can run mixed TLM and accelerated RTL verification with all the debug capabilities built in to find problems as you validate your design.
Moore: One of the things that’s really useful about the virtual platforms is that you can develop and debug before you get to the more detailed simulation. You can get rid of all the stupid problems right away so all of your simulation time is spent looking for problems instead of fixing your own mistakes.
