Second of three parts: Heated arguments over strategy, accuracy and effectiveness; problems arise as power islands are added.
By Ed Sperling
Low-Power Engineering sat down to discuss rising complexity and its effects on verification with Barry Pangrle, solutions architect for low power design and verification at Mentor Graphics; Tom Borgstrom, director of solutions marketing at Synopsys; Lauro Rizzatti, vice president of worldwide marketing at EVE; and Prakash Narain, president and CEO of Real Intent. What follows are excerpts of that conversation.
LPE: When do you verify closer to the metal and when do you move to a higher level of abstraction?
Pangrle: The higher level of abstraction is the wave of the future. It’s where things are headed.
Borgstrom: It’s been the wave of the future for the past 10 years, though.
Pangrle: But it is different this time. We’re seeing traction with customers and the blocks they’re using it with, and we’re hearing from them, ‘You know, we couldn’t have done it at this time if we didn’t have this tool.’ That’s different. There’s an efficiency that’s only going to ramp from here. But at the same time, we’re modeling in C. If the architects have the same hole in what they’re using and in the way they’re testing it, and that’s what you’re going to measure your RTL against, how are you going to catch a problem? The hole is in the model.
Narain: No, because when the verification engineer finds a mismatch between the RTL and a reference C model, then you can find errors in the reference model.
Pangrle: You can always find errors later in the process. People still check at the gate level and find errors in RTL.
Rizzatti: Exactly. Nvidia is still emulating at the gate level with hundreds of millions of gates.
Pangrle: Doing this higher-level stuff doesn’t preclude doing things at the gate level. But it does give you a higher level of confidence that if I’m taking C and using that to generate RTL, then hopefully I won’t have as many errors.
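To ground this exchange: the check the panelists are debating is, in essence, a lockstep comparison of RTL outputs against a C ‘golden’ model. The C++ sketch below is purely illustrative; the function names are invented, and in a real flow the DUT values would come back from an RTL simulator over DPI/PLI rather than from a stub. The stub here carries a deliberate saturation bug so the comparison has something to catch.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical C "golden" model of a block: a 16-bit saturating adder.
uint16_t golden_sat_add(uint16_t a, uint16_t b) {
    uint32_t sum = uint32_t(a) + uint32_t(b);
    return sum > 0xFFFF ? uint16_t(0xFFFF) : uint16_t(sum);
}

// Stand-in for the value read back from the RTL simulation. It carries a
// deliberate corner-case bug: it wraps instead of saturating.
uint16_t rtl_sat_add(uint16_t a, uint16_t b) {
    return uint16_t(a + b);  // BUG: overflow wraps
}

int main() {
    // Directed corner cases; a real testbench would add constrained-random
    // stimulus and pull DUT values from the simulator via DPI/PLI.
    const uint16_t cases[][2] = {{0, 0}, {1, 0xFFFF}, {0x8000, 0x8000}, {123, 456}};
    int mismatches = 0;
    for (const auto& c : cases) {
        uint16_t ref = golden_sat_add(c[0], c[1]);
        uint16_t dut = rtl_sat_add(c[0], c[1]);
        if (ref != dut) {
            std::printf("MISMATCH a=%u b=%u ref=%u dut=%u\n",
                        unsigned(c[0]), unsigned(c[1]), unsigned(ref), unsigned(dut));
            ++mismatches;
        }
    }
    std::printf("%d mismatch(es)\n", mismatches);
    return mismatches != 0;
}
```

Pangrle’s caveat maps directly onto this sketch: if the RTL had been generated mechanically from golden_sat_add itself, both columns would always agree, and a hole in the model would sail through.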
LPE: How do you know when you need to look at the higher level and when at the RTL level?
Borgstrom: It’s really a continuum. Every project starts out as a concept, so you do some high-level modeling and algorithm development. Then it follows the classic ‘V diagram’ development where you go from that high level and make all the components. When you get down to that IP or block level, you’re doing very detailed verification, simulation and formal analysis. Then when you get all those blocks developed you start integrating them and do full chip verification involving simulation and hardware-software co-verification on a hardware-assisted verification platform. You eventually get up to full system integration where you’re prototyping with external interfaces. At each phase of the process you’re using a different set of tools to find a different set of bugs.
Narain: There’s top-level activity and then design starts in a distributed manner. Designers start implementing various pieces of a design based upon a spec, and then it comes together. So the design starts off as blocks, then it goes to clusters, and then to a full chip, and at every point you have a chance to do verification. The big questions are what the cost is, how much investment you’re going to make and what the return is. Typically block-level verification is compromised because there are too many testbenches to develop. People tend to do simulation more at the cluster and full-chip level. At every point in time you apply the cheapest technology. A lot of blocks can use formal verification. When you get to clusters, you need more sophisticated techniques. At the full-chip level, you start getting into emulation.
LPE: Analog is separate, as well, right?
Narain: Yes, analog is different and you use your own techniques for that.
Borgstrom: There’s a dedicated tool chain for doing custom blocks and custom simulation. The designer and verification engineer are very often the same person. It’s a very close iterative loop. There’s a real need for doing mixed-signal verification once you start integrating the whole SoC with the analog blocks and making sure that analog-digital boundary is behaving correctly, both from a power perspective and a physical connectivity perspective.
Rizzatti: If you have a problem at the higher level of the design in C and you don’t catch it, you won’t catch it at the RTL level either?
Narain: That’s correct.
Pangrle: But the reference is the same.
Narain: No, what you’re saying is that you take the C reference model and automatically derive RTL from it. I’m talking about taking a C model and letting the designers independently generate RTL. If you automatically derive RTL, you won’t find the errors. If you use a spec and let the designers test against that, then it is conceivable that a bug will be caught.
Pangrle: There is a chance of an RTL-to-C mismatch. But it’s more likely that will be a problem with how the RTL was implemented. If I have some idea of what I’m looking for—and that’s usually defined by the architect—then I’m more comfortable knowing that what I’ve modeled has been turned directly into RTL. If I hand it over to an engineer to figure out my intent and then he goes off and does his own thing, then there’s even more likelihood of a mismatch. The mismatch is more likely to come from that translation.
Narain: If you compromise on independence of fundamentals of the checking, then you’re compromising on the integrity of the verification.
Pangrle: This is like having multiple voters. So you should have three independent design teams?
Narain: If you can afford it.
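Narain’s independence argument can be made concrete in a few lines. In the invented C++ sketch below, the architect’s model carries a latent hole: it ignores a carry-in. RTL derived mechanically from the model inherits the hole and compares clean, while an independently written implementation, coded from the prose spec, diverges and surfaces the question.

```cpp
#include <cassert>
#include <cstdint>

// Architect's C model with a latent hole: it ignores the carry-in bit.
uint8_t spec_model(uint8_t a, uint8_t b, bool carry_in) {
    (void)carry_in;          // HOLE: the spec author forgot the carry
    return uint8_t(a + b);
}

// RTL derived mechanically from the model inherits the hole.
uint8_t derived_rtl(uint8_t a, uint8_t b, bool carry_in) {
    return spec_model(a, b, carry_in);
}

// Independently written RTL, coded from the prose spec, handles the carry.
uint8_t independent_rtl(uint8_t a, uint8_t b, bool carry_in) {
    return uint8_t(a + b + (carry_in ? 1 : 0));
}

int main() {
    // Derived RTL vs. model: vacuously equal, so the hole is invisible.
    assert(derived_rtl(1, 2, true) == spec_model(1, 2, true));
    // Independent RTL vs. model: the mismatch surfaces the spec question.
    assert(independent_rtl(1, 2, true) != spec_model(1, 2, true));
    return 0;
}
```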
Borgstrom: It’s clear that this high-level design and verification flow is relatively new and controversial. But it does show a lot of promise.
LPE: Let’s swap topics. Software is becoming more complicated and so are power issues involving islands and various modes. What does that do to the verification process?
Rizzatti: It’s a nightmare.
Borgstrom: It’s a lot more complicated. The data we have shows that at about the 65nm node, the average design team is evenly split between hardware and software. As we get into smaller and smaller geometries we’re starting to see software-driven architectures where a lot of the value of the semiconductor product comes from software delivered along with it. When you have a huge software team waiting for silicon to come out, that’s not very economical. One thing chip companies are trying to do is figure out how to get their software teams started sooner. One way you do that is to come up with a virtual platform. You come up with a SystemC, TLM-level model of the overarching design even a year before you have silicon and get people writing software against that. As the RTL becomes more mature, you can put that into a hardware-assisted platform like an FPGA rapid prototype and continue your software development running at 10MHz to 30MHz prior to silicon commitment.
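For readers unfamiliar with the approach Borgstrom describes, a virtual platform is a SystemC model with TLM-2.0 transaction-level interfaces that software can run against long before RTL exists. The sketch below, assuming a standard SystemC/TLM installation, is a minimal loosely-timed example; the module names and the register map are invented for illustration.

```cpp
// Minimal loosely-timed TLM-2.0 sketch; build against a SystemC install, e.g.:
//   g++ -std=c++17 vp.cpp -lsystemc
#include <cstdint>
#include <cstring>
#include <iostream>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>

// Memory-mapped peripheral: one status register the software can read.
struct StatusReg : sc_core::sc_module {
    tlm_utils::simple_target_socket<StatusReg> socket;
    uint32_t value = 0xCAFE;

    SC_CTOR(StatusReg) : socket("socket") {
        socket.register_b_transport(this, &StatusReg::b_transport);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        if (trans.get_command() == tlm::TLM_READ_COMMAND)
            std::memcpy(trans.get_data_ptr(), &value, sizeof(value));
        else if (trans.get_command() == tlm::TLM_WRITE_COMMAND)
            std::memcpy(&value, trans.get_data_ptr(), sizeof(value));
        delay += sc_core::sc_time(10, sc_core::SC_NS);  // loosely timed latency
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
    }
};

// Stand-in for the embedded software: performs the read a driver would do.
struct Cpu : sc_core::sc_module {
    tlm_utils::simple_initiator_socket<Cpu> socket;

    SC_CTOR(Cpu) : socket("socket") { SC_THREAD(run); }

    void run() {
        uint32_t data = 0;
        tlm::tlm_generic_payload trans;
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
        trans.set_command(tlm::TLM_READ_COMMAND);
        trans.set_address(0x0);
        trans.set_data_ptr(reinterpret_cast<unsigned char*>(&data));
        trans.set_data_length(sizeof(data));
        trans.set_streaming_width(sizeof(data));
        socket->b_transport(trans, delay);
        std::cout << "SW read 0x" << std::hex << data << " after "
                  << delay << std::endl;
    }
};

int sc_main(int, char*[]) {
    Cpu cpu("cpu");
    StatusReg reg("reg");
    cpu.socket.bind(reg.socket);
    sc_core::sc_start();
    return 0;
}
```

The point of the abstraction is speed: because the model exchanges whole transactions rather than toggling pin-level signals, software can run against it years of RTL effort earlier, then migrate to FPGA prototypes as the RTL matures.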
Pangrle: We’re seeing similar things in the market. There’s a shift toward software. Anything you can do to help teams get started earlier on software helps close that whole window down in terms of the amount of time they need to get a whole system up and running. We feel that standards help speed up this whole process. Using TLM 2.0 helps, and it’s all part of the Open SystemC Initiative. You also really want to know what the intent of the design is and how you’re going to partition it, because that has an effect on how you start verifying it. Being able to determine which blocks you want to run at which voltage levels and which ones you’re going to use to create voltage islands—that’s information that gets passed down for verification. If I run everything at a single voltage, then I don’t have to worry about these kinds of issues. But if it makes economic sense to run a block at lower power, then I have all these other things I have to check.
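In production flows the power intent Pangrle describes is captured in a format such as UPF or CPF and checked by power-aware simulation. The toy C++ model below only illustrates, behaviorally, two of the extra checks a voltage island introduces: isolation must be asserted before power is removed, and a powered-down domain must never be read unisolated. All names are invented.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Toy model of a power island and the ordering rules around it.
// Real power intent lives in UPF/CPF; these names are invented.
struct PowerIsland {
    std::string name;
    bool powered = true;
    bool isolated = false;   // isolation cells clamp outputs when asserted
    uint32_t reg = 0;

    void power_down() {
        // Check #1: isolation must be asserted before power is removed.
        if (!isolated) throw std::logic_error(name + ": power-down before isolation");
        powered = false;
    }
    void power_up() { powered = true; isolated = false; }

    uint32_t read() const {
        // Check #2: reading an unpowered, unisolated island yields X in RTL.
        if (!powered && !isolated)
            throw std::logic_error(name + ": read from unpowered, unisolated island");
        return powered ? reg : 0;  // clamp-to-0 isolation value
    }
};

int main() {
    PowerIsland dsp{"dsp"};
    dsp.reg = 42;
    assert(dsp.read() == 42);   // single-voltage case: nothing extra to check
    dsp.isolated = true;        // correct sequence: isolate, then power off
    dsp.power_down();
    assert(dsp.read() == 0);    // clamped value, not garbage
    return 0;
}
```

This is exactly Pangrle’s trade-off: run everything at one voltage and neither check exists; add an island for power savings and both sequences become verification obligations.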
Narain: Control over the design implementation process is a big problem. There is a strong requirement for software to eliminate errors in the implementation.
Rizzatti: The crossing point between hardware and software, according to Handel Jones, was 130nm. The other thing is that you start with virtual prototyping and after that you move to FPGA prototyping. If that were the flow, it would kill emulation. I don’t see that happening, and that is not what the large chip makers are doing. There is a very clear moment in the flow where emulation is unique, which is the integration between hardware and software, because FPGA prototyping will not give you any ability to trace bugs. And we see this more and more.