Experts At The Table: Performance Analysis

Last of three parts: IP integration strategies; which tools to use and when; developing executable specs; the need for better and different IP characterization; limits of traffic generators.


Low-Power/High-Performance Engineering sat down with Ravi Kalyanaraman, senior verification manager for the digital entertainment business unit at Marvell; William Orme, strategic marketing manager for ARM’s System IP and Processor Division; Steve Brown, product marketing and business development director for the systems and software group at Cadence; Johannes Stahl, director of product marketing for system-level solutions at Synopsys; and Bill Neifert, CTO of Carbon Design Systems. What follows are excerpts of that conversation.

LPHP: How do we deal with all of the IP blocks—internally and commercially developed—that are now being used in complex designs?
Neifert: If you attack this as an assembly problem, that’s how you keep up with all of this stuff.
Stahl: I disagree. At the chip level, it comes together. You cannot separate it out.
Neifert: You certainly have to assemble it, though.
Stahl: But if you do not simulate it in an assembled way you cannot optimize performance. And if you simulate at the RTL level, you’re too close to the implementation.
Neifert: You have to do that.
Stahl: But if you run it together at RTL, that’s too low.
Orme: You have to work at some level of abstraction before you bring the subsystems together. Otherwise it’s too late.
Neifert: You don’t want to put Linux on that system, though. It will take you forever.
Brown: You need to boot Linux early, with the performance-sensitive parts of the system modeled cycle-accurately. That’s what customers are doing to address what you’re talking about. You have to do systems integration to address things early. It’s a hybrid of abstractions.
Neifert: People run on fast models until they actually care about something in detail. So they may want to analyze the performance in a specific place. Then you go from the fast models to the accurate models. You can use emulators and FPGAs for that.

LPHP: What you’re looking at here is more tools in your tool belt, right? You pick the one that works when you need it, which may be behavioral verification and formal verification in addition to functional verification.
Stahl: Yes. Customers will use whatever they can get a hold of.
Orme: That’s why we work with so many different EDA companies, which all have different approaches.
Neifert: I was talking with one customer who has an FPGA, an emulator, and virtual prototyping. At first they use whatever they can get, but after that—when they’re all available—they self-select depending on their requirements.

LPHP: Do customers really have that much choice?
Kalyanaraman: The tools are there right now at the higher level for architectural exploration, although there is less for performance exploration. A lot of companies are trying to bridge the gap between the individual pieces and the point when you put it all together. There is some gap between what the performance architect is trying to spec out and what the verification team is trying to do. You can at least try to bridge that gap by understanding what the interconnect issues could be, for example. There is emulation, and there are tools up front that deal with the interconnect. Expanding to the higher level, it would be good to have a tool that offers the same kind of capability as emulation up front. At the system level, can you detect performance issues?

LPHP: It sounds like we need standards for this.
Kalyanaraman: Today, architects are able to spec things out, but they cannot run anything because they can’t use emulation. That comes later in the cycle. We are able to put all the IPs together. How do we assess the performance?
Stahl: The architect specs out something. If he spends millions and millions of cycles on this spec with a good-enough model, it should be correct. Why would you implement something that’s not worthy? But there should be a requirement to do more thorough simulation earlier. And your world should be fairly well explored before you start putting something together.
Neifert: We all have solutions in this space and ways to solve this. They all revolve around running various scenarios so you have an executable spec that’s an accurate representation. At a bare minimum it should include the interconnect, but you also should have the memory subsystem tied in there. Hopefully your highest-traffic generators have been fleshed out as accurately as possible, as well. Otherwise, when you tie this all together into a real system it won’t work at the performance level you thought it would. The earlier you get that running, as accurately as possible, the better. If you’re waiting for emulation to do it, you’re too late.
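The executable spec described above — traffic generators contending for an interconnect and memory subsystem — can be sketched as a tiny discrete-event model. This is purely illustrative (the generator names, periods, and single shared memory port are hypothetical assumptions, not any vendor's tool), but it shows why tying the memory subsystem in early matters: contention that is invisible per-block shows up immediately in end-to-end latency.

```python
from dataclasses import dataclass

@dataclass
class TrafficGen:
    # Hypothetical traffic-generator profile: not tied to any real IP.
    name: str
    period: int  # cycles between bursts of requests
    burst: int   # requests issued per period

def simulate(gens, service_time=4, cycles=10_000):
    """Single shared memory port with FIFO arbitration.
    Returns the average request latency per generator, in cycles."""
    # Pre-generate every request as (arrival_cycle, generator_name).
    requests = []
    for g in gens:
        for t in range(0, cycles, g.period):
            requests.extend((t, g.name) for _ in range(g.burst))
    requests.sort()

    port_free_at = 0
    latencies = {g.name: [] for g in gens}
    for arrival, name in requests:
        start = max(arrival, port_free_at)   # wait if the port is busy
        port_free_at = start + service_time  # occupy the port
        latencies[name].append(port_free_at - arrival)
    return {n: sum(v) / len(v) for n, v in latencies.items()}
```

Under light load a lone "cpu" generator sees only the raw service time, while adding a second heavy "gpu" generator pushes its average latency up — the kind of system-level effect that only appears once the pieces run together.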
Kalyanaraman: One thing that would help is for companies that make IP to provide model signatures along with the IP. I don’t need the signatures to run a test. But I do need the model signatures so that I can push the IP through my traffic generator. If you want to operate IP in a certain mode or change a knob, you need to know everything the IP can do and how it reacts to things. Today we are integrating AXI BFMs because our common bus standard is AXI. That does not represent the true signature of these IPs.

LPHP: So the IP is not being characterized at the level you require?
Kalyanaraman: That’s correct, but you still need that.
Orme: We’ve been looking at that in conversations with tools providers. What you’re describing is identifying which characteristics have sensitivity on the system. People might say they need X megabytes, but that’s a crude specification. You have to show the characteristics of the profile, and interestingly enough, that can feed directly into verification. Given those characteristics, you can then try out the range you think an IP might be capable of delivering. Programmable IP may have very different signatures, depending on the software it’s running. And then you can experiment: if you’re building a memory subsystem, you can check that it supports these different characteristics.
Stahl: Writing a generic traffic generator for a piece of IP that is highly configurable is a non-trivial task. If they have a very specific use case, then writing a traffic generator is not all that complicated.

LPHP: So the gist of this is that complexity is forcing IP vendors to characterize their IP in ways they never considered?
Stahl: Yes.
Brown: Characterization and models.
Kalyanaraman: That’s the key. Typically, in SoC integration, the focus is mainly, ‘Let’s get on with the work.’ There’s not a clear understanding of the character of the IP and what it does under certain conditions. This is why it’s so important that we get very accurate signatures and models. It has to come together as a package so we can plug it in and try it out. We have seen some extracted values around interconnects, but it’s very important that the verification engineer knows, for example, the number of outstanding transactions. Nobody knows. It has to come from the IP. What is the IP capable of supporting?
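The "signature as a package" idea could take the form of a small machine-readable record shipped with the IP. The schema below is a hypothetical sketch (the field names, `IPSignature`, and `check_interconnect` are illustrative inventions, not any vendor's format), showing how a stated outstanding-transaction capability could be checked mechanically against an interconnect port configuration instead of being rediscovered by the verification team.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IPSignature:
    """Hypothetical machine-readable IP characterization,
    shipped by the IP vendor alongside the IP itself."""
    name: str
    max_outstanding: int  # outstanding transactions the master can issue
    max_burst_len: int    # beats per burst the master can generate
    peak_bw_mbps: int     # peak bandwidth demand, MB/s

def check_interconnect(sig: IPSignature, port_outstanding: int) -> list:
    """Flag mismatches between what the IP can issue and what the
    interconnect port is configured to accept."""
    issues = []
    if sig.max_outstanding > port_outstanding:
        issues.append(
            f"{sig.name}: issues up to {sig.max_outstanding} outstanding "
            f"transactions, but the port accepts only {port_outstanding}")
    return issues
```

A signature like `IPSignature("video_dec", max_outstanding=8, max_burst_len=16, peak_bw_mbps=1200)` run against a port configured for 4 outstanding transactions would flag the mismatch up front, before it surfaces as a performance shortfall in emulation.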
Neifert: You can do things with traffic generation, but you can miss things with traffic generation, as well. You can solve a lot of that by using the actual model. Traffic generation is great for a certain aspect, but you can’t catch all of the functionality and make sure you’ve done all the arbitration unless you’re using the real IP.