Experts At The Table: Performance Analysis


First of three parts: New verification challenges; the impact of more IP and integration; fitting within a power budget; what’s missing from the flow; performance vs. power.

By Ed Sperling
Low-Power/High-Performance Engineering sat down with Ravi Kalyanaraman, senior verification manager for the digital entertainment business unit at Marvell; William Orme, strategic marketing manager for ARM’s System IP and Processor Division; Steve Brown, product marketing and business development director for the systems and software group at Cadence; Johannes Stahl, director of product marketing for system-level solutions at Synopsys; and Bill Neifert, CTO of Carbon Design Systems. What follows are excerpts of that conversation.

LPHP: How do we improve performance in designs, and what are the hurdles?
Brown: There is still a lot of focus on the creation of the architecture, and we have solutions there that are not yet mainstream. But there is a big challenge on the verification side. It’s a new responsibility for the functional verification team to validate that the architecture’s performance targets are actually being met. This is an emerging process. Corner cases are escaping.
Neifert: The performance problem is changing as design methodologies are changing. People used to differentiate SoC designs by adding large amounts of their own IP and improving performance with that. But with shorter and shorter time-to-market windows, people are re-using more and more IP or purchasing it from other people. Performance used to be how you created the design and iterated it down from spec to implementation, but now it’s about what you can eke out of the third-party or re-used IP you’re getting, as well as making it work with the other IP you’re adding in there. The ways to achieve performance are changing. With third-party IP, you don’t have that deep understanding of the IP anymore. Turning knobs and changing the parameters may mean different things than you expect. A lot of times the only way to figure that out is by trying various pieces together. Any one piece by itself doesn’t perform the same way. The performance problem is a system problem, and hopefully you start understanding how all the pieces work together as early as possible.
Orme: The fundamental problem comes from an increased level of integration. There’s a lot more IP that is competing for resources, such as memory bandwidth. This leads to solutions being generated from IP providers such as ourselves, where a lot of performance-enhancing capability is made available to the designer to use. That capability leads to the need to close the circuit, and once that work is done people don’t want to do that kind of analysis on several solutions. Most design tasks have multiple good solutions. Taking those and measuring what you get is a good way of closing the loop and making sure the performance is verified, as well.
Kalyanaraman: From a verification standpoint, the main problem is that it comes too late in the design cycle to effectively deal with SoC performance. But at the same time, performance is a system problem. When you integrate lots of IP in an SoC, you need to analyze it quickly. You need to do architectural analysis or integration analysis. It’s much harder to solve later, when you move from front-end pre-silicon validation all the way to emulation and silicon validation, because all of a sudden the magnitude of what you need to do to check out something on a chip goes from a few hundred cycles to millions of cycles.

LPHP: So the basic problem is data overload?
Kalyanaraman: That’s part of it. The other part is that there is no systematic way today to solve it. Everything works on a common bus standard, but because there are so many options for IP and because some of these knobs need to be turned at a system level, it makes it extremely difficult to solve at a front-end validation level.
Stahl: The performance-critical part is how you move the data. What you do with the interconnect will determine, to a large extent, the performance you get out of it. We believe that to deal with performance effectively you have to start in the architectural space where you define how to achieve that performance. Once you have done that, you can turn the knobs on the IP. That may give you very good performance, but then you have to go to the next step. At that stage it will be very important that the RTL verification understands what was done earlier. There may be a connection between what was done earlier and the RTL validation. You may find something at the RTL level and then you can go back to the architecture and ask, ‘Am I making the right assumption?’ But fundamentally, you are too late in the process to alter the architecture.

LPHP: We keep hearing about performance being good enough these days. Where is it still the driving force?
Neifert: You can’t look at performance without power. Power certainly dominates, but if you’re using a smart phone and you can’t switch quickly enough and there’s a poor user experience, you’re going to try something else. Enabling you to make that power-performance tradeoff is what it’s all about.
Orme: The design is power-limited. That is putting a constraint on your design team. A classic example is that you can only afford a certain memory interface. That, plus a system, puts constraints on your design. Within that power constraint, you have to get to the maximum performance with less than 1 watt. If you can get to a higher point within the same power budget, you win.
Kalyanaraman: It’s really a zero-sum game. You have to increase performance and lead with power issues. When it comes to performance issues, when you put all these IPs together there are certain interconnect functions that may be broken. So choosing certain IPs doesn’t necessarily mean there will be a reduction in power. It might be that the IPs need to be configured in a certain way. Some of the issues we’ve found had nothing to do with making a tradeoff between power and performance.

LPHP: How many of the architectural decisions are tied to whether you can verify the design on the back end?
Stahl: Of course you have to validate the final, final version of the architecture. However, to get to the right level of performance, you have to model as many application alternatives as you can think of for your chip. If you’re not modeling the majority of that and simulating, you cannot be sure you’ve picked the right architecture. Or, you pick an architecture that’s overdesigned.
Brown: That’s what we see, as well. It’s really important that the architect understands where the performance-critical parts are in the system to start to make the tradeoffs about modeling. The way you do architecture and performance design in the beginning is that you have to model around the parts that are performance-sensitive so you can do some real analysis early of that performance-sensitive part. You have to have a starting point, but you also need a solution that allows you to be nimble if you’re wrong or you discover you’re slightly off in the performance of the performance-critical part of your system. You need to adapt to do that design properly at the beginning. We have more and more tools available for customers to be nimble and to explore more, but they don’t have complete flexibility. Creating the model and design around that is not free or zero time.
Stahl: This is not an either/or. If you cannot model power alongside performance and simulate both together at that level, you cannot make that decision. You will first go for performance goals, because that’s what the market wants. You want to go with the next device for the market, and whatever the power is, that’s going to be the power. But if you can model both at the same time, you can make tradeoffs. You may put a little more in area if that’s needed.
Neifert: It’s not just an architectural problem. The system is more than just the architecture and the hardware. You have to deal with the coherency of the software, too. Software can drastically change the power of the system. Maybe it’s just a barebones architectural model, but you need to get that up and running as fast as possible with an eye toward how the power is being used. If you leave that out of the equation and bring it in late, you might miss big chances to make big impacts on power.
Stahl: The hardware architecture has to go through many, many use cases, particularly for the high-performance processors. Then there is the operating system and firmware, and all that has to run and be optimized to make the best processor implementation. Then you have a power model to show what’s happening with the software.
