Experts At The Table: Pain Points

Last of three parts: Planning ahead and making educated guesses; the need for standards; managing risk; the impact of double patterning on design; verification and assertion synthesis.


By Ed Sperling
Low-Power/High-Performance Engineering sat down with Vinod Kariat, a Cadence fellow; Premal Buch, vice president of software engineering at Altera; Vic Kulkarni, general manager of Apache Design; Bernard Murphy, CTO at Atrenta; and Laurent Moll, CTO at Arteris. What follows are excerpts of that conversation.

LPHP: What comes next requires a lot of guesswork in the design, doesn’t it?
Moll: Yes, but until we have FPGAs that do everything, we’re going to remain on a two- or three-year cycle. That means you’re going to be well ahead of anyone who’s going to be using it. In these big SoCs, the fact that they’re truly heterogeneous makes it hard to figure out which applications are going to be using it, which portions are going to be useful and which are going to be less useful, how the data is going to flow, and what the power profile will look like. There are about five places in the chip where you can locate the display. When you’re turning your tablet around, Microsoft has one way of doing it and Android has chosen three different ways—and they all have different power profiles. All of the hardware underneath is scrambling to take advantage of these options.
Murphy: When you look at software, one of the key things that decides where you split things between hardware and software is virtualization. That has become a method to manage real-time versus regular Android behavior. The question is whether you do it all in the OS, or support it at least partially in the hardware. That will have implications for security and performance. There isn’t an easy answer of where you split things, even for the operating system functions.

LPHP: Who’s responsible when something goes wrong?
Murphy: If it’s Apple, that’s easy. For everyone else, it’s not.
Kulkarni: There were similar questions years ago with DRC (design rule checking). If you get a DRC error, who’s responsible? Is it the place-and-route tool that created it, or the foundry? That used to be a very big headache for everyone, but it has dissipated over time. To solve this, standardization has to occur. There needs to be some kind of handoff with handshakes and protocols—and that includes hardware as well as software standards.
Moll: What we see from the middle of the chip is that standards are very important. And because chips are really assembled things, what companies really need to do to make sure the pieces work together is to create standards at all levels. You need IP talking to the interconnect, and IP talking to IP. The verification is also really important. There is the portion where you make sure that everything is talking to everything else and that it’s spec-compliant. There is a big appetite for this, because once people start assembling things they don’t want to verify everything, but they do want to verify that it all works together. On top of that, at the architectural level, there’s all the SystemC modeling, where you want to make sure the performance is sufficient. People do prototyping and emulation, but that’s extremely slow. To make sure your chip will perform as well as you want it to, you need really good modeling capabilities. If you’re already at the prototyping stage and you find a big problem, it’s hard to reverse course and do something entirely different. The hardware-level standardization of all the blocks talking to each other is one thing. System-level verification is very important, and there are a few standards there that people are working on. And at a high level, there is the high-level modeling that you can run your software and performance on, and that may be able to drive power and other metrics that are very important at the system level. These two levels are where the assembly happens, and where you make sure that what you designed actually works.

LPHP: Are we getting to the point where we have too many choices?
Kariat: Designs are getting more and more incremental. You can see that clearly in the Apple model. They rev their product and each time they change one thing. They move from one core to two cores. Then they add something else. You can’t rethink the whole system. Most of the market is operating on one-year cycles, so it has to be evolutionary. You can’t go back to the drawing board each time.

LPHP: But isn’t that easier for a company like Apple than a company trying to win a socket in a device?
Kariat: That’s why Apple and Samsung have a huge advantage. They are more vertically integrated. That’s also why you see more companies dropping out of that market.
Murphy: You can’t necessarily remove all risk, but you can reduce it. That’s driven very much by de facto standards. If you have some level of signoff on IP, you don’t know that all of this IP is clean and beyond reproach, but you have some confidence that it has been signed off along certain dimensions and that you can integrate it without running into certain classes of problems. That’s a valuable thing that raises confidence and reduces the worry that when you build something, it won’t work.
Buch: With a software program today, you don’t worry about your microprocessor having a bug. You just assume every layer under you works. In the hierarchy, if you go from the lowest level to the software verification, you need to be able to sign off without having to go deep down or you will never get anything done.
Kariat: A lot of the IP is already based on standards, so whenever you have a situation where the IP is supplied by someone else, there is a standard in place. Then there is verification IP, which allows someone to put in a model so they can simulate the system and they can see how the memory is used.
Moll: Some companies have a lot of methodology around the tool flow, and they may have a longer pipeline because of it, but they get to a higher level of quality, and that works for their customers. And then there are the hungrier companies that shorten the cycle, hope the IP will be good quality, and take more risk, such as assembling first or going to prototyping faster. There are shortcuts where you can dial your level of risk. If you want to get an edge over someone else using the same IP, you have to shorten your schedule.

LPHP: One of the big issues facing the chip industry is double patterning. What impact does that have on the design side?
Kariat: Double patterning is definitely disruptive—how disruptive is not clear. It’s something designs need to take into account, but there are practices you can put in place where you don’t have to worry as much about it. There is a tradeoff. If you do more rigid prescriptive design rules you can get away from more of the complexity, but then you may lose density. It’s based on the system tradeoff. It’s a hump that people have to get over. New tools have to come in and people have to adapt to it. At this point, triple patterning isn’t immediately on the horizon.
Buch: There is some impact on placement and some impact on routing, but we are very solid on it. There are routers today that can deal with double patterning. I don’t think it’s going to bring the tools to their knees.
Moll: At 28nm, when everyone wanted a high-performance processor the sticker shock was huge. It will be even higher in the future.

LPHP: It’s the cost of the design, the IP and the manufacturing, right?
Moll: Shrink was great because you used to get more transistors for the same cost. Now it costs you the same for one transistor at 20nm.
Buch: It increases your mask cost. The lower layers will be a single pattern. Then when you go to the higher levels it will be double patterning. They are playing with this now. It’s not necessarily an EDA tool problem. It’s a business issue.
Kariat: People may do M1 to M4 as double patterning, and other layers single patterning.

LPHP: Verification has been a problem in complex designs, and there is a lot more black-box verification going forward. Do we have sufficient coverage to be confident that what comes out will work?
Moll: The reality is that nothing really works perfectly, but it works well enough. These days, if you have a huge chip you know there are hundreds of bugs buried in it. You work around that in software. More gates means more bugs.
Kariat: When we talk to customers, we hear the same discussion about quality and coverage for software.
Kulkarni: One of the problems we’re hearing about involves the testbench itself. They’re either ATPG or APPG. What will exercise the chip or the subsystem for verification? That’s the first problem. There are huge testbenches. Tablet designs are now at a 1.1 billion-gate equivalent, and these aren’t even gaming GPUs, which are known for that kind of complexity. That’s the next challenge I see.
Murphy: An interesting trend is assertion synthesis. As these things get bigger and bigger, writing dedicated testbenches to simulate is becoming non-viable. What you really want to do is run the applications on them, but then you want to have visibility into any strange things going on as those applications are run. The challenge is that it’s hard to write assertions. There’s a very specialized skill set to writing assertions. Is there a better way?
