Experts at the Table, part one: Where the holes are showing up in tools and flows for advanced designs. Not all vendors or tools play nicely together, and no one really knows what’s going to happen with finFETs.
Semiconductor Engineering sat down to discuss the attributes of a high-level, front-end design flow, and why it is needed at present, with Leah Clark, associate technical director for digital video technology at Broadcom; Jon McDonald, technical marketing engineer at Mentor Graphics; Phil Bishop, vice president of the System-Level Design and Verification Group at Cadence; and Bernard Murphy, CTO at Atrenta. What follows are excerpts of that discussion.
SE: Conceptually, what are some of the descriptors and attributes of a high-level, front-end design flow today?
Clark: Just to be clear, I start after RTL, so I take RTL that someone else has created and try to implement it. Some of our greatest challenges are handling the interfaces because I do everything from RTL through to PG (power grid) netlist, but I’m not doing P&R (place & route). The biggest challenge we are facing now — which may be the same challenge we’ve been facing for a long time — is not throwing RTL over the wall to implementation and not throwing netlists over the wall to P&R because they get closer and closer as the technologies get more advanced. One of the specific challenges I’m having right now is that a lot of the RTL coders are actually software engineers, not hardware engineers, so they write code that doesn’t look like hardware and then they want us to implement it and they don’t understand why we can’t. That’s difficult.
The flow itself looks like a dozen different tools all trying to talk to each other, and we have all of your tools, and that's a challenge as well, especially with emerging topics. I work specifically in low-power implementation, down in the nitty-gritty of level shifters and isolation cells and switches, and everybody has their own way to specify those and they aren't always compatible. Compatibility in the flows is a big deal, and that problem is always going to recur. UPF will get old and tried and true, and it will be good, but there will be something else that will come up. That's an ongoing challenge. Same problem, different format.
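The level shifters, isolation cells, and switches Clark mentions are exactly what UPF (IEEE 1801) is meant to capture. As an illustrative sketch only — the domain, net, and signal names here are hypothetical, not from any Broadcom design — a power-intent fragment might look like:

```tcl
# Hypothetical UPF (IEEE 1801) fragment -- all names are illustrative
create_power_domain PD_TOP
create_power_domain PD_VIDEO -elements {u_video_core}

# Supply nets for the always-on and switchable regions
create_supply_net VDD       -domain PD_TOP
create_supply_net VDD_VIDEO -domain PD_VIDEO
create_supply_net VSS       -domain PD_TOP
set_domain_supply_net PD_VIDEO \
    -primary_power_net VDD_VIDEO -primary_ground_net VSS

# Power switch gating the video domain's supply
create_power_switch sw_video -domain PD_VIDEO \
    -input_supply_port  {in  VDD} \
    -output_supply_port {out VDD_VIDEO} \
    -control_port       {ctl video_pwr_en} \
    -on_state           {on_state in {ctl}}

# Isolation clamps the domain's outputs to 0 while it is powered down
set_isolation iso_video -domain PD_VIDEO \
    -isolation_power_net VDD -isolation_ground_net VSS \
    -clamp_value 0 -applies_to outputs
set_isolation_control iso_video -domain PD_VIDEO \
    -isolation_signal video_iso_en -isolation_sense high

# Level shifters on signals crossing between voltage regions
set_level_shifter ls_video -domain PD_VIDEO -applies_to outputs -rule both
```

The incompatibility Clark describes shows up because each tool can interpret or support these commands slightly differently, even though the format itself is a standard.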
We spend a lot of time working with vendors. We sometimes have to play them off of each other, because one vendor will say, 'No, we won't fix that.' And we say, 'Okay, we'll go use this other tool.' Then it's, 'Well, wait, maybe we will fix it.'
Murphy: We all play together nicely, don’t we?
Clark: Often you guys do. There are a couple different relationships that go head to head, but for the most part the relationships are pretty good among the companies.
In the flow I work with — and we have over 300 users within Broadcom on the flow I work on — we don't phase tools out. We always use both LEC and Formality, for example, because they have different strengths.
SE: How do you use all of the tools within the flow?
Clark: We have a wrapper environment around the tools so my group supports that wrapper environment and in my group we have the VSI-LP expert who is also usually the CLP expert. I’m the formal logic equivalence expert so I can do LEC and Formality and synthesis. Then we have the static timing expert who can do GoldTime and PrimeTime and maybe something else soon. So we share our expertise, but if there is a problem in this part of the flow it gets directed to the people who are expert at it. And then we slowly train the users on the things they need to do to make their design compatible with that tool.
Murphy: You said the keywords. IP, qualification, assembly, PPA, verification.
Bishop: I must be at the very high level here because I’m more TLM design and verification. My responsibility is research and development at Cadence for things in the system domain so virtual platforms and prototyping, software-driven verification. Those are the things I think of with the high level, front end of the design flow. Eventually I get to RTL when I’m done.
Clark: When you say verification, you mean of the function, not of the implementation?
Bishop: More functional and timing, but not the physical implementation.
McDonald: That's my focus, as well. The one thing that ties it together across the boundaries is when you talk about your problems in interfacing. That's really what ties everything together, because what we've seen with a lot of customers is they've got a good flow for one tool, for one environment — one tool, one flow, one process. We've all put out tools that do a good job at what they do, but as you start to interface to other things — having the system-level description, knowing what you should do, what you should build, and how it should interface — that's where people run into problems. That's where we're at with the higher-level tools and some of the system-level analysis: being able to specify the functionality up front — can I test it and make sure my algorithm really works? And even that's not enough, because it doesn't really work unless it performs the way it needs to, like your comment about the RTL guys who give you RTL and say, 'Make this work.' There's an implied performance characteristic and power characteristic there, and if they specify something that may work in the system context but can't be built, you've got to deal with that.
Bishop: Something we're doing now at Cadence that's at an interface level is physically aware high-level synthesis. We're trying to understand the physical effects, the timing — the PPA, essentially, of the actual silicon — and then migrate some of those interface requirements and aspects of the eventual physical design up through the flow. At the high-level synthesis level, there are things you can do. You can't do everything, but you can link with logic synthesis and physical synthesis tools, you can look at congestion, you can look at certain power aspects, and you can at least drive it from a high level, from an algorithmic perspective. That's one of the big research areas.
Murphy: I’m going out on a limb here. One of the interesting things that I would observe is that one of the ways we were going to bring the implementation of the RTL together was through IP-XACT. IP-XACT in my view is going down for the third time. It is definitely going down. The idea that you can stitch together RTL and registers and TLM and so on, is just kind of falling apart. Both Cadence and Mentor are doing some interesting things in this area, but the actual assembly of the RTL is back in RTL and it’s happening a lot of different ways. It’s happening through spreadsheets and scripts. It’s happening through some very interesting scripting that is not building the whole design but is building these integration infrastructure things, like the way you manage all the clock generation and gating and so on. It’s something you can automatically assemble in the RTL. But it’s in RTL. It’s not in IP-XACT.
SE: Where are the gaps today?
Clark: There's a gap in the language that we speak between the high-level implementation person who's going to capture something in SystemC or even SystemVerilog and the person who needs to worry about where they will create the clock. Where's the root of the clock tree? There's a huge gap. And especially in low power, because it's new, there's a huge gap in the power intent. The architect knows exactly what he wants, but I need to see it in UPF — how do I get from there to there? And once I capture it, how do we make sure it's accurate?
Bishop: She brings up a good point. UPF/CPF — whatever your format of choice — it’s a bit of a challenge to migrate those aspects up into the high-level design. Some of the things that we’re doing are more architectural exploration for power because we think we can have a bigger impact at that level.
SE: But that doesn’t address what Leah is saying.
Bishop: But that doesn’t address it, no. What you have to do is migrate some of the high-level constraints all the way through to the UPF.
SE: How do you do that?
Bishop: It’s very difficult. Right now, we’re doing a lot of work with the RTL side and by hand.
Murphy: P1801 is working on some stuff in that area but that’s a ways out.
Clark: We have this problem with lots of different things, not just UPF. We have it for the constraints, and for the code itself — everything.
SE: That system-level power modeling concept is extremely interesting. What do you think?
McDonald: Absolutely. One of the things that we're talking about is that you can be creative at an architectural level, and there are many, many choices in how you can do something, and you don't know what the best way to do it is until you've done it. You get down to the implementation level — you can't do a waterfall model. You start with something, you have some tradeoffs, you have some flexibility, and then the implementation is going to feed information back up. There has to be a bit of iteration between the architecture and the implementation, because if you knew the best way to do it, you'd just have done it. But that doesn't happen. People have to iterate. What we see over and over again is people build exactly what they specified, but it doesn't meet the needs, or it's not enough to satisfy them.
Bishop: I've been talking about high-level synthesis, and that's typically used in the case of new IP. But when you bring in IP blocks from the outside, there is a real challenge, because you have to tie all of these different IP elements together, and then there are the power aspects of all of those IP blocks and how they affect the whole SoC. At the high level, there are huge verification challenges. And there are huge power issues, because at the SoC level that's a totally different area of optimization than these specific blocks.
Clark: It can also depend on the actual use model because you could have the same hardware for three different devices and the use model is going to be different enough that your hardware implementation may need to change, even if the function is the same. That’s really challenging. The people who are designing the function don’t always have that visibility or they’ll say, ‘No, my design is right; it worked before.’ But we’re using it differently. Maybe it needs to work differently.
Murphy: There’s a new level of implementation. There’s the level of implementation we all know about, which is the place and route, and timing and so on. But there’s now a level of implementation at RTL. This intersection of power and RTL design has created this need to structure hierarchy because you’ve got power domains, so you’ve got to structure the hierarchy. And that power structure is changing. It’s dynamic. It changes continuously up until almost the final drop, and that means you are doing things that kind of feel like implementation in the RTL to modify — obviously what you are doing at the architectural level is ultimately going to have the most impact — but still what you are doing at the RTL level is having a lot of impact on the power.
Bishop: There’s a lot of refinement at that level.
Clark: And it can limit your late changes. We had an issue where we had to change our default domain. It changed the RTL.
Murphy: Much more simply, ‘Oh, this thing I can turn off. Oh no, I can’t because that needs it.’ So now it’s got to be ‘always on.’
McDonald: You’ve got three different ways you’re trying to use a piece of IP, which is basically the same function. And in those three different use cases, a different architecture would be better and more efficient. The power is going to be better, and a lot of times the people doing the implementation don’t know how it’s going to be used. So that’s got to be driven down and the people doing the architecture don’t know the options below. So we need to pass information up and down.
Bishop: I was going to ask Leah real quickly: Do you use finFETs?
Clark: In my specific group, not yet. We have some test chips in Broadcom. From my understanding, the migration is not as simple as the shrinks we've done in the past. It's a whole different world. The physics of finFET devices is very different. You want different placement constraints. You want bigger, higher-input cells rather than more, smaller cells, because you're going to get better density overall. Conceptually, the library needs to change — not just the physical design, but the library of available cell types needs to change. I haven't actually tried to build anything in it, but I would imagine the synthesis algorithms would need to change.
Bishop: That, I'm not so sure. We have one customer that's on the lunatic fringe, and one that I have some work going on with, and the interesting thing is that the complexity of the modeling has gone way up. They are spending a lot of time working on BSIM-CMG and specialized modeling so that they can get the right parasitic information and then roll that up into something like a .lib file that starts to find its way into the normal design flow. The hope is that the tools aren't going to change that much, but I'm not so sure.
Clark: From what I understand, in synthesis, you sometimes prefer a lot of little cells and in finFETs, you don’t want that because the cells are bigger physically so you need to tune at least the synthesis tools to pick the 8-input MUX instead of four 2-input MUXes.
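Steering synthesis away from small cells, as Clark describes, is commonly done by restricting the library the mapper can use. A hypothetical Synopsys-style Tcl sketch — the library and cell-name patterns are made up for illustration — might look like:

```tcl
# Hypothetical Design Compiler-style Tcl -- library and cell names are illustrative
# Discourage trees of tiny 2-input muxes so the mapper prefers wider single cells
set_dont_use [get_lib_cells mylib/MUX2X1*]

# Keep the weakest drive strengths out of the netlist where fin-quantized
# drives mean one larger cell is denser than several small ones
set_dont_use [get_lib_cells mylib/INVX0P5*]
set_dont_use [get_lib_cells mylib/BUFX0P5*]
```

This is the kind of tuning Clark alludes to: the synthesis algorithm itself may not change, but the cell choices offered to it do.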
Murphy: From a front-end point of view, all of these are obviously extremely important at the cell level and memory level, and so on. But it’s not quite so clear what impact finFET has on how you do logic design.
Bishop: I’m kind of struggling to know if there’s anything we need to do at a really high level, either architectural or whatever it might be, that could have an impact. I suspect that there is, but we’re not going to know enough until TSMC tells me exactly what the parasitic information looks like, and exactly what I’ve got to deal with, and after we characterize the cell library and we do a few designs, then maybe I’ll know more about it.
Murphy: There might be some indirect effects. Here's one: You can't use biasing in a finFET, because biasing is a body thing and there's no body in finFETs. If you look at the way the MCU guys work, biasing is how they turn down the power most often, because they don't want to mess with the architecture. So now you can probably do more sophisticated clock gating and more sophisticated DVFS. Maybe the guys who historically would just turn the bias knob have to start getting into some trickier approaches.
Clark: It’s interesting esoterically, but when I think about building it into the flow, it makes me really tired.
SE: Wait, the EDA companies always say that they’re going to hide all of that complexity from you.
Clark: We don’t want it hidden. We can’t design it if it’s hidden.
Bishop: Having spent some of my life on the back end of the design flow, extraction is going to be very different. It's all 3D extraction now. With 2.5D — the simplification a lot of EDA companies have used forever — you can't extract those parasitics without field solvers, because it's not a planar device anymore. That, I know, is a huge change to the design tools and the flow, but I'm not so sure where it will show up in the front end.