Experts At The Table: Concurrent Design

Second of three parts: Cross-training; productivity for experts; where the real costs are at 22/20nm; how resources are utilized by companies of different sizes.


Low-Power Engineering sat down with Marco Brambilla, ASIC design manager at STMicroelectronics; Charlie Janac, president and CEO of Arteris; Mike Gianfagna, vice president of marketing at Atrenta; and Javier DeLaCruz, director of semiconductor packaging at eSilicon. What follows are excerpts of that discussion.

LPE: Is there cross-training going on to allow for concurrent design?
Brambilla: Yes, but the first step is that you need the teams to know what’s available. That includes training the managers and having good internal discussions and distribution of knowledge. At the initial phases you need the packaging guys in. You need the test guys in because if you put in an embedded DRAM and it takes three minutes to test, that’s not an option. We have the packaging, test, the back end and all the functions.
Janac: Do you have people in ST that are responsible for the overall methodology?
Brambilla: Yes. It’s a little more bottom-up, though. We know what kind of ASIC we do. Every division in ST has a more functional approach because we do it all. So we have central R&D that comes up with a reference flow and tools. And inside the divisions we have dedicated people who think about the best flow to implement what we do. But design teams no longer have time to think about whether they should invent the next clock distribution scheme. I want someone to tell me that with this kind of complexity you go with a mesh.
DeLaCruz: Are you doing the same number of tapeouts now as in the past?
Brambilla: No.
DeLaCruz: So here’s the problem. No one is doing nearly as many tapeouts now because what used to be $100,000 for a mask set is now $3 million.
Brambilla: That’s not the big issue. The big guys with 60% or 70% market share don’t care about the cost of a mask set. The problem is productivity. You need 4x productivity at each new node. I had an ASIC at 65nm with 25 sub-chips, and every piece of this thing was different. So we will need 100 sub-chips for the next version at 22nm. It’s not the $1 million or $3 million for the mask sets. It’s the $40 million or $50 million to develop the ASIC.
DeLaCruz: But there’s also the issue of having all these high-end specialists around. If you’re going from 25 chips a year to four chips a year, then you have all these people who are going to be intensively involved in a chip for only five or six weeks. You can’t have that. It’s going to drive the need for cross-training and concurrent design. You can’t align things vertically anymore. You need broad levels of expertise.
Brambilla: I hear what you’re saying, but the big issue we’re seeing is productivity. We don’t have people idle because they’re doing fewer chips. To do those four chips today I need the same number of people that I needed to do 16 chips at 65nm, plus some more.

LPE: Where are the biggest problems in concurrent design? Is it the software and hardware, verification, or something else?
Janac: It’s basically about wires and gates. The gates scale but the wires don’t, so you need a better way of managing the wires and assembling the SoC. You can’t afford to re-do all of them in the next generation, so one of the big issues of IP re-use is how you support the protocols those subsystems communicate in, and how you get them integrated more easily into the next generation of the chip. It all comes down to architectural improvements to get to the next generation.
Brambilla: The next time you do a chip you need more bandwidth. Your Verilog is probably useless—or at least it’s not efficient. It was efficient when you designed it in that node. If you change the frequency there’s a problem.
Gianfagna: You’d need to change the microarchitecture, which is hard to do with Verilog.
Brambilla: Yes, so you’re redesigning it. To me there is a big issue every time you cross from software to hardware, which is co-development. When you go from RTL to the physical world it’s more co-development. When you go from silicon to the package that’s more co-development. These used to be more than just separate islands; they were like separate continents. But the infrastructure today doesn’t help you as much as you need to increase productivity. I would like to move people to just describing the algorithms and have some tool generate the RTL, but that tool should generate the RTL knowing the physical constraints. The RTL should be able to predict power and congestion issues. Today we have problems of power integrity because at 32/28nm and 22nm the density of the gates cannot be supported by the power grid.
DeLaCruz: What if you use two pieces of silicon instead of one? How do you deal with your structure then?
Brambilla: You can only handle that at the top level. This is something that requires training. It may make sense to do a 5mm-square die for each ASIC and create more efficient communication between them. It costs more, but it may shave three or four months off the development time.
Gianfagna: What you’re describing is the need for better methodology with a globalized company and more localized infrastructure to use those resources. ST is a big enough company to have the resources to make that work. But what about the guys who don’t have that luxury? There are a lot of fairly large fabless companies that don’t have infrastructure to allow that to happen. How are they going to get to this new level of integration and new way of working? That’s a big challenge.
Brambilla: I know of five companies today that do ASIC services through the fab.
DeLaCruz: But ASIC services can be another island, unless they’re totally integrated with their supply chain.
Brambilla: It’s a huge problem.
DeLaCruz: Historically, there were design services for chip layout and packaging services. You can’t isolate those. It’s easier to get people to overlap in the same company. It’s really difficult to get people to overlap in different companies.
Brambilla: That’s why ST decided not to go fabless. At 20nm, if you don’t control the process, how are you going to tune your back-end flow? How much does it cost to run silicon at a third-party fab to verify whether your mesh clock tree or H-tree works?
Janac: But don’t the big guys have process teams? Guys like Qualcomm are basically running their own process.
Gianfagna: Yes, and if you look at their org chart you’d swear they own a fab.
Janac: But what you’re saying is that’s not the case with medium-sized companies, right?
Gianfagna: Yes, there are a lot of those companies.

LPE: In 3D stacking you may have a platform developed by a large IDM bolted onto something else. Does that work with the existing players and infrastructure, or do we need to re-think the design process?
Janac: If the bridges are well defined, you can make that work. You can envision an analog die in 90nm and another die in 22nm going to a memory. As long as the way it comes together is well defined, it should work. I don’t see another choice. Otherwise these mid-size companies go to FPGAs, or they become IP providers, or they die.

LPE: What you’re talking about is concurrent design across an ecosystem, not just within a single company, with a focus on everything from interoperability to power.
Janac: That’s right.

LPE: But it’s never been effectively done.
Janac: Companies like ARM can organize an ecosystem across multiple generations of products and multiple companies. We need to see more of that. If someone defines a 3D silicon methodology it can work. There aren’t other choices. A small guy cannot afford to make a 22nm chip. They may be able to go to a company like eSilicon, but there won’t be enough capital around the small and medium-sized guys to go to the latest nodes.
Brambilla: If you’re a startup, you need to prove your technology. If you’re lucky you can prove it at 90nm and then you hope you can be bought. If you’re trying to prove it at 20nm then your best bet is to be part of another company’s mask set. If you’re very small, you might have to wait until there are enough contributors to that mask set. It is true that you also need the ecosystem outside, and you will need some way of describing that—almost a super version of IP-XACT. But inside the ASIC we need to start automating the tradeoff analysis. I want people to stop writing Verilog, write the algorithms instead, and then use a tool chain that allows them to converge toward silicon in a way that avoids all the issues you deal with today.
Gianfagna: You’re describing a top-down design methodology that comprehends hardware-software co-design, partitioning and physical implementation issues, and which balances it from the algorithm all the way through. That’s a great vision. But an alternative vision is that it’s too hard to do that. What if you come up with a hardware-based design flow that targets a large market with the ability for customization in software, and then you build a chip to address that? Now the co-design problem becomes, ‘Which architecture is most compatible with my software?’ I can just use that chip and customize the software. We’ve been predicting this for a long time, namely that all the differentiation becomes software.
Brambilla: We do have some progress in that direction. I see it as an intelligent way of attacking certain markets. I don’t see it in the switching market or cell phones. Marvell designs a chipset, throws functions into it, and gives it to Nokia or whoever they like.
Janac: Their volume is just barely enough to stay in that business.
Gianfagna: MediaTek has a similar strategy and they’re selling into the Chinese market.
Janac: But their stuff is highly optimized.
Gianfagna: That’s true. But the cell-phone market and the smart-phone application are very similar. We have 3G, 4G and a way to deliver the video. We have Wi-Fi. That all gets standardized. So the way that ‘Vendor A’ differentiates itself from ‘Vendor B’ is the software interface and maybe some clever stuff with touch screens. It’s more mechanical.
Brambilla: In that space I agree with you.
Janac: I don’t. One of the things that’s happening is we are in a computing architecture switch, from PC-server to the cloud. What people have gotten wrong is that those edge devices will need to become extremely sophisticated. The cloud will not always be available, and you will need that sophistication to take advantage of the information that’s in the cloud. So those devices are going to go through a huge amount of innovation and become far more powerful than they are today. It may take several years, but it will happen.
DeLaCruz: If you’re very highly standardized, you can probably program software to make some tradeoffs for you. When you’re dealing with a wider range of chips with analog content and some interface into memory you’re dealing with very different problems. I don’t think I would trust an EDA tool vendor to think of all these different options. They’ll implement certain things, but they’re going to be behind the curve by at least a year.
Janac: With the physical layout the tools were driven by design rules. But at the architectural level you really need IP. Without IP the tools do not have any reality. We’re going to see a combination of tools and IP at the architectural level. Without IP, ESL is a $50 million market. On the other hand, if you have the tools and the IP you can generate a lot of value. ARM cores will come with tools. Our interconnect will come with tools. The memory controllers will have tools. You’re going to see a unification of IP and EDA at the architectural level.
DeLaCruz: At that point in time the only options you’re presenting yourself with are the ones the IP vendors are giving you. It’s limited. But stepping back and taking a higher-level view, there may be a different way of looking at this problem.
Janac: The economics are forcing each company to build the IP that’s core to its value. Otherwise it’s too hard to be competitive across all the IP and 60 subsystems. You have to pick from a menu of IP to build those parts of the chip that economics don’t allow you to build yourself.