Experts At The Table: Pain Points

Second of three parts: Stacked die; the limitations of existing tools; making software more efficient; predicting the future in long design cycles.


By Ed Sperling
Low-Power/High-Performance Engineering sat down with Vinod Kariat, a Cadence fellow; Premal Buch, vice president of software engineering at Altera; Vic Kulkarni, general manager of Apache Design; Bernard Murphy, CTO at Atrenta, and Laurent Moll, CTO at Arteris. What follows are excerpts of that conversation.

LPHP: With stacked die it’s no longer one company making an SoC. What does it mean for design, the tools and the ecosystem as a whole?
Kariat: There is definitely a trend toward multi-chip modules and 3D ICs. Some of it is because the economics make sense. You can design certain parts of the system at a much cheaper technology node and get much better yield. You can package them together and still fit them together in a form factor that works. That trend should show up more and more. Once you put them on the same process technology, you’re paying for the real estate so you want to use whatever you can, which is harder.
Kulkarni: Whether it’s 3D or 2.5D, in every device we use there are already multidimensional ICs. The customers we’re talking to are working with logic and memory, not logic on logic. So how do you manage all of this across the chip and through the interconnects? We see multi-domain analysis for the next generation of tools. The same tools will not be able to cut it. So to push the next technology node, if you have iTunes and Facebook and other applications, all of them take advantage of different circuits in a processor. The tasks have to be highly optimized. You don’t need to push to 20nm to listen to music. At the same time, if you are doing video streaming, you’re probably going to need the most advanced node. It will require a different profile in terms of high-speed buses, how things are integrated, and how they communicate with the outside world. People have to start looking at application-specific approaches where the software will drive how the hardware is architected as opposed to creating hardware and writing the software applications to that. As solution providers, we have to figure out how to deal with a higher and higher level of software control and see how we add value to that chain, from software architecture to hardware architecture to physical to the chip, the package, the subsystems, and so on. And then we need to look at EMI. As an industry, we have to go beyond the chip and look at a simulation-driven product design.
Buch: People are going down the wrong path with concern about tools. 3D is not necessarily a place-and-route and implementation issue. Some of that will be needed there, of course. The bigger issue, though, will be how you divide it up for power, for performance, and whether you use a different node or the same node. Process estimates and power estimates are key. Once you figure out you want to do it this way, you can take any existing partitioning tool and, with some tweaks, make it work. I don’t see many people talking about TSV as a system-modeling problem.

LPHP: Aren’t the big problems how you integrate and physically put all of this stuff together?
Moll: In many ways, 3D is just a way of partitioning. The partitioning has existed for a while. A particular subsystem may have its own power island, its own power supply, and it may be dark silicon most of the time. When you’re playing audio your video may be off. The GPU guys get to use the low Vt’s because they need megahertz while the other guys don’t. They already look like different chiplets. In some ways it is the same problem. It’s about integrating all of these things. Do we write software and ship drivers for all the pieces? In an old system, the guy doing the system could assemble it by buying a CPU from here, a GPU from there. Today you’ve got the power, the EMI, the software, the methodology for verification, the methodology for modeling, so when the chip gets taped out someone knows what it does. This is what we do. It’s a big integration issue from the stack of software all the way to the assembly of the IP and how it’s going to work.
Murphy: But how many degrees of freedom do you really need in the hardware? It’s going to cost $100 million or $200 million or $500 million to produce a chip. At some point it doesn’t make sense anymore. It’s simpler to solve the problem in software with better applications. If it’s going to cost $500 million to build it, it has to be able to target a multi-billion-dollar market.
Kariat: That’s already happening. If you look at the SoCs that go into mobile phones, you have a system and you’re doing a lot of things in software. But one thing that we’ve seen over the last 12 months is a lot more companies jumping on the train to the next node than we have ever seen. We expected the microprocessor companies to be there and the FPGA companies to come right after them, because they need lots of area. Then you see the GPU companies, and then everyone else waits. That’s not what’s happening. We’re seeing the initial people come in, and then right behind them companies are moving in that you wouldn’t expect to be there this early. Power and integration drive that. The reason they’re all moving there is they all want to be in that little device, and that little device is packing in more functionality. Contrary to what we might think about not needing any more horsepower, there are a lot of applications we still can’t do on mobile devices. We can do more computing with lower power, but even if you look at something like voice recognition, there is a lot more we could do with it that the hardware cannot enable.
Moll: The Internet is catching up. There’s a massive influx of people who want to participate in the whole mobility change, whether that’s cell phones or devices like tablets. Wireless infrastructure is a hot topic because everyone needs infrastructure. There’s also the cloud. This is creating a mini gold rush for these processors. They all want to participate and be competitive, and the only way to do this is with the latest and greatest from everybody.
Kulkarni: In the past people would say they had to meet timing performance. Now they say they have to meet the power budget. Many senior directors are saying they want the highest performance possible, but it has to meet a power budget. And that budget is for any range of applications.

LPHP: Isn’t it even beyond the chip? Aren’t we talking about the power budget for an entire device?
Kulkarni: Yes, it’s the system. It’s the subsystem and the system.
Murphy: HP has a program called Project Moonshot. It can collapse 100 blades down to a single blade. You can’t do that unless you can have a much lower power footprint.

LPHP: Does this happen faster at the data center/cloud level because they’re less price-sensitive?
Kulkarni: They started the trend that got transferred to the handheld mobile devices. Now it’s blending.
Murphy: It’s very much driven by the cloud. The whole social media revolution has created a huge market, but the infrastructure is not there today to handle all that traffic. It’s a different problem from the handsets. You don’t necessarily need a lot of compute power to handle Facebook traffic. You just need a lot of CPUs.

LPHP: There are lots of different types of software—software that controls things, embedded software, middleware, operating systems and applications. Who’s responsible for making sure that works well in the future?
Kariat: An abstraction layer is going to exist. The paradigm is to extract what you do on the hardware. At the application level you don’t want to deal with any of this stuff. People are going to higher and higher levels for databases and database transactions.
Kulkarni: Just like you measure hardware, with any application software there should be a power meter so for any application you know the quality of that software. That kind of ‘Green’ stamp may become standard. A lot of people are talking about that: how to evaluate application software. Software is causing a lot of headaches. Even the big guys like Google and Microsoft are designing low-power software, doing dynamic scaling of that as if you were doing the same thing with the hardware. The software will be measured against the power it consumes. That’s a very important trend. It puts the software developers on the hook just like the hardware developers.
Buch: If you look at all the systems-on-chip, the question is how you address a system. You need all this stuff to boot up your processor core. Then you need all this programming. We have been looking at OpenCL applications where you can go from that to RTL, which can then be synthesized onto the FPGA. But you still need the ability to figure out what goes into software and what goes in hardware, starting with a high-level description. That’s a very interesting problem. In the coming few years we are going to move to an SoC platform where there will be some programmable logic where you draw your software differentiation and interconnects and program at a high level for a wide segment of the market.
Moll: To some extent, when you’re starting to design a chip, you’re designing for three years from now. If you’re designing for Android, you know what they’re going to do in six months or a year. But if you’re designing a chip for three years from now, what do you do? To actually create all of the hardware and the software the OS guys rely on, you need to have a fairly educated guess about where things are going.