Last of three parts: Connecting software and hardware teams, creating useful power models, process improvements
By Ed Sperling
Low-Power Engineering sat down to discuss progress in the realm of power management with Ambrose Low, director of IC Design Engineering for Broadcom’s mobile platforms group; Ruggero Castagnetti, distinguished engineer at LSI; and Andy Brotman, vice president of design infrastructure at GlobalFoundries. What follows are excerpts of that conversation.
LPE: Has there been any progress in providing a feedback channel for software development to link that with the hardware design effort?
Low: Our software team is larger than our hardware team now. The consumer and the market are driving that.
Castagnetti: Part of that has to do with whether there are good models. One issue is how you predict power consumption. If you write code this way or that way, how good is that number compared to the end number? That’s part of the issue.
LPE: And if you have your cell phone on and you’re searching for base stations, you’re consuming power. Accelerators are running in the background and not even showing up on your dark screen. But is this even being considered by developers?
Low: We certainly have the use case model for this scenario.
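A use case model of the kind mentioned here can be pictured as a list of power states, each with a duration and a power draw, from which average power and battery life follow. The sketch below is a minimal illustration under assumed numbers: the state names, durations, milliwatt figures, and battery capacity are all hypothetical and do not come from any of the companies in this discussion.

```python
# Minimal use-case power model: each entry is (state, seconds, milliwatts).
# All names and values are hypothetical placeholders for illustration.

def average_power_mw(use_case):
    """Time-weighted average power over one cycle of the use case."""
    total_time_s = sum(seconds for _, seconds, _ in use_case)
    energy_mj = sum(seconds * mw for _, seconds, mw in use_case)
    return energy_mj / total_time_s

def battery_life_hours(capacity_mwh, avg_power_mw):
    """Idealized battery life, ignoring conversion losses and aging."""
    return capacity_mwh / avg_power_mw

# Dark screen, phone periodically searching for base stations (assumed cycle).
standby_scan = [
    ("deep_sleep", 9.0, 1.0),
    ("base_station_search", 1.0, 100.0),
]

avg_mw = average_power_mw(standby_scan)
hours = battery_life_hours(5450.0, avg_mw)
print(f"average power: {avg_mw:.1f} mW, standby life: {hours:.0f} h")
```

Even in this toy cycle, the one-second search dominates the energy budget, which is the kind of background cost the question above is pointing at.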
LPE: The foundries have a super low-power process. What’s different there?
Brotman: Pieces of that are fundamental engineering—engineering the channels for low energy and high k/metal gate structures. Those are all part of putting together a transistor that is low power in operation and low leakage when it’s in standby. On top of that we have reference flows in place to do power predictions on the devices.
LPE: So how much will the process save us over the next couple of nodes?
Brotman: We’re going to see improvements, but as we push down in process nodes we’re also going to have penalties. To some extent these things partially negate each other. We get improvements from node to node. High k/metal gate is showing improvement going from 45nm to 32nm. High k at 45nm is less leaky than at 28nm.
Castagnetti: The past improvements we’ve seen will never offset the designs again as we double the number of transistors and increase data rates.
Low: As we move to 22nm/20nm, the Vt level is decreasing. The EDA vendors have to make sure the flow is optimized for lower portable power.
LPE: What happens when we go into stacked die? Will that give us a big boost in power savings?
Brotman: 3D packaging will result in power advantages. Wide memory sitting on top of a graphics core will allow you to drive the power down. There will be lower parasitics and less inductance, which will allow you to drive that power down, as well. There are other things we need to take into account with 3D, though, such as temperature management. If we don’t deal with those, they can impact power in the wrong direction.
Low: When you stack the die you minimize the interconnect distance. That can help reduce the dynamic power. But thermals become a problem. As temperature increases, leakage increases, so you have to make sure that the temperature does not become a problem.
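The tradeoff described here can be put in rough numbers. The sketch below uses the standard CMOS switching-power relation, P = αCV²f, for the interconnect, and a simple exponential rule for leakage versus temperature. The wire capacitance per millimeter, the wire lengths, and the "leakage doubles every 20°C" rule are assumptions chosen only for illustration, not figures from the panelists or from any process.

```python
import math

def dynamic_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """CMOS switching power: P = alpha * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz

def leakage_power_w(p_ref_w, temp_c, ref_temp_c=25.0, doubling_c=20.0):
    """Assumed rule of thumb: leakage doubles every `doubling_c` degrees C."""
    return p_ref_w * 2.0 ** ((temp_c - ref_temp_c) / doubling_c)

# Hypothetical interconnect: 0.2 pF per mm, 10 mm planar vs. 1 mm stacked.
WIRE_CAP_F_PER_MM = 0.2e-12
planar_w = dynamic_power_w(0.1, WIRE_CAP_F_PER_MM * 10.0, 1.0, 1e9)
stacked_w = dynamic_power_w(0.1, WIRE_CAP_F_PER_MM * 1.0, 1.0, 1e9)
print(f"interconnect power: planar {planar_w:.2e} W, stacked {stacked_w:.2e} W")

# The same stack running hotter gives part of that saving back as leakage.
for temp_c in (25.0, 45.0, 65.0, 85.0):
    print(f"{temp_c:.0f} C: leakage x{leakage_power_w(1.0, temp_c):.1f}")
```

Under these assumptions, shortening the wire cuts its switching power in proportion to length, while each 20°C of additional heating doubles leakage, so the net result depends on how well the stack is cooled.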
Castagnetti: The true 3D, which will offer a significant improvement in power savings, is quite far away. From a 2.5D approach, one of the key promises from a wide I/O is shorter latency. You will be able to push more data through. That’s a fundamental benefit.
LPE: How close are we to 2.5D?
Brotman: We’re pretty close. We’ll see the first ones quite soon.
LPE: Will we immediately see a power savings?
Brotman: It depends on the type of stacked device. If you’re doing memory on logic, we can cobble something together that will work. If we’re talking about logic on logic with different process technology, doing floor planning and partitioning will take more work.
Castagnetti: Even the decision process of how to partition and what to partition is not in place today. To make the best tradeoff from a power standpoint, what do you put in each node? That’s a tough nut to crack.
Low: First of all, we have to make sure we have the tools to put the design together. Second, we have to make sure we understand whether the design should be for 40nm or 28nm. What is the most cost- and performance-effective, while achieving time-to-market?