Low Power-High Performance

One-On-One: Dark Possibilities

Part 3: Applying advanced power-saving techniques to established process nodes, and rethinking some of those techniques to improve cost and power efficiency.

April 29th, 2015 - By: Katherine Derbyshire

Professor Michael Taylor’s research group at UC San Diego is studying ways to exploit dark silicon to optimize circuit designs for energy efficiency. He spoke with Semiconductor Engineering about the post-Dennard scaling regime, energy efficiency from integrated circuits all the way up to data centers, and how the manufacturing side can help. What follows are excerpts of that conversation. (Part one can be found here, part two here.)

SE: If you’re Intel or Samsung or TSMC, you can invest in the latest and greatest manufacturing technologies and make everything extremely tiny and get the full benefits of scaling, whatever those benefits happen to be. But smaller companies don’t have that option, or they’re working through a foundry like TSMC, which puts another layer between them and the manufacturing side. Are those companies stuck, or do design tools for exploiting dark silicon give them a way to keep improving their products even if they can’t improve their manufacturing?

Taylor: There are a lot of interesting questions about how to create new innovation in the hardware space, especially with silicon. As researchers at a U.S. institution, we’re always looking at the crystal ball and trying to figure out what’s going to happen at 7nm and how architecture is going to change and so forth. But we should be paying more attention to the older process generations, too. One really interesting example of this is bitcoin mining chips. The folks who created the first bitcoin mining chips didn’t target the latest process generation. They basically picked the oldest generation that would allow them to beat the next best thing, which was an FPGA or a GPU. In this case it was either 110nm or 130nm. That way they could pay very low NREs, get their product out on the market, and test the viability of the idea. Then, when they saw that this was a good idea and that there was a market, they rapidly ramped up, and now there are bitcoin miners being manufactured in 20nm.

There is this interesting ecosystem where, if you combine highly specialized chips with old process generations, that’s actually a really wonderful combination. It allows you to get a benefit over the chips that are out there today, but you can do it with lower startup costs, with a venture capitalist or an angel network. You basically can establish that there is a market. There is this idea in software of the lean startup. If you are in hardware and thinking about 14nm, there’s nothing lean about that. It would take many years before your idea would even be tested in the marketplace. The idea of targeting the older processes could allow us to be more lean about trying out ideas in the hardware space, energize it, and get students excited about doing startups, instead of thinking hardware should just be designed by four big companies. On the device side, the question is whether it is possible to push some of the innovation that we’ve had in the later process generations back into the older process generations to make them more competitive, and yet not encounter all the manufacturing issues that we’ve had as we try to scale things down.
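The NRE argument above can be made concrete with a small break-even calculation. This is only a sketch: the linear cost model (total cost = NRE + unit cost × volume) and every dollar figure below are hypothetical placeholders, not numbers from the interview.

```python
def break_even_volume(nre_old, unit_old, nre_new, unit_new):
    """Volume above which the newer node has the lower total cost.

    Assumed model: total_cost = NRE + unit_cost * volume.
    Returns None if the newer node never catches up (its unit cost
    is not lower), and 0.0 if it is cheaper from the first unit.
    """
    if unit_new >= unit_old:
        return None
    return max(0.0, (nre_new - nre_old) / (unit_old - unit_new))


# Hypothetical figures, for illustration only.
old_node = {"nre": 500_000, "unit": 8.0}      # older node: low NRE, higher unit cost
new_node = {"nre": 20_000_000, "unit": 3.0}   # leading edge: high NRE, lower unit cost

volume = break_even_volume(old_node["nre"], old_node["unit"],
                           new_node["nre"], new_node["unit"])
print(f"Newer node pays off above {volume:,.0f} units")
```

Under these made-up inputs, the older node is the cheaper way to test a market until volume reaches the millions of units, which is the lean-startup point being made.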

SE: And there have been some efforts on that with automation, for instance, because automation is feature-size independent. If you can move wafers around your fab faster, then you’re going to get benefits no matter what you’re actually putting on the wafer. With some of the process technologies it’s difficult because the older generations are very cost sensitive and so, if the equipment was built for the leading edge, then retrofitting it back to older generation equipment is hard. But building new equipment to serve the older generation is hard for the equipment makers to justify.

Taylor: There is this whole amortization model, too. If you are competing against everyone else and they already amortized their equipment and you are trying to use new equipment, then it’s going to make it difficult.

SE: Right, although, there is a large used-equipment market where, when Intel upgrades a fab, the equipment that was in that fab is then sold into the secondary market where it becomes a trailing edge fab.

Taylor: In my group we’ve been experimenting with this idea of older process generations and using today’s CAD tools to design on these old process generations. The quality of the results is just amazing compared to what the tools were doing 10 years ago. So even though you’re building in an old process generation, if you look at frequency and energy efficiency, you’re able to reach the levels designers got back then, but with much, much less effort.

SE: So if you were to use a tool that’s capable of making 20nm features to make 80nm features, you would get sort of for free all of the advances in reducing particles, reducing contamination, reducing variability, right?

Taylor: Yes. And if you look at the dielectrics, what if you had some of these new dielectrics with some of the older processes? You would give them some new life in terms of energy.

SE: There you run into the equipment problem because the low-k dielectrics, the low-k inter-metal dielectrics, use a different equipment set than plain old boring silicon dioxide. You would need to again bring that equipment back to the older generation, which is something that’s not necessarily cost justified.

Taylor: I have some feedback from your previous article, where you were talking about less-volatile RAM technology. Another example would be ARM’s big.LITTLE. There are two clusters of four cores on the chip. Four of the cores are quite energy efficient but not as high performance, and the other four are higher performing. So the four high-performance ones are ARM A15s, and the lower-performance ones might be ARM A8s or something. When the user clicks on something on their phone and it needs to blast some HTML to the screen, they might run on the A15 and do it very quickly. But the phone gets too hot, the chips get too hot, and then they will move the computation over to the lower-powered cores. They move back and forth, so transferring the state back and forth and powering things down and back up is an overhead, as well. The less-volatile RAM potentially could be used in that case.
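The back-and-forth migration described here can be sketched as a simple hysteresis policy. This is an illustrative toy, not ARM’s or any operating system’s actual scheduler: the `ClusterPolicy` class, the temperature thresholds, and the trace are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterPolicy:
    hot_c: float = 80.0    # hypothetical: leave the big cluster at or above this temperature
    cool_c: float = 65.0   # hypothetical: allow a return to the big cluster at or below this
    cluster: str = "big"
    migrations: int = 0    # each migration implies a state transfer plus power down/up

    def step(self, temp_c: float, demand_high: bool) -> str:
        """Pick the cluster for the next interval and count migrations."""
        target = self.cluster
        if self.cluster == "big" and (temp_c >= self.hot_c or not demand_high):
            target = "little"
        elif self.cluster == "little" and demand_high and temp_c <= self.cool_c:
            target = "big"
        if target != self.cluster:
            self.migrations += 1
            self.cluster = target
        return self.cluster


def run(trace):
    """Replay (temperature, demand_high) samples through a fresh policy."""
    policy = ClusterPolicy()
    return [policy.step(t, d) for t, d in trace], policy.migrations


# Burst of work, chip heats up, cools down, then demand drops.
trace = [(60, True), (82, True), (70, True), (64, True), (64, False)]
clusters, migrations = run(trace)
print(clusters, migrations)
```

The gap between the two thresholds keeps the policy from flapping between clusters; every counted migration is where the state-transfer overhead mentioned above would be paid.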

The alternative to using less-volatile RAM would be to use retention flops. The idea for retention flops is that when you power gate the chip, when you power it down, you actually put the flip-flops, the state elements, on a different power grid. Then they are still holding their state while the rest of the circuit goes to sleep. That would be the nearest competing existing technology to the less-volatile RAM. But the less-volatile RAM is very interesting, because those flops are going to leak. And there is going to be overhead added, because you need a different power grid in there. The less-volatile stuff is clearly useful and possibly better.
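One way to frame this tradeoff is a break-even sleep time. The sketch below rests on two simplifying assumptions that are not from the interview: retention flops cost nothing to enter or exit sleep but leak at a constant power, while a less-volatile memory pays a fixed save/restore energy and leaks nothing while asleep. The function names and all numbers are hypothetical.

```python
def break_even_sleep_s(save_restore_uj: float, retention_leak_uw: float) -> float:
    """Sleep duration (seconds) at which both options cost the same energy.

    microjoules / microwatts = seconds.
    """
    if retention_leak_uw <= 0:
        return float("inf")  # no leakage: retention flops never lose
    return save_restore_uj / retention_leak_uw


def cheaper_option(sleep_s: float, save_restore_uj: float, retention_leak_uw: float) -> str:
    """Compare the energy spent over one sleep interval under the assumed model."""
    retention_energy_uj = retention_leak_uw * sleep_s
    if retention_energy_uj < save_restore_uj:
        return "retention flops"
    return "less-volatile memory"


# Hypothetical figures: 2 uJ to save and restore state, 1 uW of retention leakage.
print(break_even_sleep_s(2.0, 1.0))
print(cheaper_option(0.5, 2.0, 1.0))
print(cheaper_option(10.0, 2.0, 1.0))
```

Under this model, short naps favor retention flops and long sleeps favor the less-volatile memory, so the answer depends on how long the block stays powered down.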

SE: Or for different applications. It would also depend on how much retention time you need, right?

Taylor: Exactly. My first foray into the device level was when I was in grad school. We built a chip in an IBM 180nm process technology. We were comparing our chip to Intel’s technology, and the big question came down to how IBM’s A6 technology compared to Intel’s production microprocessor technology. I read hundreds of IEDM papers to try to understand the differences, and that experience down the road gave me the skills to understand the Dennard scaling issue and dark silicon problems, to write papers about that, and to bring that set of problems and ideas to my research.

SE: It’s a shame that the design side and the manufacturing side are so complex individually because it’s really challenging having enough knowledge to bridge the two.

Taylor: When I do my first lecture in my classes I describe those abstraction layers and I show a slide that has them all stacked up. Although this is how we have organized the way we deal with this engineering problem, it has essentially organized the social network, too. So I will drink beer with the people above me and below me in the stack, but probably not the people three levels below me. It shows up in the conferences we attend and in the concentration of people working on certain parts of the stack.

SE: And in each stack you have little subdivided silos, too. Lithographers don’t talk to etch people, who don’t talk to transistor implant people very much, right?

Taylor: Yes.

Katherine Derbyshire
Katherine Derbyshire is a technical editor at Semiconductor Engineering.
