One-On-One: Dark Servers

Part 2: UC San Diego's Michael Taylor talks about distributed approaches for dark silicon.


Professor Michael Taylor’s research group at UC San Diego is studying ways to exploit dark silicon to optimize circuit designs for energy efficiency. He spoke with Semiconductor Engineering about the post-Dennard scaling regime, energy efficiency from integrated circuits all the way up to data centers, and how the manufacturing side can help. What follows are excerpts of that conversation. To view part one of this interview, click here.

SE: Energy efficiency is an issue all the way from the individual integrated circuits in a mobile phone out to the giant data center, which has much more energy coming in, but also much more need for it, and so has to worry about cooling and so forth. Can these same principles be applied at higher levels of abstraction? So is it possible to have dark servers in a data center, just as you might have dark transistors on a chip?

Taylor: Wow, that’s an interesting idea. So you are saying basically, maybe you have different servers that are customized for different kinds of computation?

SE: Right.

Taylor: Yeah, I think this is a very fascinating direction. One example of this, I guess you could say (it's not quite the same, in that the darkness isn't concentrated within one data center but is distributed across the world), is bitcoin mining. We have entire data centers now that just have specialized chips that do bitcoin mining. The energy efficiency of those specialized chips is probably 200 times greater than that of the general-purpose chip you might otherwise use.

SE: Right, because since you’re attempting to coin money, all of the costs that go into that reduce your profit margin.

Taylor: That’s right. If you model it, it’s like manufacturing. You have your NRE cost of buying the servers, but then you have the marginal cost of actually doing the computation to create the coins, and that cost is primarily energy but, I guess, also data center costs. So the challenge is that if your chips are not energy-efficient enough to keep up with the other chips out there that are mining, then you will actually lose money with every coin that you mint.

SE: Right.

Taylor: So the push is on performance, but also energy efficiency. It’s the energy efficiency that determines when you shut that thing off because it is no longer profitable. So in essence, you were talking about the dark silicon kind of idea as applied to the data center level. Here, well, it is not like there’s one data center and they have their bitcoin mining section and their general-purpose section and they shut them off based on what they are doing. They are always mining the bitcoins. But on the other hand it is leveraging this kind of specialization. If you have a workload that you know you are going to do reliably every day, then at some point, if that workload is large enough, you might just go and build custom chips to do that and save yourself energy. It could be Google is scanning through all of the images in the world on the Internet and looking for faces. If they have enough of a need for that, at some point they would actually save money by building custom chips for it and maybe having a data center that just does face recognition.
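Taylor's break-even logic can be sketched numerically. The following is a toy model of the economics he describes: a miner stays on only while the marginal (energy) cost of minting a coin is below the coin's market price. Every figure here (hash energies, difficulty, prices) is a hypothetical placeholder, not real market or hardware data; only the ~200x efficiency ratio comes from the interview.

```python
KWH_PER_JOULE = 1.0 / 3.6e6  # 1 kWh = 3.6 million joules

def marginal_cost_per_coin(joules_per_hash, hashes_per_coin, price_per_kwh):
    """Dollar cost of the energy needed to mint one coin."""
    energy_joules = joules_per_hash * hashes_per_coin
    return energy_joules * KWH_PER_JOULE * price_per_kwh

def still_profitable(joules_per_hash, hashes_per_coin, price_per_kwh, coin_price):
    """True while minting a coin earns more than the energy it burns."""
    cost = marginal_cost_per_coin(joules_per_hash, hashes_per_coin, price_per_kwh)
    return cost < coin_price

# A general-purpose chip vs. an ASIC ~200x more energy-efficient per hash.
GP_JOULES_PER_HASH = 2e-6                        # hypothetical
ASIC_JOULES_PER_HASH = GP_JOULES_PER_HASH / 200

HASHES_PER_COIN = 1e17   # hypothetical network difficulty
PRICE_PER_KWH = 0.10     # dollars
COIN_PRICE = 500.0       # dollars

print(still_profitable(GP_JOULES_PER_HASH, HASHES_PER_COIN, PRICE_PER_KWH, COIN_PRICE))    # -> False
print(still_profitable(ASIC_JOULES_PER_HASH, HASHES_PER_COIN, PRICE_PER_KWH, COIN_PRICE))  # -> True
```

With these placeholder numbers the general-purpose chip spends about $5,556 of electricity per coin and loses money on every coin it mints, while the ASIC spends under $30, which is exactly the shutdown threshold Taylor describes.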

SE: Which again gets to the tradeoff between the overhead of doing that and whatever the savings happen to be. It’s impossible to make general statements about that kind of tradeoff at either the data-center level or back at the integrated-circuit level because it’s dependent on the workload of this particular device, how consistent is it, how variable is it, whether it makes sense to use highly customized hardware or not.

Taylor: Yes. If you look at the field of computer architecture, most of the optimizations that they do are actually figuring out ways to exploit predictability in the computation. So whether it is caches or branch predictors, they’re observing that, well, most computations have this property, and so we will optimize so that if the computation has that property we will save lots of energy and be very high-performance. If it does not, we may actually burn more energy, but on balance we come out ahead. So there’s this kind of idea that if a computation is predictable in some way, then we should be able to do it more efficiently than something that’s very unpredictable. If you think about different workloads and how predictable they are—one example would be of course the bitcoin mining workload, which is extremely predictable. If you’re Intel and you’re selling your server to everybody, then it’s probably somewhat unpredictable. Mobile phones might actually be pretty predictable. Although there are a million apps in the App Store, there are only a couple of apps that billions of people use.

SE: Such as e-mail?

Taylor: Like their Web browser and Facebook.

SE: This was the GreenDroid proposal from your group, right?

Taylor: Yes, exactly.

SE: It said that a small fraction of operations accounts for a large fraction of the total workload, so you can optimize around them, right?

Taylor: Yes, and that’s an interesting mentality for exploiting specialization—understanding your workload, understanding what is predictable and then thinking about how you can leverage that predictability in the form of specialization.
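The GreenDroid-style argument above can be put in an Amdahl's-law form: if a fraction f of a workload's execution can move to specialized logic that is s times more energy-efficient, total energy shrinks by a factor of 1 / ((1 - f) + f / s). A minimal sketch, with illustrative numbers only (neither figure is GreenDroid's measured data):

```python
def overall_energy_reduction(f, s):
    """Factor by which total energy drops when fraction f of execution
    runs on logic that is s times more energy-efficient (Amdahl-style)."""
    return 1.0 / ((1.0 - f) + f / s)

# e.g. covering 90% of execution with specialized cores that are
# 10x more energy-efficient (both numbers are hypothetical)
print(round(overall_energy_reduction(0.90, 10.0), 2))  # -> 5.26
```

Note the Amdahl-like ceiling: even with infinitely efficient specialized logic, the unspecialized 10% caps the overall reduction at 10x, which is why coverage of the common-case apps matters as much as the per-app efficiency gain.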

  • sftechie

    As a Data Centre expert and ISO/IEC standards developer, I would note that customizing silicon for dedicated functions is efficient only in specialized cases, not in general situations. The specialized cases are ONLY those where the workload, and the resulting customization, is well defined and does not change significantly. These cases have been exploited in the past using FPGAs: the programmability allows a certain level of change without changing the base commercial backbone of the computing structure. Once the workload expands (say, in the bitcoin-mining case, when encryption, transactions, and secured interactions with other monetary systems become required), the dedicated design may be ill-equipped to support the new activity efficiently, spending too much time emulating the compute resources the dedicated silicon doesn’t have. Such an evolution would then require a different set of silicon, and the time to reconstruct and replace the compute infrastructure precludes the transition. In most cases, virtualization and a more comprehensive underlying compute structure are more practical and energy-efficient. With smaller compute demands (in breadth and load), one can use a less capable base compute structure and customize/streamline the front end. Again, in this case, further optimization would use FPGAs for dedicated workloads while still allowing a certain amount of expansion of the workload within the constraints of the underlying architecture. Currently the provisioning and configuration of these compute resources are relatively static. I believe the industry is working to build up the base compute structure, improve its flexibility, and further the possibility of dynamic compute provisioning.

    There have been attempts to simplify and create many smaller compute nodes, but upon further review, and given the number of new workloads entering these systems, many IT managers have elected to postpone or shelve these ideas.