Is Computing Facing An Energy Crisis?

Efficiency gains can be boosted by combining CPUs, NPUs, GPUs and networking processors in novel ways.


Is the end near?

If the topic is energy efficiency gains in computing, the answer depends on whom you ask.

The steady increase in performance per watt over the decades has been one of the most important drivers in our industry. Last year I was thumbing through a neighbor’s 1967 Motorola IC catalog that featured such space age wonders as a small control chip of the sort that went into the Apollo moon mission. While cutting edge then, if you tried to build a smartphone with it today, the phone would consume about 16MW of power and take up 12 football fields. You’d think twice before signing up for a cell plan.

Skeptics believe we are headed for choppier waters. Moore's Law is delivering diminishing returns. Meanwhile, the techniques that have kept data center power consumption flat for the past 15 years (virtualization, ambient cooling, workload consolidation, unplugging "zombie" servers) have already been exploited fairly extensively. Many cutting-edge data centers already tout Power Usage Effectiveness (PUE) ratings close to 1, meaning that almost all of the energy goes to running IT equipment. Further improvements will require innovation in core computing architecture.
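To make the PUE metric concrete, here is a minimal sketch. The formula (total facility power divided by IT equipment power) is standard; the facility figures below are hypothetical:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power.

    A value of 1.0 means every watt entering the building goes to
    running IT equipment, with nothing left over for cooling,
    lighting or power conversion losses.
    """
    return total_facility_kw / it_equipment_kw

# Hypothetical figures: a 1,200 kW facility whose servers draw 1,000 kW.
print(pue(1200, 1000))  # 1.2 -- i.e. 200 kW of overhead
```

A PUE near 1 is why further savings must come from the computing itself: once overhead is nearly zero, the only watts left to save are the ones doing the work.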

Worse, AI will turn up the heat. We’re graduating from basic AI problems (finding cat videos!) to more energy-intensive tasks like autonomous driving or medical diagnostics. Applied Materials warns that, absent advances in materials, chip designs and algorithms, data center power could rise from 2% of worldwide electricity consumption to 10% or even 15%.

On the other hand, the optimists have a compelling argument: we’ve heard it before. In 1999 some predicted the Internet might consume half of the grid in ten years. That scary future was avoided through leapfrog innovations like FinFETs, but also through steady improvements in overall system design and mapping algorithms to hardware. Good engineering, they argue, still has quite a bit of headroom.

Plus, you need to look at the big picture. Worldwide emissions dropped by 2.4 billion tons, or 7%, in 2020 as videoconferencing replaced commuting and business trips. While travel will likely rebound, a good portion of meetings will stay on Zoom. Similarly, smart devices and AI are being deployed to help curb the estimated 30% of power that gets wasted in buildings. Electronics, one can argue, deliver a net benefit to the environment.

Nonetheless, many optimists are also reluctant to look beyond a two- to three-year horizon. So who's right? Both sides raise good points, and the debate has certainly added a jolt to conference panels. But personally, I'm a cautious optimist. While Moore's Law may be past its prime, the semiconductor industry has already launched into a design-centric era where gains will be realized mainly through innovations in SoC and core architectures instead of process shrinks. Large integrated caches and GPU accelerators were arguably the first step in this era. 3D NAND was another major milestone: transistor stacking changed the design and economic equations for flash memory companies.

At Arm, we’ve focused in particular on exploring the synergies that can be achieved by combining CPUs, NPUs, GPUs and networking processors or DPUs in novel ways. Combining CPUs and NPUs, for instance, has been shown to boost efficiency by 25x while increasing performance on tasks like inference by 50x over CPU-only solutions. For IoT devices, that means the ability to produce more precise, more interesting insights on a fixed energy budget that won’t tax batteries. You’ll see a similar philosophy with the Total Compute strategy coming to handhelds.
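It is worth noting what those two multipliers imply when taken together. A rough back-of-envelope (the 50x and 25x figures come from the paragraph above; the derivation is my own simplification, not a measured system):

```python
# If adding an NPU makes inference 50x faster and 25x more
# energy-efficient (inferences per joule), what happens to power draw?
speedup = 50          # throughput gain on inference (from the text)
efficiency_gain = 25  # energy-efficiency gain (from the text)

# Energy per inference scales as 1/efficiency_gain; time per inference
# scales as 1/speedup. Since power = energy / time, the relative power
# of the combined system is speedup / efficiency_gain.
relative_power = speedup / efficiency_gain
print(f"CPU+NPU draws ~{relative_power:.0f}x the power but finishes "
      f"each inference {speedup}x sooner")
```

In other words, the combined system spends modestly more power for a much shorter time, which is exactly the race-to-idle behavior that stretches a fixed battery budget.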

In data centers, AWS says its Arm-based, 64-core Graviton2 processor, with one thread per core, provides more than 3x the performance per watt of more traditional multithreaded processors with fewer cores. Similarly, AWS says that over 70% of the instances available on EC2 take advantage of its Nitro system for offloading tasks like virtualization, security and networking to dedicated hardware and Arm-based silicon.

One of the next big milestones for us all will be the commercialization of chiplets. Chiplet designs allow companies to maximize yields and mix process nodes for optimal effect. They will also have a positive impact on the power-performance equation. Imagine a 4 x 4 array of chiplets, each with 640 CPUs, 640 NPUs, and gigabytes of SLC, all linked by a high-speed interconnect. Such a system could deliver petaflops of performance on around 1.4kW of power.
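The arithmetic behind that scenario is worth spelling out. The chiplet counts and the 1.4 kW figure come from the paragraph above; treating "petaflops" as roughly 1 PFLOP of aggregate throughput is my assumption:

```python
# Back-of-envelope for the hypothetical 4 x 4 chiplet array above.
chiplets = 4 * 4
cpus_per_chiplet = 640
npus_per_chiplet = 640
total_power_w = 1400     # ~1.4 kW for the whole package (from the text)
peak_flops = 1e15        # ~1 petaflop assumed aggregate throughput

total_cores = chiplets * (cpus_per_chiplet + npus_per_chiplet)
flops_per_watt = peak_flops / total_power_w

print(f"{chiplets} chiplets, {total_cores} compute units")
print(f"~{flops_per_watt / 1e9:.0f} GFLOPS per watt")
```

Even under these loose assumptions, that works out to hundreds of GFLOPS per watt, which is why the packaging shift matters for efficiency and not just for yield.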

And what do we do when we tap out the gains there? Dig deeper with chip-level technologies like in-memory computing: by some estimates, over 60% of total system energy is spent moving data between main memory and compute. We’ve only scratched the surface of what is possible at the device and circuit level.
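A toy energy model shows why data movement can dominate. The per-operation energies below are rough, order-of-magnitude assumptions of my own (a DRAM access is commonly cited as costing around 100x an arithmetic operation), not measurements:

```python
# Toy model: energy to stream N words from DRAM vs. the math done on them.
DRAM_PJ_PER_WORD = 640.0   # assumed pJ to fetch one 32-bit word from DRAM
FLOP_PJ = 5.0              # assumed pJ per 32-bit floating-point operation

def movement_fraction(n_words: int, flops_per_word: float) -> float:
    """Fraction of total energy spent moving data rather than computing."""
    move = n_words * DRAM_PJ_PER_WORD
    compute = n_words * flops_per_word * FLOP_PJ
    return move / (move + compute)

# One operation per word fetched (e.g. a vector sum): movement dominates.
print(f"{movement_fraction(1_000_000, 1) * 100:.0f}% "
      "of energy goes to data movement")
```

Under these assumptions, a memory-bound workload spends nearly all of its energy on movement, which is the opportunity in-memory computing targets: do the arithmetic where the data already lives.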

Granted, these advances will take some very hard work, but I’m confident they can occur before we hit a power wall.

Where do you believe the future will go? Feedback, comments, and ideas are quite welcome.



1 comment

Santosh Kurinec says:

Very well articulated, Rob. At the current benchmark energy per bit of 1E-14 J/bit, computing will not be sustainable by 2040; that is when the energy required for computing is estimated to exceed the world’s estimated energy production. Significant improvement in the energy efficiency of computing is needed, through the development of in-memory computing and neuromorphic computing. The amount of raw generated data is growing at an exponential rate due to the greatly increasing number of sensors in electronic systems. While the majority of this data is never used, it is often kept for cases such as failure analysis. As such, archival memory storage, where data can be stored at extremely high density at the cost of read latency, is becoming more popular than ever for long-term storage. Here, DNA memory offers potential.
