Blazing-Fast Performance

A look inside the world’s fastest computer, and why it now matters to many more people.


When it comes to raw performance, there’s nothing like a supercomputer. Until recently, though, the contest was mostly about bragging rights over whose supercomputer was fastest. A quadrillion calculations per second (a petaflop), more or less, doesn’t mean much outside of scientific circles.

What’s changing is that companies and governments now can apply these blazing-fast machines across a much wider swath of applications, from AI to cloud computing. And they will be connected to more devices in more places, including commercial operations.

Consider the new Summit supercomputer at the U.S. Department of Energy’s Oak Ridge National Laboratory, which at the moment holds the world’s speed record. Within its liquid-cooled racks are 4,608 separate servers, each of which contains two 22-core IBM Power9 processors and six Nvidia GPU accelerators. Those servers collectively have 10 petabytes of memory, with high-bandwidth data pathways and dual-rail 100Gbps InfiniBand connections between servers (InfiniBand is expected to increase to 250Gbps within the next five years). In short, it’s a monster of a system, capable of 200 quadrillion calculations per second.
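A back-of-the-envelope check shows how those node counts add up to 200 quadrillion calculations per second. The per-GPU throughput below is an assumption on my part (the article names only "Nvidia GPU accelerators"; the figure used is the approximate peak double-precision rate of an Nvidia V100):

```python
# Rough sanity check of Summit's aggregate peak throughput.
# TFLOPS_PER_GPU is an assumed value (~peak FP64 of an Nvidia V100),
# not a figure taken from the article.

NODES = 4608            # servers in Summit's racks
GPUS_PER_NODE = 6       # GPU accelerators per server
TFLOPS_PER_GPU = 7.8    # assumed peak double-precision TFLOPS per GPU

peak_tflops = NODES * GPUS_PER_NODE * TFLOPS_PER_GPU
peak_petaflops = peak_tflops / 1000

print(f"GPU peak: ~{peak_petaflops:.0f} petaflops")
```

Under these assumptions the GPUs alone land in the neighborhood of 200 petaflops, with the Power9 CPUs contributing the remainder of the system’s peak.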

Fig. 1: Connecting Summit’s servers. Source: IBM

Oak Ridge announced that it plans to use the machine for cancer surveillance, for developing new materials by simulating hundreds rather than tens of atoms, and for simulating supernova scenarios in astrophysics that last several thousand times longer than what could be done in the past. It also will use the machine for AI, to identify patterns in human proteins and cellular systems.

What’s changed here is that supercomputers are no longer relegated to academic research. They are now becoming an integral part of the cloud architecture, which most likely will be supplemented by quantum computing over the next decade. And while many of these cloud operations will be owned by large companies such as Amazon or Google, increasingly they may be connected to government supercomputers around the world. That, in turn, can have significant implications for how edge devices are designed and function.

Data processing always has been about reducing bottlenecks for whatever price the market will bear. At 10/7nm and beyond, the big problem is moving signals across a die because the signals are subject to RC delay in increasingly skinny wires. This has spawned interest in a slew of advanced packaging options, as well as a trend toward dis-integration, whereby different processing elements and memories are scattered around a chip or on multiple chips rather than trying to cram everything into a single processing element.
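The scaling problem behind that RC delay can be sketched with a toy model: wire delay grows roughly with the product of resistance and capacitance, and resistance rises sharply as the wire’s cross-section shrinks. The numbers below are purely illustrative, not process data:

```python
# Toy model of why skinny wires slow signals at 10/7nm and beyond.
# R = rho * L / (W * H); wire thickness H is assumed to track width W,
# so halving the width roughly quadruples resistance. All units arbitrary.

def wire_rc_delay(width_nm: float, length_um: float = 100.0) -> float:
    """Relative RC delay of a wire of the given width and length."""
    rho = 1.0                                     # illustrative resistivity
    r = rho * length_um / (width_nm * width_nm)   # resistance, H ~ W
    c = 1.0 * length_um                           # capacitance ~ length
    return r * c

# Halving wire width roughly quadruples RC delay in this model.
ratio = wire_rc_delay(width_nm=10) / wire_rc_delay(width_nm=20)
print(f"delay ratio, 10nm vs. 20nm wire: {ratio:.1f}x")
```

That quadratic penalty is why shortening the wires, via advanced packaging or by scattering compute and memory closer together, beats simply shrinking them.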

Whether various components such as accelerators, analog IP blocks and memories continue to drift further apart, or whether they ultimately come together in a 3D stacked die isn’t clear yet. Much of that will depend upon the ability to control heat using new transistor structures, such as gate-all-around FETs, which can reduce leakage, and the cost of developing and manufacturing those new devices.

But these issues do provide an interesting opening for these giant supercomputers to play a much larger role for a variety of devices, particularly if they can be connected using extremely high-bandwidth connections such as millimeter-wave 5G. That won’t work for autonomous cars moving down the highway at 100kph or more, but it certainly could work on a corporate campus where a delay of 100 milliseconds is still faster than what most workers experience today. And that could impact what kind of processing is done locally, and how much, versus how much is done remotely.
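The arithmetic behind that distinction is simple: what matters is how far a vehicle travels while waiting on a remote response. A minimal sketch (the function name and parameters are my own, chosen for illustration):

```python
# How far a vehicle moves during a round-trip network delay --
# the reason 100 ms is tolerable on a campus but not for a car
# making decisions at highway speed.

def distance_during_latency(speed_kph: float, latency_ms: float) -> float:
    """Meters traveled while waiting out the given latency."""
    speed_mps = speed_kph * 1000 / 3600
    return speed_mps * latency_ms / 1000

d = distance_during_latency(speed_kph=100, latency_ms=100)
print(f"A car at 100 km/h covers ~{d:.1f} m in 100 ms")
```

At 100 km/h a car covers nearly three meters in 100 milliseconds, which is why safety-critical decisions must stay local even if bulk processing moves to a remote supercomputer.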

Supercomputers have suddenly become very relevant to many more types of processing and compute architectures, and that relevance will continue to grow as the CapEx investment required for cloud-based computing rises beyond the means of many companies.

Fig. 2: Oak Ridge’s Summit. Source: U.S. Department of Energy
