The Rise Of Parallelism

After decades of failing to live up to expectations, massively parallel systems are getting serious attention.


Parallel computing is an idea whose time has finally come, but not for the obvious reasons.

Parallelism is a computer science concept that is older than Moore’s Law. In fact, it first appeared in print in a 1958 IBM research memo, in which John Cocke, a mathematician, and Daniel Slotnick, a computer scientist, discussed parallelism in numerical calculations. That was followed eight years later by the first taxonomy, developed by Stanford computer science professor Michael J. Flynn (not to be confused with the former U.S. national security adviser). Flynn’s Taxonomy became the basis for classifying architectures along two axes, instruction streams and data streams, each of which can be single or multiple, yielding four quadrants: SISD, SIMD, MISD and MIMD.

Fig. 1: Flynn’s Taxonomy. Source: Lawrence Livermore National Laboratory
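The SISD/SIMD distinction in Flynn’s quadrants can be sketched in code. The following is an illustrative sketch only (the function names and data are examples, not from the article): the first function applies one instruction to one data element per step, while the second expresses the same multiply as a single operation over many elements, which is the shape of work that vector hardware executes across lanes.

```python
# Illustrative sketch of two Flynn quadrants (names and data are examples).

def scale_sisd(xs, factor):
    # SISD-style: one scalar multiply per iteration.
    # Single instruction stream, single data element at a time.
    out = []
    for x in xs:
        out.append(x * factor)
    return out

def scale_simd_style(xs, factor):
    # SIMD in spirit: the same multiply expressed once over all elements.
    # Real SIMD hardware would apply this one instruction across vector lanes.
    return [x * factor for x in xs]
```

Both produce identical results; the difference is whether the multiply is issued once per element or once across many elements, which is exactly the axis Flynn’s quadrants capture.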

Since then, parallelism has become widely used inside corporate data centers, primarily because of the initial success of databases built to partition computation across different processors. In the 1990s, entire business processes were layered on top of these underpinnings as ERP (enterprise resource planning) applications. Parallelism also found a ready market in image and video processing, and in scientific calculations.

But for the most part, it bypassed the rest of the computing world. As the mainframe evolved into the PC, and ultimately into the smartphone and tablet/phablet, parallelism made only limited gains. Multithreading was added to applications as core counts grew. Still, those extra cores saw only limited use, and not for lack of trying. Parallel programming languages came and went, extra cores sat idle, and compilers and programmers still approached problems serially rather than in parallel.
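The multithreading pattern described above can be sketched minimally. This is an assumption-laden example, not from the article: a hypothetical sum-of-squares workload is split into chunks and mapped across a thread pool.

```python
# Minimal sketch of spreading one computation across worker threads.
# The workload (sum of squares) and chunking scheme are illustrative.
from concurrent.futures import ThreadPoolExecutor

def sum_squares(chunk):
    # Each worker processes its slice of the data serially.
    return sum(x * x for x in chunk)

def parallel_sum_squares(data, workers=4):
    # Split the data into roughly equal chunks, one per worker.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_squares, chunks))

# Note: in CPython the global interpreter lock limits the speedup for
# CPU-bound work like this; the structure, not the timing, is the point.
```

The sketch also illustrates the article’s point: simply having a thread pool available does not make the speedup automatic, because the problem must first be decomposed into independent pieces.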

The tide is turning, though, for several reasons. First, performance gains due to Moore’s Law are slowing. It’s becoming much harder to turn up the clock frequency on smaller transistors because dynamic power density will burn up the chips. The only ways around that are advanced packaging, better utilization of more cores, or advanced microfluidic cooling systems, which would greatly add to the cost of developing and manufacturing chips.

Second, and perhaps more important, the “next big things” are new markets with heavy mathematical underpinnings—virtual/augmented reality, cloud computing, embedded vision, neural networks/artificial intelligence, and some IoT applications. Unlike personal computing applications, all of these are built using algorithms that can be partitioned across multiple cores or processing elements, and all of them require the kind of performance that used to be considered the realm of supercomputing.

Third, the infrastructure for making heterogeneous components work in unison is now mature enough to share processing across multiple compute elements. That includes on-chip and off-chip networks; real, virtual and proxy caching; and new memory types and configurations that can handle more data per millisecond.

Add up these factors and others, and for the first time in years the future for parallelism outside of its traditional markets is beginning to brighten. This is evident in the EDA market, where many of the performance gains in new versions of tools are based on highly parallel architectures—and in some cases, notably emulation and specialized simulation, massively parallel ones.

But for the rest of the computing world, the benefits of parallelism are just beginning to show up. And after years of predicting parallelism was just around the corner, proponents of this approach may finally be proven right.

Related Stories
Heterogeneous Multi-Core Headaches
Using different processors in a system makes sense for power and performance, but it’s making cache coherency much more difficult.
The Limits Of Parallelism
Tools and methodologies have improved, but problems persist.
Tuning Heterogeneous SoCs
Just adding more cores doesn’t guarantee better performance or lower power.
Time For Massively Parallel Testing
Increasing demand for system-level testing brings changes.


Brian Bailey says:

Flynn’s taxonomy badly needs redefining. It was based on the assumption of a Von Neumann architecture. At the very least, instruction has to be replaced by operation and it also needs to include notions of granularity. We have generally separated those into distributed and parallel, but the distinction is fuzzy at best.
