Accelerators Everywhere. Now What?

An explosion in data will require a massive amount of hardware and software engineering.


It’s a good time to be a data scientist, but it’s about to become much more challenging for software and hardware engineers.

Understanding the different types of data, and how that data flows, is the next step forward in system design. As the number of sources of data rises, creating exponential spikes in the volume of data, entirely new approaches to computing will be required. The problem is understanding what data gets processed where, and devising ways to move that data only when necessary and preferably over as short a distance as possible.

For chipmakers and systems companies, this isn’t a question of how to process that data. There are plenty of processors and accelerators available today, with new ones under development by several dozen new companies. Alongside those are new memories, which can be used as an extra level of cache in addition to existing memory and storage types.

The real issue is how to keep those accelerators and processors primed and fully employed. Idle time costs power and area, and rapid startup and shutdown cause premature aging of digital circuitry, particularly at advanced nodes. Turning devices on and off seemed like a good idea at 90 and 65nm, when scaling made multiple cores a necessity. But that rush of current is hard on digital circuitry, particularly at 16/14/10/7nm, where the dielectrics are getting thinner.

It also doesn’t help that leakage current is increasing again with 7nm finFETs. Static leakage became too significant to ignore at 45nm in planar processes without high-k/metal gate, and that only got worse at 28nm. This was particularly bad for anything with a battery, because leakage basically meant that “off” was a relative term. Batteries would continue to discharge even if a device wasn’t being used. Since then, the foundries have done much to alleviate this issue at the planar nodes, adding a variety of techniques and new materials, such as FD-SOI. And leading-edge chipmakers got around that problem with the introduction of finFETs. But finFETs are starting to leak, which is why gate-all-around FETs (horizontal and vertical nanowires and nanosheets) are on the roadmap at 5/3nm.

Put simply, the best option is to keep circuitry in constant use. From an architectural standpoint that’s a challenge, because no one is quite sure how much data will be moving through these systems, where various processors will be used, or whether that data can be cleaned up so that less processing is required.

Overbuilding is an expensive proposition on many fronts, as large telecommunications companies discovered. They installed enormous amounts of optical fiber in the late 1990s in preparation for the dot-com explosion. Things didn’t work out as planned. The economy tanked, leaving most of that fiber dark for years to come.

So what comes next remains a mystery at this point. It’s clear something will need to change, because there are too many sensors generating different types of data in real time to do things the old way. That presents challenges for chipmakers, because general-purpose processors are inefficient for this kind of workload. And while specialized accelerators can help, that approach works best if the type, amount and flow of data are well understood.

This is where software engineers come into the picture. They will need to create what amounts to intelligent middleware to load-balance and prioritize all of these interactions. This isn’t like writing application software based on an API. It requires a deep understanding of how different types of data move at the bare-metal level, as well as how to manage the data flow through new or existing hooks in the OS or RTOS. In short, it’s a job that requires deep knowledge of both hardware and software, because today (and possibly for some time to come) compilers don’t exist for this kind of work.
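To make the idea of load-balancing and prioritizing a little more concrete, here is a toy sketch in Python. It is only an illustration under simplifying assumptions, not a description of any real middleware: the names `Accelerator`, `Task` and `dispatch`, the notion of a data "kind," and the fixed per-task cost are all hypothetical. Each task goes, in priority order, to the compatible unit that frees up soonest, which is one simple way to keep every unit busy.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Accelerator:
    """Hypothetical processing unit that handles certain kinds of data."""
    name: str
    kinds: set                 # data types this unit can process
    busy_until: float = 0.0    # time at which the unit becomes free


@dataclass(order=True)
class Task:
    """Hypothetical unit of work; ordered by (priority, seq) only."""
    priority: int                        # lower number = more urgent
    seq: int                             # arrival order, breaks ties
    kind: str = field(compare=False)     # type of data to process
    cost: float = field(compare=False)   # assumed processing time


def dispatch(tasks, accelerators):
    """Assign each task, most urgent first, to the compatible unit
    that becomes free soonest. Returns (seq, unit name, start time)."""
    queue = list(tasks)
    heapq.heapify(queue)
    schedule = []
    while queue:
        task = heapq.heappop(queue)
        candidates = [a for a in accelerators if task.kind in a.kinds]
        if not candidates:
            # No unit understands this data type; leave it unscheduled.
            schedule.append((task.seq, None, None))
            continue
        unit = min(candidates, key=lambda a: a.busy_until)
        start = unit.busy_until
        unit.busy_until = start + task.cost
        schedule.append((task.seq, unit.name, start))
    return schedule


units = [Accelerator("gpu", {"image"}), Accelerator("dsp", {"audio", "image"})]
work = [
    Task(1, 0, "image", 4.0),
    Task(0, 1, "audio", 2.0),
    Task(1, 2, "image", 1.0),
]
plan = dispatch(work, units)
```

Real middleware would also have to account for the cost of moving data between units, which this sketch ignores entirely.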

At any inflection point, there are always unknowns. But in this case, there are unknowns on multiple fronts, and at least some of them will need to be solved together. That makes the job significantly more interesting, even if it is more challenging.
