The Real Value Of Digital Horsepower

Raw performance in electronics is getting harder to prove. That’s a problem.


Chipmakers and systems vendors are beginning to experiment with a slew of new ways to beef up performance and reduce power and area, now that shrinking features no longer guarantees those improvements.

The number of new ideas introduced at industry conferences in the past few months is almost mind-boggling. Just on the CPU side there are new architectures that improve the amount of work that can be done per compute cycle, with pre-fetch capabilities similar to what search companies have been doing to speed up results. The difference is that this is being done inside the processor, taking big data techniques and applying them in more confined spaces.

There are faster interconnects and improvements in how to move data more efficiently, including heterogeneous caching schemes, rethinking where and how processing is done on data, and photonics. And there is work underway to increase the density of everything from SRAM to DRAM, as well as to sharply lower the amount of power used per operation and per function.

This may well be remembered as the most prolific and innovative era in the history of semiconductor design. The big question now is how long all of this will continue. The reduction in benefits from device scaling has been obvious since 28nm, which is why there are so many options being considered. But developing chips with this many changes is expensive, and the semiconductor industry has always set the bar for how to squeeze efficiencies out of processes and materials.

The problem is that because there is such diversity among the technologies, it’s not possible to compare one with another. So how do companies get paid for their efforts? For an end device company such as Apple or Samsung, or Tesla or Ford, it’s rather easy. The battery in a phone lasts for so many hours without a charge, or a car goes from 0 to 60 in a certain number of seconds and travels a certain distance on a single charge or some combination of charge and gas.

But how do you determine the value of shrinking a self-driving compute system into the size of a briefcase, as Google has done? And how do you compare one processor with another, when each is addressing bottlenecks and accelerating performance in such different ways?

The challenge increases yet again with specific applications and functions that are being accelerated for specific use cases. Unlike in the past, when a chip was measured in MIPS/BIPS, MHz/GHz, or operations per watt, these applications are user-specific. And in the case of machine learning, how technology gets used post-silicon can alter the fundamental characteristics of the system. The comparison may be less about manufactured specs than about the ability to learn new things.

And finally, there is nearly total confusion today about what constitutes a process node, particularly after 28nm. What is a 16/14nm finFET based on a 20nm BEOL process? Is it a 20nm process, which is arguably a failed node, with a new coat of paint? Or is it a 16/14nm process with some legacy process technology? And how does that compare with Intel’s 14nm, which has a 14nm BEOL?

More processing power and lower power consumption are nice to have, but getting paid for innovation is essential. That will require a consistent set of metrics that extends well beyond what has been used in the past. And given the amount of effort being put into chips and systems these days, these metrics cannot be developed too soon.