New processors will be blazing fast, but that doesn’t guarantee improvements in system speed.
New processor architectures are being developed that promise two to three orders of magnitude improvement in performance. The question now is whether system-level performance will come anywhere close to those processor benchmarks.
Most of these processors do one thing very well. They handle specific data types and can accelerate the multiply-accumulate (MAC) functions at the heart of these algorithms by distributing the work across multiple processing elements on a chip. In effect, they parallelize operations while also pruning algorithms and adjusting the output to whatever precision level is necessary, and they store and retrieve bits from memory in multiple directions rather than just left-to-right.
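As a rough illustration of that combination of tricks, the Python sketch below performs a multiply-accumulate over quantized 8-bit operands, skips pruned (zero) weights, and rescales the result to a chosen output precision. All of the function names and numbers are hypothetical; this is a software stand-in for what an accelerator would do in dedicated hardware, not any vendor's actual design.

```python
# Hypothetical sketch of the work a MAC accelerator performs: quantize to a
# narrow integer type, skip pruned (zero) weights, accumulate the products,
# and rescale the total back down to the requested output precision.

def quantize(values, scale):
    """Map floating-point values to int8 using a simple symmetric scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def mac(weights_q, activations_q):
    """Multiply-accumulate over int8 operands.
    Pruned weights (zeros) are skipped, mimicking sparsity support.
    A Python int stands in for the wide hardware accumulator."""
    acc = 0
    for w, a in zip(weights_q, activations_q):
        if w == 0:          # pruned weight -> no multiply, no data movement
            continue
        acc += w * a
    return acc

def rescale(acc, scale_w, scale_a, out_bits=8):
    """Scale the accumulator back to the requested output precision."""
    result = acc * scale_w * scale_a
    limit = 2 ** (out_bits - 1) - 1
    return max(-limit - 1, min(limit, round(result)))

weights = [0.5, 0.0, -0.25, 0.75]      # zeros represent pruned weights
activations = [1.0, 2.0, 3.0, 4.0]
w_q = quantize(weights, 0.01)
a_q = quantize(activations, 0.05)
print(rescale(mac(w_q, a_q), 0.01, 0.05))   # approximates the 2.75 dot product
```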
There are several inherent challenges with these architectures, though. First, moving data through a chip is not particularly energy-efficient if you aren’t also reducing the amount of data that needs to be processed along the way. While MAC functions can be distributed across multiple accelerators, moving that data across a chip and putting it back together in some coherent form after it has been processed by multiple processing elements isn’t so simple.
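To make the reassembly problem concrete, here is a minimal sketch, again with hypothetical names, of a dot product split across several processing elements. The per-element math is trivial; the point is the final gather-and-reduce step, which is exactly where on-chip data movement and coherent reassembly become costly in real silicon.

```python
# Hypothetical sketch of distributing a dot product across processing
# elements (PEs): each PE computes a partial sum over its slice, and the
# partial results then have to be moved and reassembled.

def pe_partial_sum(weights_slice, activations_slice):
    """Work done locally inside one processing element."""
    return sum(w * a for w, a in zip(weights_slice, activations_slice))

def distributed_dot(weights, activations, num_pes=4):
    """Split the operands across PEs, then gather and reduce the partials."""
    chunk = (len(weights) + num_pes - 1) // num_pes
    partials = []
    for pe in range(num_pes):
        lo, hi = pe * chunk, (pe + 1) * chunk
        partials.append(pe_partial_sum(weights[lo:hi], activations[lo:hi]))
    # This gather/reduce is trivial in software, but in hardware it is the
    # step where moving and recombining data consumes bandwidth and energy.
    return sum(partials)

weights = list(range(16))
activations = [0.5] * 16
print(distributed_dot(weights, activations))   # 60.0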
Data centers have been wrestling with this problem for decades, and hyperscale clouds have added an element of heterogeneity to the mix. The cloud essentially load-balances processing and uses high-speed optical interconnects to ship data around at the speed of light. But as this kind of operation moves closer to the data, such as in a car or an edge cloud, the ability to load balance is much more limited. There is finite real estate in an edge cloud, and far less in an autonomous vehicle. Moreover, it’s not at all clear that any of these systems will have consistent enough data feeds to remain on all the time, and many of the new processor architectures are being developed as always-on designs. They are most efficient when they are processing data at maximum speed.
The second challenge is that it's not obvious how devices will be connected throughout the edge, and that can create its own bottlenecks. Designing a system with a data pipe running at 100Gbps or 400Gbps looks very good on paper, but end-to-end throughput is only as fast as the slowest component on the network. Anyone with a gigabit Internet connection knows the flow of data is only as fast as the server on the other end.
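The arithmetic behind that point is simple, as the short sketch below shows: effective throughput along a path is the minimum of the link speeds in the chain. The link names and figures are made up purely for illustration.

```python
# Illustrative only: end-to-end throughput is bounded by the slowest hop,
# no matter how fast the headline data pipe is. All figures are hypothetical.

link_speeds_gbps = {
    "on-chip fabric": 400,
    "edge uplink": 100,
    "remote server": 1,     # the slow peer at the far end
}

effective_gbps = min(link_speeds_gbps.values())
print(f"Effective throughput: {effective_gbps} Gbps")
print(f"Bottleneck: {min(link_speeds_gbps, key=link_speeds_gbps.get)}")
```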
Third, the economics of any compute operation that isn’t a hyperscale cloud data center are very fuzzy. Because these new processor architectures are highly tuned for certain algorithms or data types, they are unlikely to achieve the economies of scale that have made this kind of processing possible in the first place. Moreover, performance can degrade as algorithms are updated, and that is a constant process. That means the chips may have to be replaced more frequently than in the past. But if the cost of designs is too high due to insufficient volume, that changes the economics of upgrades.
Taken together, there are multiple potential bottlenecks as compute models shift, from both a technology and a business perspective. And that doesn't even begin to address the ability to get data in and out of memory fast enough to keep up with processing. The shift to new processor architectures comes with a lot of moving parts, and so far there isn't much visibility into how all of those pieces will work together.