Why these three things are related and what it means for Intel and the rest of the industry.
Anyone who still doubts that Moore’s Law is slowing and losing relevance need only look at Intel’s official unveiling of its advanced packaging strategy. Inertia has ended and the roadmap is being rewritten.
Intel’s discussion of advanced packaging is nothing new. The company has been public about its intentions for years, and started dropping hints back when Pat Gelsinger was general manager of Intel’s Digital Enterprise Group. (Gelsinger left Intel in 2009.) Others inside of Intel have discussed packaging plans and advancements since then. The company’s purchase of NoC vendor NetSpeed Systems in September was the glue to make all of these pieces work together.
Intel has been collecting and developing those puzzle pieces for years. The purchase of Altera in 2015 allowed it to add programmability into designs. It also rolled out a die-to-die bridge (Embedded Multi-die Interconnect Bridge, aka EMIB) in 2016. And it has made investments in new memory types such as SSDs (Optane) and phase-change memory (3D XPoint), which potentially could replace L3 cache. All of these moves show just how serious and methodical Intel has been about this whole effort. And while the company has made some very high profile mistakes over the years, such as missing the entire mobile market trend, it has been remarkably consistent about how to continue reaping performance and power benefits from processors.
But all of this is being accelerated now for a couple of main reasons. One involves the power/performance impact of security threats. Speculative execution and branch prediction, two very effective ways of speeding up processors, create security vulnerabilities in hardware. Closing up those vulnerabilities causes a performance hit.
Intel isn’t alone in this. All of the established processor and processor IP companies have been scrambling to close up these security holes. Yet Intel was particularly hard hit because its largest customers—data centers—run their businesses based on performance per watt. A 10% loss of performance translates into added costs, because data centers must add more servers to run the same workloads at the same speed. It also takes more energy to power up and cool those additional servers. And in places like New York, where there is a ceiling on electricity generation and commercial real estate prices are high, that’s not a pleasant discussion to have with your customers.
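To put rough numbers on that claim, here is a minimal back-of-the-envelope sketch. The fleet size of 1,000 servers is a hypothetical figure chosen for illustration, and the function name is made up for this example. It also assumes the workload spreads evenly across servers, which real data centers only approximate.

```python
import math


def servers_needed(baseline_servers: int, performance_loss: float) -> int:
    """Servers required to sustain the original throughput after every
    server slows down by the fraction `performance_loss` (0.10 means 10%)."""
    if not 0 <= performance_loss < 1:
        raise ValueError("performance_loss must be in the range [0, 1)")
    # Each server now delivers (1 - loss) of its old throughput, so the
    # fleet must grow by a factor of 1 / (1 - loss). Round up to whole servers.
    return math.ceil(baseline_servers / (1 - performance_loss))


# Hypothetical fleet of 1,000 servers taking a 10% performance hit.
baseline = 1000
required = servers_needed(baseline, 0.10)
print(required)             # 1112
print(required - baseline)  # 112 additional servers
```

Note that a 10% loss requires roughly 11% more servers, not 10%, because the shortfall has to be made up by machines that are themselves running slower. Power and cooling costs grow along with the server count.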
Second, the benefits of scaling are dwindling. Samsung says that improvements per node after 7nm will be in the range of 20%, and not all of that will come from scaling. While any improvement is still attractive, it may not be enough to warrant regular upgrades by customers. That needs to be supplemented by other improvements, and the most likely sources are architectural and packaging, which is just beginning to mature. Fan-outs, 2.5D and even 3D designs are in use today across a variety of high-volume and niche markets, and the benefits in terms of performance and lower power are proven. The remaining issues are cost and design time, and both of those are being addressed with more flexible platform types of approaches such as chiplets.
Coincidentally and serendipitously, AI is suddenly showing up everywhere, spurred by machine-generated algorithms and the economics of machines doing some things better than people. This is yet another driver of high-performance, low-power design, and it’s a brand new application for which there is no precedent.
What’s not clear yet is how chips will be architected to harness AI/ML/DL. While the basic physics of moving data around (or of changing designs to process more data without moving it) are well understood, the use cases for making this efficient are still evolving. It’s one thing to build a chip that can handle massive data throughput. It’s quite another to do it efficiently. A key problem there is generating enough data to keep all of the processing elements on that chip busy all the time or, alternatively, sizing the chip appropriately.
There are other stumbling blocks, as well. Some processors work better for certain algorithms and data types than others. But because this field is so new and the algorithms are in a state of almost constant change, it’s difficult to design a processor that will work optimally for any significant period of time. Some level of programmability needs to be added into the mix, and architectures need to be flexible enough to handle these changes.
Put all of these factors together and Intel’s recent announcements come into focus. Still, these changes reach well beyond just Intel. Intel is a bellwether. But the whole chip world is changing, and the impact on both power and performance across a wide range of applications will be significant and long-lasting.