Why time is becoming an essential element, and what it means for chip architectures.
The chip design world is no longer flat or static, and increasingly it’s no longer standardized.
Until 16/14nm, most design engineers viewed the world in two dimensions. Circuits were laid out along the x and y axes, and everything was packed within that plane. The biggest problems were that nothing printed as neatly as the blueprint suggested, and current leaked out of two-dimensional gates to the point where "off" became a relative term. It was simply less "on" than "on."
Once finFETs entered the picture, everything changed. They plugged the current leak, at least temporarily (although leaks began popping up in other places, such as memory), and dynamic power density became so high that it could melt a chip unless sophisticated thermal management was added.
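A rough back-of-the-envelope formulation shows why (the symbols are the standard textbook form, not tied to any particular process): dynamic power scales roughly as

P_dyn ≈ α · C · V² · f,

where α is switching activity, C is the switched capacitance, V is the supply voltage, and f is the clock frequency. Each node shrink packs more switching transistors into a smaller area, so even when power per transistor drops, the power dissipated per square millimeter can climb, and all of that heat has to be pulled out of a shrinking footprint.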
FinFETs pushed transistor design into a third dimension, and then 3D NAND, along with advanced packaging, expanded everything into yet another dimension. Prior to 3D NAND, the initial advanced packaging approaches were either a planar MCM, which was basically a tightly coupled PCB in a package, or planar 2.5D, which Xilinx pioneered in order to improve yield.
The fourth dimension, which is just beginning to take root in design, is reliability, or quality over time, and much of it is being driven by safety-critical markets such as automotive, robotics, medical and aerospace.
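Some rough, purely illustrative arithmetic shows what "quality over time" means in these markets (the numbers are hypothetical, not drawn from any particular program). Reliability is often quoted in FIT, or failures per billion device-hours. A 15-year vehicle lifetime is roughly 131,000 calendar hours, so a fleet of 1 million vehicles, each carrying 100 such chips, accumulates on the order of 10^13 device-hours. Even a failure rate of 1 FIT then translates into something like 10,000 field failures over the life of that fleet.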
Reliability in design always has been a consideration in some markets. IBM mainframe processors were designed to last for a decade or more, which was good for customers and for IBM, particularly in the days when mainframes carried a hefty price tag that included software and services. The fewer service calls IBM technicians had to make, the higher the company’s profits, so IBM designed its chips to exceed whatever use case demands they were likely to encounter.
The use of semiconductors in mission-critical or safety-critical markets always has been somewhat limited, though, and until 28nm there was always the promise of the next node to fix problems. That's no longer possible. Chips developed today at the most advanced nodes are extremely expensive, highly customized, and almost always involve some sort of complex packaging. If a company is spending several hundred million dollars on a design, the chip has to provide a significant improvement in performance, power, or area/cost, or deliver some other benefit so compelling for a particular use case that it's worth the investment.
For most companies, this no longer is a volume game where a chip is designed once and sold in quantities of 1 billion or more. That only worked for smartphones, and even the smartphone market is flattening and splintering. These costs now must be absorbed by a larger system design, and that system has to function long enough, and close enough to spec over its projected lifetime, to avoid repair or replacement costs. Not everything can be fixed with an over-the-air software update, and it can take hours of labor to replace a part embedded deep inside an autonomous vehicle.
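The amortization math makes the point (the figures are round numbers for illustration only): a $300 million design spread across 1 billion units adds about $0.30 per unit, while the same design sold into a 10-million-unit market adds $30 per unit, a cost that has to be justified by the system it goes into and over that system's entire operating life.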
So the focus now has shifted to understanding not just how chips are supposed to work, but how they actually work in the real world under unexpected conditions. That data needs to be looped back into the manufacturing flow so that future iterations of those chips can avoid the same problems. In addition, designs need to be modular enough that these problems can be fixed and that whatever components wear out along the way can be replaced.
This sounds logical enough on paper, but it’s a whole different way of designing a chip. It was hard enough in two dimensions at 28nm, but it’s much more difficult in four dimensions at 7nm and 5nm.