Factoring Reliability Into Chip Manufacturing

The general consensus is that semiconductors can be built to last for many years. The trick is figuring out how to spread out the cost.

Making chips that can last two decades is possible, even when they are developed at advanced process nodes and subjected to extreme environmental conditions, such as under the hood of a car or on top of a light pole. But doing that at the same price point as chips that go into consumer electronics, which are designed to last two to four years, is a massively complex challenge.

Until a couple of years ago, this wasn’t a huge problem. Most of the devices developed for those markets were built on older-node technologies, where issues such as electromagnetic crosstalk and latent defects were rare. Even base station chips were at 28nm. If inspection and metrology missed a few potentially latent defects, that wasn’t a significant problem. Chances were good that the wires and insulation were thick enough to keep working, regardless of any inherent flaws caused by any single process, or combination of processes, in the fab.

In the past 18 months, all of this has changed. The rollout of 5G is a radical change in communications. AI/ML/DL are being introduced into just about everything. And a proliferation of sensors nearly everywhere has made it impossible to move all data to the cloud, creating a stampede toward the edge.

The result is that more of these devices require maximum density in both logic and memory. So rather than developing a base station or automotive chip at 180nm or 250nm, designers are now targeting 7nm and below. And all of those chips are supposed to work without a hiccup for decades.

It’s almost certain that costs will rise, because these chips will require more testing, inspection, atomic-level deposition and etch, more sophisticated fill, and much more attention paid to quality at the edge of a wafer. But there are ways to trim those costs as advanced processes become more mainstream.

First, more steps in the flow need to be done earlier, and they need to happen concurrently. This reaches well beyond the fab. It requires increasing the amount of simulation and lab bench testing, of course. But it also requires more attention to potential problems that may not show up in the fab’s design rule deck, which can involve signal interruption and noise, memory degradation, and considerations for parts of a chip that may always be in the on state due to new use models. This is particularly true for chips in automotive, industrial and medical applications, where regular inputs are essential.

Inside the fab, this can include various types of testing and deeper inspection, as well as more monitoring as films are applied and structures are grown rather than inspecting them later. Those sound like logical additions, but they require significant changes to equipment, including a much more robust feedback loop between production data after a wafer has been processed and the production lines while they are running.
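To make that feedback loop concrete, here is a minimal sketch of one common approach, a run-to-run (EWMA) controller that nudges a deposition-time setpoint based on post-process film-thickness measurements. The parameter names, gains, and process values are illustrative assumptions, not a description of any particular tool or vendor's implementation.

```python
# Minimal run-to-run (EWMA) feedback sketch: post-process metrology
# nudges the next run's deposition time toward a thickness target.
# All names, gains, and process values are illustrative assumptions.

TARGET_THICKNESS_NM = 50.0      # desired film thickness
DEPOSITION_RATE_NM_PER_S = 0.5  # assumed nominal deposition rate
LAMBDA = 0.3                    # EWMA smoothing factor (0..1)

def update_setpoint(current_time_s: float,
                    measured_thickness_nm: float,
                    ewma_offset_nm: float) -> tuple[float, float]:
    """Return (next deposition time, updated offset estimate)."""
    # Offset implied by the latest measurement.
    observed_offset = measured_thickness_nm - TARGET_THICKNESS_NM
    # Blend into the running estimate so one noisy run doesn't overcorrect.
    ewma_offset_nm = LAMBDA * observed_offset + (1 - LAMBDA) * ewma_offset_nm
    # Convert the thickness offset into a deposition-time correction.
    correction_s = ewma_offset_nm / DEPOSITION_RATE_NM_PER_S
    return current_time_s - correction_s, ewma_offset_nm

# Example: the tool ran 100 s and metrology reported 52 nm (2 nm thick).
time_s, offset = 100.0, 0.0
time_s, offset = update_setpoint(time_s, 52.0, offset)
print(f"next run: {time_s:.1f} s (offset estimate {offset:.2f} nm)")
```

The point of the smoothing factor is the same tradeoff the article describes: react to real drift in the line without chasing every measurement fluctuation.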

Second, more components need to be developed, tested and inspected independently. While scaling will continue, it will be only one piece of the solution. This is happening today at advanced nodes. What’s missing are standardized interfaces and much better characterization of hard and soft IP. These are essential because it’s not always clear how or where IP will be used when it is developed, and creating additional margin isn’t always necessary. Case in point: Automotive (AEC-Q101) Level 0 is now 175°C, versus the previous requirement of 150°C for discrete semiconductors. There likely will be more shifts in that direction in an effort to boost reliability.

Finally, manufacturing data needs to be analyzed much more carefully and compared against in-circuit monitoring in the field. Post-production data can be used to flag a failure, which may take years to show up, and it can be fed back into the supply chain to figure out what went wrong in a particular manufacturing run. That could include everything from contamination in gases on a particular day to a speck of dust on a mirror in an EUV scanner. And all of this has to be looped back and applied to data that may be years old to begin to pinpoint problems that today might be dismissed as random failures.
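As a simplified sketch of what that traceability loop could look like, the Python snippet below joins hypothetical field-failure records back to wafer-lot manufacturing data and flags lots whose failure rate stands out for deeper root-cause review. The column names, thresholds, and data are invented for illustration only.

```python
# Hypothetical sketch: trace field failures back to manufacturing lots
# and flag lots with unusually high failure rates. Column names,
# thresholds, and data are invented for illustration only.
import pandas as pd

# Field returns reported over several years, keyed by lot ID.
field_failures = pd.DataFrame({
    "lot_id":    ["L1001", "L1001", "L1001", "L1003"],
    "fail_mode": ["metal_open", "metal_open", "metal_open", "gate_leakage"],
})

# Per-lot manufacturing context captured at production time.
lots = pd.DataFrame({
    "lot_id":        ["L1001", "L1002", "L1003"],
    "units_shipped": [10_000, 12_000, 9_500],
    "fab_tool":      ["DEP-07", "DEP-07", "DEP-09"],
    "process_date":  ["2018-03-02", "2018-03-05", "2018-04-11"],
})

# Count failures per lot, join back to the manufacturing record,
# and compute a failure rate in parts per million.
counts = field_failures.groupby("lot_id").size().rename("failures")
report = lots.merge(counts, on="lot_id", how="left").fillna({"failures": 0})
report["fail_ppm"] = 1e6 * report["failures"] / report["units_shipped"]

# Flag lots well above the fleet-wide rate for root-cause review,
# e.g. checking which tool, gas lot, or date they have in common.
suspect = report[report["fail_ppm"] > 2 * report["fail_ppm"].mean()]
print(suspect[["lot_id", "fab_tool", "process_date", "fail_ppm"]])
```

In a real flow the join would reach much deeper, down to wafer and die coordinates, tool logs, and inline metrology, but the principle is the same: keep the manufacturing data linked to the part so a failure years later can still be traced to its run.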

Improving reliability for a reasonable cost requires improvements across the entire supply chain, and it takes patience. Sometimes problems don’t show up for years. But there are lots of little problems that need to be addressed everywhere, and together those fixes can have a big impact on yield, reliability and ultimately cost.
