Reliability Over Time And Space

Challenges in the march toward known good systems.

October 13th, 2020 - By: Ed Sperling

The demand for known good die is well understood now that multi-chip packages are used in safety-critical and mission-critical applications, but known good die alone isn’t sufficient. As chips are swapped in and out of packages to customize them for specific applications, it is the entire module that needs to be verified, simulated, tested, and analyzed.

This is more complicated than it sounds for several reasons. First, the expected lifetime of these devices is increasing. Most of the leading-edge chips used in the past were either in servers, where the life expectancy averaged four to five years, or in smart phones, where the life expectancy was two years. The expected lifetime of chips in both of those sectors has increased. In the case of automotive applications, it could be as long as 18 years.

At the same time, feature sizes inside a chip have decreased. Decades of device scaling have resulted in more power noise, more electromagnetic interference, and a handful of proximity effects caused by more switching with less insulation. Higher resistance in thinner wires generates more heat, and thinner dielectrics provide less insulation against all of the above.
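The link between thinner wires and heat can be illustrated with a back-of-the-envelope relation. This is a simplified sketch rather than a model of any particular process: it assumes a rectangular wire of length L, width w, and thickness t, a constant bulk resistivity ρ (which ignores the size effects that raise resistivity further at very small dimensions), and a fixed current I. All of these symbols are illustrative assumptions, not values from the article.

```latex
% Resistance of a rectangular wire and the heat it dissipates
R = \frac{\rho L}{w\,t}, \qquad P = I^{2} R

% Halve both width and thickness, keep length and current fixed:
R' = \frac{\rho L}{(w/2)(t/2)} = 4R, \qquad P' = I^{2} R' = 4P
```

Under those assumptions, halving both cross-sectional dimensions quadruples resistance, and with it the heat dissipated in the wire.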

To circumvent these and other issues, including the inability to deliver enough power to turn on all circuits at once, chipmakers have begun checker-boarding logic, with some blocks on, some off. That has a dual benefit of minimizing physical effects and reducing circuit degradation. But it also makes it harder to understand aging patterns across a device and across a system.

Second, the part of a multi-chip device that fails first typically is the bond between multiple die, or between the package and the board. In the past, this was largely due to warpage and the packaging process itself, which put strain on the solder balls. Increasingly, the physical effects that impact a single chip are moving out to the board as well. Distances between components on a board matter as much as they do on a single chip or in a multi-chip package, and density is increasing for components that previously were kept well separated, often for good reasons.

There are more layers in boards, and much more data moving everywhere. That means more silicon is on than off, and all of this is now functioning as part of a system of systems, where the impact of overheating in one area may affect something completely unrelated from a system architecture standpoint.

Third, not all of the components in these devices are being characterized on an apples-to-apples basis. It’s impossible for an IP provider, regardless of whether the IP is hard or soft, or internally or externally developed, to predict what will be in the proximity of that IP block. While the specs of two components may look similar, the components may be susceptible to different physical effects that never were considered in the design process. They also may be used for much longer periods of time, and in different markets, than they were originally designed for.

Tools and various technologies are available today to monitor chip behavior and degradation over time, but the real challenge is to automatically identify and fix these problems. This is well beyond the capabilities of any single company. It will require standards from within the chip industry, as well as different groups within various industries to come together. And it will require more redundancy, more firmware, more security to guard that firmware — all of which are essential to ensuring these devices function properly for as long as they are expected to do so.

We are at the point today where at least we can identify the problems. The next step is to fix them, and that may be more difficult than usual, given all of the various technologies still being developed. This is no longer about the next rev of a smart phone or computer processor or memory. It’s about adding AI and ML into nearly everything as markets demand customized solutions. The concerns may be very different in a car than in an industrial robot or a 5G base station, but failures in any of them can cause significant problems.

Ed Sperling

Ed Sperling is the editor in chief of Semiconductor Engineering.
