Some progress is being made, but there are no easy answers.
One of the more important challenges in reliability testing and simulation is the duty cycle dependence of degradation mechanisms such as negative bias temperature instability (NBTI) and hot carrier injection (HCI). For example, as previously discussed, both the threshold voltage shift due to NBTI and the recovery of baseline behavior depend strongly on device workload.
This is somewhat intuitive behavior. It’s not surprising for a stress applied more frequently or for a longer time to cause more degradation than one applied less frequently or for a shorter time. If the stress-induced damage is recoverable when the device is off, then it’s somewhat predictable that a device used less frequently will be more easily able to recover.
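The intuition can be sketched with a toy model. The code below is an illustration only, not a physical NBTI model: it assumes damage builds linearly while the device is on and decays exponentially while it is off. The function name `accumulated_shift` and the parameters `stress_rate` and `recovery_tau` are hypothetical, and the units are arbitrary.

```python
import math


def accumulated_shift(duty_cycle, cycles=1000, period=1.0,
                      stress_rate=1.0, recovery_tau=0.5):
    """Toy model: shift grows linearly under stress, decays exponentially at rest.

    All parameters are illustrative assumptions, not measured values.
    """
    on_time = duty_cycle * period
    off_time = period - on_time
    shift = 0.0
    for _ in range(cycles):
        shift += stress_rate * on_time               # stress phase adds damage
        shift *= math.exp(-off_time / recovery_tau)  # off phase recovers part of it
    return shift


for duty in (0.1, 0.5, 0.9):
    print(f"duty cycle {duty:.1f}: shift {accumulated_shift(duty):.3f} (arbitrary units)")
```

Under these assumptions, a device that is on more often both accumulates more damage per cycle and has less time to recover, so the residual shift rises steeply with duty cycle.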
Nonetheless, the degradation of devices under stress is notably different from changes in baseline performance due to process variation. As ARM fellow Rob Aitken and his colleagues at ARM Research observed at December’s IEEE International Electron Devices Meeting (IEDM), sources of device variability, such as lithography or implant variation, are typically uncorrelated. The devices along a given logic path will vary randomly, and so variations will tend to cancel each other out. Typically, the worst-case process variation scenario produces many devices that are slightly worse than average, rather than a few extreme outliers.
Duty cycles, in contrast, are highly correlated. It’s possible, or even likely, for a small subset of the chip to see much more stress than less commonly used logical paths. For this reason, it is difficult to make accurate reliability predictions without information about the expected workload.
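A small Monte Carlo sketch shows why correlation matters. It assumes Gaussian per-device shifts and treats the correlated case as the extreme in which every device on a path sees the same shift. Both are simplifying assumptions, and `path_shift_spread` is a hypothetical name. Independent shifts along a path of N devices partly cancel, so the spread of the total grows roughly as the square root of N. Fully correlated shifts add directly, so the spread grows as N.

```python
import random
import statistics


def path_shift_spread(n_devices, sigma, correlated, trials=2000, seed=0):
    """Standard deviation of the total shift along a path, estimated by sampling."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        if correlated:
            # Every device on the path sees the same shift (extreme case).
            totals.append(n_devices * rng.gauss(0.0, sigma))
        else:
            # Each device draws its own independent shift.
            totals.append(sum(rng.gauss(0.0, sigma) for _ in range(n_devices)))
    return statistics.pstdev(totals)


uncorrelated = path_shift_spread(16, sigma=1.0, correlated=False)
correlated = path_shift_spread(16, sigma=1.0, correlated=True)
print(f"uncorrelated spread: {uncorrelated:.2f}, correlated spread: {correlated:.2f}")
```

For a 16-device path with unit sigma, the uncorrelated spread should land near 4 and the correlated spread near 16, up to sampling noise.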
The more dramatic the duty cycle dependence, the more difficult it is to study. To this day, it is not clear whether the NBTI behavior seen in most devices is entirely recoverable or has a permanent component. Because the recovery time for any single defect can be very long, it’s hard to measure the difference between a device that is permanently damaged and one that will eventually recover, given sufficient “off” time.
In another IEDM paper, Tibor Grasser of the Institute for Microelectronics and his colleagues devised a method for estimating the permanent component of NBTI. After a standard hundred-second recovery period and ten current/voltage sweeps, the group attributed any remaining shift to damage they “pragmatically called permanent” (henceforth simply “permanent”). While they acknowledge that this shift probably contains a “slow” recoverable component with a long time constant as well as a truly permanent component, the two combined accounted for only about 10% of the total observed shift.
They attributed the “permanent” NBTI component to hydrogen released from defects on the gate side of the oxide, while the recoverable component appears to be due to traps at the channel/dielectric interface. While hydrogen has been considered and rejected as insufficient to explain all NBTI-induced shifts, it can still account for the much smaller “permanent” component.
What about SOI?
The industry is increasingly considering alternatives to planar silicon transistors, in part because NBTI becomes more serious for highly scaled devices. One of these, fully-depleted silicon-on-insulator, is especially attractive for its ability to balance power consumption against performance within an individual die. Its advantages may bring new reliability issues, though. According to an IEDM paper presented by Sanghoon Shin and colleagues at Purdue University, the high electric field present in highly scaled SOI devices can raise the channel temperature significantly relative to the underlying substrate. This is somewhat counterintuitive behavior. A thinner, undoped channel increases the source/drain electrical resistance, thereby reducing the current and power dissipation. Less power dissipation generally means less heat. However, the thin channel also increases thermal resistance. Temperature change is a function of thermal resistance times power, so substantial heating can occur with very thin channels. Ultimately, manufacturers will need to balance the electrostatic advantages of thin channels against the reliability issues associated with self-heating and hot carrier injection.
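The trade-off can be made concrete with the relation described above: temperature rise equals thermal resistance times power. The numbers below are hypothetical placeholders chosen only to show the arithmetic, not values from the Purdue paper, and `temperature_rise` is an illustrative name.

```python
def temperature_rise(thermal_resistance, power):
    """Steady-state temperature rise (K) = thermal resistance (K/W) * power (W)."""
    return thermal_resistance * power


# Hypothetical example: thinning the channel cuts power by 20%
# but doubles the thermal resistance.
baseline = temperature_rise(thermal_resistance=1.0e5, power=1.0e-5)
thin_channel = temperature_rise(thermal_resistance=2.0e5, power=0.8e-5)
ratio = thin_channel / baseline

print(f"baseline rise: {baseline:.2f} K")
print(f"thin-channel rise: {thin_channel:.2f} K")
print(f"ratio: {ratio:.2f}")
```

In this made-up case, the drop in power is outweighed by the increase in thermal resistance, so the channel runs hotter even though it dissipates less.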
FinFETs?
Self-heating is also an important consideration for finFETs, for much the same reason. Dissipating heat from isolated fins is challenging. Indeed, as previously discussed, some of the NBTI advantage of these devices is likely to be offset by more serious HCI degradation.
HCI, too, depends strongly on duty cycle. Reliability simulations need to consider actual proposed circuit designs, not just the behavior of individual devices. In fact, researchers at IBM found that even the duty cycles of adjacent devices may need to be considered. If an off-state device (A) is located next to an on-state device (B), gate leakage from device B can cause leakage current to flow in device A. When the thermal resistance of device A is high, the heat resulting from the leakage current will dissipate more slowly, leading to localized hot spots and potential reliability degradation. Overstressing the device during burn-in or reliability testing can even drive electromigration in the associated back end of line wiring. (A similar effect can also occur in off-state devices in system-in-package designs, as will be discussed in a future article.)
Okay, can we abandon silicon?
None of these issues go away with the potential introduction of alternative channel materials. Indeed, some of them are likely to get worse. For example, SiGe may offer reduced NBTI compared to silicon, but germanium is a poor thermal conductor, more prone to self-heating and the degradation that comes with it.
In work presented at the Materials Research Society Spring Meeting (paper EP11.8.01) last month, Jacopo Franco, a researcher in IMEC’s CMOS FEOL reliability group, explained that the biggest concern for SiGe reliability is the interface between the channel material and the gate stack. The native GeOx dielectric is much easier to fabricate than a fully oxidized SiO2 cap layer. Sadly, the reliability of devices with a GeOx interfacial layer is quite poor. Oxide defect levels are widely distributed, with little voltage dependence. Similar behavior is seen in InGaAs systems with Al2O3 interface oxides, which suffer from both a threshold voltage shift and subthreshold slope degradation.
In both cases, the ideal gate stack would be one with a narrow distribution of defect levels, and where those levels were misaligned with the Fermi energy of the device. Illustrating one possibility, researchers at IMEC placed a thin lanthanum layer on top of a silicon-passivated germanium nFET, shifting the trap levels relative to the HfO2 dielectric. A much stronger voltage dependence of degradation behavior was seen.
Conclusions
Looking over the last six months or so of progress, it is clear that there are no easy answers to the reliability challenges posed by highly scaled devices and alternative transistor structures. Rather, the solutions appear to require more careful simulation of the environment facing each individual device, more careful optimization of gate stacks and heat dissipation, and closer collaboration between designers and process engineers.
The device in the Purdue paper is a thick-BOX FDSOI (150nm), which is known to have thermal resistance concerns. The industry uses thin-BOX (25nm or less) devices.