NVM Reliability Challenges And Tradeoffs

What’s solved, what isn’t, and why these different technologies are so important.


This second of two parts looks at different memories and possible solutions. Part one can be found here.

While NVM technologies such as PCRAM, MRAM, ReRAM, and NRAM share similar high-level traits, their physical implementations are quite different. That gives each its own set of challenges and solutions.

PCRAM has had a fraught history. Initially released by Samsung, Micron, and ST, it was subsequently removed from the market for reliability reasons before being re-introduced.

Heat remains a major problem for PCRAM. While the memory is thermally stable and can handle high-heat applications, heat generated to program one cell may affect its neighbor, as well — a so-called “disturb,” according to Tomasz Brozek, senior fellow and engagement director for PDF Solutions. In addition, the localized heating at the tip of the heater can cause some material movement nearby, resulting in a void above the cell.

“The tip of the heater is 600 to 700°C,” Brozek says. Intel apparently mitigated this problem in its Optane memory by replacing the single-point heater with a more distributed heater.

There are other problems, as well, involving the stability of the reset state, with the amorphous material spontaneously growing some crystals, Brozek notes. This makes it more difficult to use for analog memory. He says that ReRAM is more stable for this application.

In addition, it’s unclear if or when PCRAM will be manufactured in the 1xnm range. Martin Mason, senior director of embedded memory for GlobalFoundries, says it’s hard to find a foundry that will build PCRAM on a node below 28 nm.

MRAM is unique in having a three-way relationship between data retention, endurance, and speed. You can pick any two of those at the expense of the third. “MRAM is a beautiful memory,” says Brozek. “[It] has great tunability. There are many knobs to turn, including reliability.”

As a result, Mason says that GlobalFoundries is doing two versions of MRAM — one that’s more flash-like and one that’s more SRAM-like, each with a different set of tradeoffs.

The flash-like version is more robust, with a higher energy barrier in the MTJ. It has high endurance at 100K cycles, but that’s lower than the infinite endurance expected of an SRAM-like technology. It supports five solder-reflow cycles and can handle higher operating temperatures. The SRAM-like version has higher endurance but lower data retention, and it can support lower operating temperatures than the flash-like version.
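The link between the MTJ energy barrier and data retention can be illustrated with the standard Néel-Arrhenius relation, in which the mean time before a thermally induced bit flip grows exponentially with the thermal stability factor (barrier energy divided by kT). The Python sketch below is illustrative only: the 1 ns attempt time, the stability-factor values, and the function names are assumptions rather than GlobalFoundries figures, and it models a single isolated bit, not a full array.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
ATTEMPT_TIME_S = 1e-9  # assumed attempt time (tau_0); a commonly quoted textbook value


def retention_years(delta: float) -> float:
    """Mean time before one bit flips thermally, for delta = Eb / (kB * T)."""
    return ATTEMPT_TIME_S * math.exp(delta) / SECONDS_PER_YEAR


def delta_at_temperature(delta_ref: float, t_ref_k: float, t_k: float) -> float:
    """Rescale delta to another temperature, assuming the barrier Eb itself is constant."""
    return delta_ref * t_ref_k / t_k


if __name__ == "__main__":
    # Hypothetical stability factors, specified at 300 K and re-evaluated at 125 C.
    for delta_room in (40.0, 60.0, 80.0):
        delta_hot = delta_at_temperature(delta_room, 300.0, 398.15)
        print(
            f"delta={delta_room:.0f}: "
            f"{retention_years(delta_room):.3g} years at 300 K, "
            f"{retention_years(delta_hot):.3g} years at 125 C"
        )
```

Because the dependence is exponential, a modestly lower barrier, which makes writes easier, costs many orders of magnitude in retention. That is one way to see why a flash-like and an SRAM-like variant end up as separate design points.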

MRAMs do have a risk of disturbing themselves after multiple write cycles, making endurance a greater challenge. A new version of MRAM, called spin-orbit torque MRAM (SOT-MRAM), helps with that issue, but it is not yet a proven production technology.

Brozek also notes a challenge when etching the MRAM bit cell, which has become more cylindrical (instead of conical) over time. Because there are many layers of very different materials in the bit-cell stack, “MRAM can’t use RIE [reactive-ion etching] – it’s too selective. You would have to change chemistries to get down the stack.” Instead, they use neutral-ion etching, where ions are accelerated and then neutralized before sputtering onto the wafer. It’s hard to get all of the corners clean, and the etched material may redistribute elsewhere. These may become filaments over time, making this a processing challenge that must be carefully tuned.

As for temperature stability, Mason says that GlobalFoundries offers grade-2 memories today, and they’re working on achieving grade 1. “It will be a while before MRAM can get [to grade 1]. This is true for all novel NVM technologies,” notes Mason.

One recurring customer concern is the effect of external magnetic fields on MRAM contents. The concern arises in certain environments, such as proximity to point-of-sale machines, inductive motors, solenoids, disc magnets in speakers, and similar systems. For perspective, Mason notes that, “We’ve been using magnetic storage for decades,” from core memory to disk drives. “5 mm to 1 cm away, the field drops dramatically.” He says that the GlobalFoundries MRAMs are magnetic-immune to 500 Oe for 10 years at 125°C, to first order. There’s no inherent shielding in the package, although it is possible to add that at low cost.

There are other issues, as well. “The main breakdown mechanism for MRAM is the wear-out of its thin MgO barrier,” says Meng Zhu, product marketing manager at KLA. “When the barrier has defects, such as pinholes or material weak points, the resistance of the junction can gradually decrease over time and can also lead to a sudden drop of resistance (breakdown). Stack interfacial roughness, MgO thickness uniformity are the typical causes of pinholes/weak points creation in the MgO layer. In the case of oxidized MgO, where a deposited Mg layer is converted into MgO by thermal or plasma oxidation process, the degree of oxidation can also lead to creation of pinholes due to different lattice parameters of Mg and MgO. Although directly detecting pinholes in sub-nm MgO layer is challenging, it is possible to lower the chance of imperfection in the barrier by ensuring a good process control such as film deposition uniformity and MgO stoichiometry using spectroscopic ellipsometry. Roughness and uniformity of the stack can also be monitored through electrical (CIPT) and magnetic (Kerr) measurements.”

Inline defect inspection can help detect such reliability killers as metal particles forming near or on the side of the MRAM pillars during the patterning steps that may short the MTJs over time. “As with all semiconductor technologies, best practices inline inspection and metrology process control enable best possible device performance and yield,” Zhu says. “Increasing inspection sensitivity, sampling and innovative data analyses techniques can have a positive impact on early detection of latent reliability issues – a continuous improvement goal which all manufacturers hope to achieve. As STT-RAM is still relatively in low volume, together, we are learning and driving such improvements with our customers.”

ReRAM has its own set of challenges, especially as a newer technology. Mason notes that it can have challenges above 85°C, where filaments can break, or above 10,000 write cycles. Adesto has made some changes in its implementation, which it calls “conductive bridge” memory, or CBRAM, to push past these limitations. Adesto’s ReRAM is of the filamentary type (hence the conductive bridge). It found that filaments produced by copper or silver electrodes can be as thin as one atom wide. That makes the resistance of the filament easy to perturb, because only a few atoms need to move to cause a change.

The company found that by switching from a metal to a semi-metal electrode, it got a much wider bridge that behaved more robustly, solving the issues Mason mentioned, as well as the solder issue, and giving an order of magnitude improvement in thermal stability and data retention. “Adesto is offering ReRAM products that operate at the same conditions as standard flash devices. [A paper it published] shows that our technology may be used at automotive operating conditions. Materials along with [algorithms] were significant in driving endurance improvements as well,” says Shane Hollmer, vice president of engineering and co-founder of Adesto.

Disturb issues are mitigated by a selector that isolates the bit cell when addressed. Adesto uses a straightforward one-transistor, one-resistor (1T1R) cell, increasing the cell size somewhat due to the extra transistor. “There are interesting ideas for novel selectors for more density and stacking, but we’re not using them yet,” Hollmer says.

Although ReRAM, when used as an alternative to flash, doesn’t have the three-way tradeoff relationship that MRAM has, it would for applications targeting DRAM, according to Hollmer – not something that appears to be a common application today. When it comes to analog use for machine learning, Adesto doesn’t believe that a given array will remain in place for 10 years at 100°C, because there will be regular updates every few weeks or months. If not, the company does see it as a good practice to refresh periodically.

On an experimental basis, Adesto also has looked at changing the dielectric from Al2O3 to SiO2 in order to reduce bit cell shorts, which sometimes occur during erasure, down to a level that could be managed with ECC. In another experiment, it looked at reducing stochastic programming errors by using a differential scheme, where two cells are programmed to opposite states, and then the current is sensed differentially.
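The differential scheme described above can be illustrated with a toy Monte Carlo model. The sketch below is not Adesto’s implementation: the read currents, the Gaussian programming spread, and all function names are assumptions chosen only to show why comparing two oppositely programmed cells can tolerate more programming variation than comparing one cell against a fixed reference.

```python
import random

I_LOW, I_HIGH = 1.0, 10.0          # assumed nominal read currents (arbitrary units)
SIGMA = 3.0                        # assumed stochastic programming spread
REFERENCE = (I_LOW + I_HIGH) / 2   # fixed reference for single-ended sensing


def program(bit: int, rng: random.Random) -> float:
    """Return the noisy read current of one cell programmed to `bit`."""
    nominal = I_HIGH if bit else I_LOW
    return rng.gauss(nominal, SIGMA)


def read_single(bit: int, rng: random.Random) -> int:
    """Single-ended read: compare one cell against the fixed reference."""
    return int(program(bit, rng) > REFERENCE)


def read_differential(bit: int, rng: random.Random) -> int:
    """Differential read: compare a cell against its complement cell."""
    true_cell = program(bit, rng)
    complement_cell = program(1 - bit, rng)
    return int(true_cell > complement_cell)


def error_rate(read_fn, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of random bits that read back incorrectly."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        bit = rng.getrandbits(1)
        errors += read_fn(bit, rng) != bit
    return errors / trials


if __name__ == "__main__":
    print("single-ended error rate:", error_rate(read_single))
    print("differential error rate:", error_rate(read_differential))
```

In this model the differential read doubles the signal separation while the noise grows only by a factor of the square root of two, so the error rate drops. The cost is two cells per stored bit.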

NRAM is by far the newest of these technologies. Nantero is targeting the DRAM market with NRAM, but it isn’t making the memories itself. Instead, it is licensing out the technology.

Fujitsu is the first publicly announced licensee. But as TongSwan Pang, Fujitsu senior marketing manager, notes, the company isn’t targeting DRAM. Instead, it is using NRAM as an NVM to compete with flash. It has set up its manufacturing line and is fine-tuning the die size. Fujitsu is trying to finalize the technology by the end of 2020, with a small, unaggressive standalone memory product available six months later, followed largely by use for bespoke SoCs. “All memory that we develop will be for embedded or niche markets,” says Pang.

That said, most of the reliability data available at this point comes from Nantero. Bill Gervasi, principal systems architect at the company, says that unlike all of the other technologies, there are no moving charges in this technique. That means there is no opportunity for them to be trapped or to cause damage, giving this a potentially long data retention and high endurance. “We’ve tested to 10^13 cycles and haven’t seen any wear-out yet.”

The basic technology has had one rather unusual proof point: “This was on the space shuttle during the Hubble repair,” he says, so it has been shown to withstand the rigors of space. Nantero modeled the data retention as being extremely high, with degradation appearing at 300°C. Even then, by the company’s calculations, data retention falls from 12,000 years to 300 years. These numbers would, of course, have to be proven out in a full-on product to confirm they’re real.

Other memory technologies are in the early stages of evaluation. In a report on the ramp-up of emerging memories, Jim Handy of Objective Analysis and a colleague describe a few other technologies under exploration. There’s a variant of MRAM that leverages electric fields to coerce electron spin; it’s called magnetoelectric RAM (MeRAM). There’s ferroelectric RAM (FeRAM), which leverages electric dipoles to store data. And there’s polymer ferroelectric memory (PFRAM), which sets the state of a polymer to store a value. These are too early for reliability evaluations, but it’s important to note that the main memory contenders today may not remain the only ones available.

As to which memories are most likely to succeed, well, even analysts are reluctant to make a call. A report by Jim Handy, general director of Objective Analysis, forecasts DRAM, NAND flash, MRAM, and “3D XPoint” – Intel’s Optane technology. But, he notes, “…what we represent as embedded MRAM could become ReRAM or possibly even some other emerging memory, if that emerging memory ramps faster.” He’s not declaring MRAM to be the winner. It’s a stand-in for whichever memories emerge alive from the competition.

Source: Objective Analysis & Coughlin Associates, 2019, slightly rounded. Units are petabytes.

While reliability will make or break some technologies, it’s not going to be the ultimate arbiter of success. “Cost is the single absolute most important thing,” says Handy. And incumbent stickiness is still a challenge. “Some folks are using on-chip cache and external NOR flash” to address memory challenges rather than going to a new technology.

Then, of course, there’s the ultimate shakeout as marketing messages meet physical reality, for good or ill. Says Mason: “All NVM looks fantastic on paper, but the devil’s in the details. [The challenge is] making the specs shown at [conferences like] ISSCC work at PVT [corners] with 6-sigma [3.4 ppm] over 10K wafers.”

Or as Handy summarizes it, “There’s an awful lot of bravado from a lot of the companies. It’s hard to sort out what’s real and what’s hype.” The next couple of years will prove that out.
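For readers who want to see where the 3.4 ppm figure attached to 6-sigma comes from, the sketch below works through the arithmetic. It assumes the conventional Six Sigma definition, in which the process mean is allowed to drift by 1.5 sigma, so the defect rate is the one-sided normal tail beyond 4.5 sigma. The function names are illustrative.

```python
import math


def upper_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def defects_ppm(sigma_level: float, shift: float = 1.5) -> float:
    """One-sided defect rate in ppm, assuming a `shift`-sigma drift of the mean."""
    return upper_tail(sigma_level - shift) * 1e6


if __name__ == "__main__":
    for level in (3.0, 4.5, 6.0):
        print(f"{level}-sigma: {defects_ppm(level):.3g} ppm")
```

With the default shift, `defects_ppm(6.0)` evaluates to roughly 3.4, matching the bracketed figure in Mason’s quote.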
