New techniques go beyond improved deposition and etching, but challenges stack up, too.
The chip industry is pushing to quadruple the stack height of 3D NAND flash from 200 layers to 800 layers or more over the next few years. The additional capacity will help to feed the unending need for more memory of all types.
Those additional layers will add a number of incremental reliability challenges, but the NAND flash industry has been steadily increasing the stack height for nearly a decade. In 2015, Toshiba announced the first 16-die stack using through-silicon vias. That enabled higher bandwidth, reduced latency, and faster I/O, while also helping to pave the way for stacking of other types of memory and logic chips.
“Originally, NAND was scaling horizontally,” said Tae Won Kim, vice president for etch productivity and equipment intelligence at Lam Research. “But more than 10 years ago, manufacturers realized that lateral scaling itself is not going to be cost-effective, so they switched to vertical scaling.”
Stacking die opened the door to significantly higher density and much faster data access times. “The direction for 3D NAND is moving toward 500 to 1,000 layers,” said Mohan Bhan, general manager at ACM Research. “But achieving so many layers won’t come from just doing more of what we’ve been doing.”
The primary conventional processing issues relate to high-aspect-ratio (HAR) etching and deposition to ensure a consistent void-free string through all those layers. Channel height is also challenging the read current because of the increased total resistance of the polysilicon channel. Consequently, some developers are turning to a two-wafer solution using hybrid bonding, but these improvements will go only so far.
“While cutting-edge manufacturers are always looking at increasing layer count, additional scaling/stacking of layers is limited by etch budgeting and patterning challenges, among other items,” said Daniel Soden, business development manager at Brewer Science.
But the fastest way — and maybe the only way — to get to 1,000 layers will be string-stacking.
More bits please
The industry does have ways of increasing memory capacity without adding more layers. “NAND manufacturers can scale not only vertically, but also laterally and logically,” said Lam’s Kim.
Logical scaling increases the number of bits stored in a single flash cell, whereas lateral scaling reduces the pitch between cells. In addition, researchers are experimenting with splitting the column in two, doubling the number of cells in total. Various ideas are in play, but the effect will be to bring down the pitch and store more data in the same area. “Scaling the pitch of this charge trap architecture is a good way to improve capacitor density on the device without further increasing layer count,” said Brewer’s Soden.
Another capacity boost involves packing more data into a single cell. Storing multiple bits in one cell isn’t a new idea. Companies are shipping multi-level cells (MLC) with two bits per cell, triple-level cells (TLC) with three, and quad-level cells (QLC) with four. Developers are now approaching five bits per cell (penta-level cells, or PLC). The algorithms to manage such tiny differences in charge states (31 levels plus empty) are likely to be more complex, as is error correction, so performance may suffer.
Exactly how PLC is achieved and what the trap oxide consists of are currently unclear, and some research suggests that floating gates may make better PLC cells. There is even work on hex-level cells (HLC), storing six bits per cell. This is still in research, however.
SK hynix has a method of splitting the cell into two three-bit halves for a total of six bits. And seven-bit cells have been toyed with at cryogenic temperatures to reduce noise and increase read fidelity.
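The relationship between bits per cell and the number of charge states a cell must distinguish can be sketched in a few lines. This is a minimal illustration, not a model of any vendor's device: the function names are hypothetical, and the only assumption is the standard one that n bits require 2^n distinguishable states.

```python
# Bits stored per cell for the cell types named in the article.
CELL_TYPES = {"MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5, "HLC": 6}


def charge_states(bits_per_cell: int) -> int:
    """Number of distinguishable charge states needed to store the given bits."""
    return 2 ** bits_per_cell


def capacity_gain(old_bits: int, new_bits: int) -> float:
    """Fractional capacity gain from storing more bits in the same number of cells."""
    return new_bits / old_bits - 1.0


for name, bits in CELL_TYPES.items():
    states = charge_states(bits)
    # One state is the empty (erased) cell; the rest are programmed levels.
    print(f"{name}: {bits} bits per cell, {states - 1} levels plus empty")

print(f"QLC to PLC capacity gain: {capacity_gain(4, 5):.0%}")
```

Each added bit doubles the number of states that must fit in the same voltage window, while the capacity gain from each added bit shrinks. That is why sensing and error correction get harder faster than capacity grows.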
Keeping uniform layers
The fundamental benefit of stacking 3D NAND is that you get hundreds of layers using a single lithography step to pattern all of them. The downside is that drilling holes through them becomes harder, particularly with high aspect ratios nearing 100:1.
It might seem beneficial to make each layer thinner to add layers without making the stack too much higher. “The layer thickness is in the range of 150 to 100 angstroms,” said Bhan. But that thinning of the word-line layers would make them more resistive, hurting performance. Some researchers are exploring replacing the tungsten metal with ruthenium or molybdenum, which offer lower resistance. But for product development, layer thickness is staying put for now.
It’s not just the etching that’s a challenge. Adding extra layers while maintaining good planarity is harder. Minor errors that might previously have been forgivable now accumulate, making them too big to ignore at the top of a taller stack.
The stack consists of alternating layers of SiO2 and Si3N4 initially, although the nitride will eventually be removed and replaced with gate metal. At each generation, as the stack has grown, focus has remained on keeping the layers as uniform as possible. Slight errors can be tolerated, but they tend to multiply as the stack grows, meaning that each generation must work harder to improve planarity.
Fig. 1: Poor planarity and uniformity in a 3D NAND stack. Source: ACM Research
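A simplified way to see why small per-layer errors matter more in taller stacks is to add them up. The sketch below is an illustration under stated assumptions, not a process model: the function name is hypothetical, the per-layer error of one unit is an arbitrary example value, and real stacks mix systematic and random contributions.

```python
import math


def accumulated_error(per_layer_error: float, layers: int) -> tuple[float, float]:
    """Return (systematic, random) cumulative thickness error for a stack.

    systematic: the same bias repeats in every layer, so it adds linearly.
    random: independent zero-mean errors add in quadrature, so the total
            grows with the square root of the layer count.
    """
    systematic = per_layer_error * layers
    random_rss = per_layer_error * math.sqrt(layers)
    return systematic, random_rss


# Compare the stack heights mentioned in the article (200 vs. 800 layers),
# using an assumed per-layer error of 1 arbitrary unit.
for layers in (200, 800):
    systematic, random_rss = accumulated_error(1.0, layers)
    print(f"{layers} layers: systematic {systematic:.0f}, random {random_rss:.1f}")
```

In this model, quadrupling the layer count quadruples a repeated bias but only doubles the accumulated random error, which suggests that repeatable, tool-induced non-uniformity is the more damaging kind as stacks grow.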
Rotating a wafer during deposition is a technique ACM Research employs to improve planarity. As deposition progresses, the company’s equipment periodically lifts the wafer and rotates it 180°, somewhat like football teams reversing direction each quarter. “The requirement for wafer rotation during the deposition process and the uniformity of the whole process is going to be very important,” said Bhan.
To do this, the rotating chuck lifts the wafer off the platen, turns it, and lays it back down. That platen is heated, so the rotation must occur quickly to maintain the wafer temperature. But the wafer cannot be turned continuously (slowly) during deposition since the platen is fixed. “We rotate the wafer [periodically] to ensure that the deposition is more uniform,” explained Bhan. “We have made quite a lot of progress, bringing uniformity within 1%.”
The company also controls the deposition pressure to compensate for tensile stress in Si3N4 and compressive stress in SiO2.
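The effect of the periodic 180° rotation described above can be illustrated with a toy model. Everything here is an assumption made for illustration: the deposition rate is taken to vary linearly across the chamber (a 5% tilt), the function names are hypothetical, and the model does not represent ACM Research's equipment or its reported uniformity figure.

```python
def deposit(positions, tilt, steps, rotate_every=None):
    """Accumulate thickness at each wafer position over equal deposition steps.

    Assumed rate model: rate = 1 + tilt * x, where x is the position in
    chamber coordinates. If rotate_every is set, the wafer is turned 180
    degrees after every `rotate_every` steps, which maps wafer position x
    to chamber position -x.
    """
    thickness = [0.0] * len(positions)
    flipped = False
    for step in range(steps):
        if rotate_every and step > 0 and step % rotate_every == 0:
            flipped = not flipped
        for i, x in enumerate(positions):
            chamber_x = -x if flipped else x
            thickness[i] += 1.0 + tilt * chamber_x
    return thickness


def nonuniformity(thickness):
    """Half-range divided by mean, a common way to express film uniformity."""
    mean = sum(thickness) / len(thickness)
    return (max(thickness) - min(thickness)) / (2 * mean)


positions = [-1.0, -0.5, 0.0, 0.5, 1.0]  # normalized positions across the wafer
fixed = deposit(positions, tilt=0.05, steps=4)
rotated = deposit(positions, tilt=0.05, steps=4, rotate_every=1)
print(f"no rotation:   {nonuniformity(fixed):.2%}")
print(f"with rotation: {nonuniformity(rotated):.2%}")
```

In this idealized case the tilt cancels completely because each point spends equal time on both sides of the gradient. A real chamber also has non-linear and radially symmetric variation that a 180° turn cannot remove.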
Stacking issues
As the number of layers in the stack increases, so do the potential problems. “Physical and thermal stress from the higher stack height can create additional challenges for lithography and other downstream processes,” noted Brewer’s Soden.
This is particularly evident in the etching process. What must be a straight, uniform column may be distorted by differing lateral etch rates at different layers, differences in critical dimensions between the top and the bottom, incomplete etching, and even migration or twisting of the column off center.
Fig. 2: Etching holes for the channels can also experience challenges that rise as the stack grows taller. Source: ACM Research
The etch process must be incredibly uniform and must balance tradeoffs to ensure that productivity isn’t compromised. “If we really want to enable vertical and lateral scaling at the same time, we not only increase the etch rate, but also improve profile control,” said Lam’s Kim.
Effective etching requires that the hardmasks defining the pattern atop the stack maintain good fidelity. “More robust pattern transfer solutions are being investigated, such as thicker hardmasks and more intrinsically resistant materials,” said Soden. The predominant material employed today is called alpha-carbon (a brand of amorphous carbon), and it’s very hard. It’s deposited through chemical vapor deposition (CVD). Brewer Science has launched a material that it believes can be just as effective, but it can be spun on, which simplifies the process.
“The density and hardness of this [alpha-carbon] are its strong suits, comparable to diamond and very robust for etch processes,” said Soden. “Replacing this material and process with a spin-on material could bring additional flexibility, higher throughput, better gap fill, and other properties that can be beneficial to a variety of devices and segments of our industry.”
Once the columns are etched, they must be cleaned and dried, and this also becomes more difficult. “Once you’ve done that HAR etch to land down in the bottom, you will have residues in there,” said Sally-Ann Henry, chief technologist at ACM Research. “And the problem is, it’s a very deep aspect ratio. Our [ultrasonic solution] can help get the liquid into there, but then how do you get liquid out? You can probably get water in and out, but drying is a big issue.”
Techniques for improving these steps include the use of ultrasonic agitation to encourage the cleaning materials into every corner of the column, and supercritical CO2 to dry it out after cleaning. The supercritical phase of CO2 occurs at high temperature and pressure, giving the material properties of both a gas and liquid. To assist, isopropyl alcohol can help both with pattern stabilization prior to cleaning and with flushing the chamber after cleaning is completed.
A single-crystal channel
When fully constructed and filled, each array column forms what’s called the Macaroni structure: a concentric arrangement with the outside consisting of the trap oxide followed by the channel material and then inert filler oxide in the middle. The trap oxide is where the charge is stored for each cell. The channel becomes the bit line or string, carrying current all the way up to the bit-line contact. The purpose of the filler portion is simply to make the channel narrower, improving gate control.
Fig. 3: The Macaroni structure for 3D NAND. Charge is stored in the trap oxide, and the channel forms the bit line. The filler’s purpose is simply to narrow the channel for improved gate control. Source: Bryon Moyer/Semiconductor Engineering
The channel itself is typically polysilicon, which is somewhat resistive owing to the many grain boundaries along the column. Although it’s worked well enough for the current generations of flash, as the stack gets taller, it becomes harder to sustain read current all the way to the contact. For that reason, some companies have come up with ways to generate a single-crystal channel. One approach grows silicon up from the bottom. Another crystallizes polysilicon from the top.
Applied Materials noted the prior experimental use of selective epitaxial growth to create a single-crystal channel. But to protect the CMOS thermal budget during processing, that growth was done at 810°C, resulting in growth that was too slow for volume manufacturing. The company can achieve a growth rate of more than 400nm/min at temperatures in the range of 900°C to 1,100°C. While this might pose an issue for traditional 3D NAND processing, a newly proposed technique would allow it: building the memory cells and logic on different wafers and hybrid-bonding them together.
A configuration called CMOS below (or under) array, or CBA/CUA, places the cell array on one wafer and the rest of the CMOS circuitry on another. The two come together using hybrid bonding. Because of the face-to-face nature of the bond, the array and staircases are now upside down, and contacts can be much shorter, which is a benefit in its own right.
Fig. 4: CMOS below array configuration. The cell structures are built on one wafer, inverted, and then hybrid-bonded to the wafer containing the CMOS circuitry, shortening connections and allowing the array wafer to employ higher-temperature process steps. Source: Bryon Moyer/Semiconductor Engineering
But for the purposes of epi growth, this allows the array wafer to grow epi at higher temperatures than the CMOS can tolerate, providing one way of making a single-crystal channel. One resulting change, however, is that the filler oxide disappears as the channel occupies the entire middle of the cylinder. That results in reduced gate control as a tradeoff. The improved single-crystal channel performance would need to have a greater positive impact to make that tradeoff worthwhile.
The two-wafer technique is also much more expensive. But it’s being developed independently of the epi-growth effort to free up the array for any other CMOS-unfriendly processing. It also requires twice the wafers for the same number of flash chips. That’s a concern for cost, wafer demand, and the environment.
For this application, silicon in the carrier wafer for the array isn’t consumed. All the useful layers are deposited on top of that wafer. The typical approach after bonding the two wafers is to grind or etch the carrier wafer away, wasting that silicon and adding cost. Efforts are underway to reclaim the carrier wafer instead, and to see what kinds of techniques can rehabilitate its surface so that it is as effective as a fresh one.
A top-down approach
A different way of creating such a channel doesn’t require two wafers. Instead, the channels are filled with polysilicon, as is done traditionally. However, prior to annealing, nickel silicide is deposited over the channels. During the annealing process, that silicide floats down from the top to the bottom, catalyzing crystallization along the way. When it reaches the bottom, everything above it is a single crystal. The silicide remains at the bottom, but the bit-line contact is at the top, so it shouldn’t cause a problem (assuming it remains in place).
Fig. 5: Using nickel silicide to crystallize the channel. The material migrates down the channel during annealing, crystallizing the polysilicon along the way. Source: Bryon Moyer/Semiconductor Engineering
Stack and repeat
A final twist on adding layers provides something of an end-around, both physically and geopolitically, to the plodding progress made processing ever deeper holes. The kinds of improvements discussed above help raise capacity, but only by so much.
“These kinds of solutions may be nearing the limit as layers reach 250-plus,” noted Soden. “Stepwise approaches are being implemented to break pattern and etch processing into different modules as a way to reduce extreme HAR etch, introducing bare silicon between layers and connecting through a via approach.”
Sometimes called string stacking, the idea is to build a manageable set of layers and, rather than trying to make that stack higher, simply duplicate the stacks atop each other with a layer of silicon in between each stack. The result can be a combined stack of many more layers without all the extended HAR issues. “This solution is what is driving many companies to push to as many as 1,000 layers long term,” said Soden.
Fig. 6: String stacking. Each set of layers independently goes through the normal process. Stacking the independent strings allows a greater number of layers without having to process the full stack in one step. The tradeoff is that multiple steps are necessary. Source: Bryon Moyer/Semiconductor Engineering
The engineering end-around is that one can get, say, 1,000 layers without having to process them together. Instead, one can process, say, 250 layers and then stack four of those modules with intervening silicon layers. The tradeoff is that four lithography steps are needed instead of one, but that may be a suitable tradeoff. No one appears to be discussing trying to process 1,000 layers the old-fashioned way.
It’s not quite as easy as it sounds, because the second tier will be placed atop the first one, not atop a pristine flat wafer. The third tier will have to work atop whatever irregularities accumulate on the second tier. It may well be that each tier will require a separate development effort to ensure sufficient planarity.
The other challenge is that somehow the strings in each tier must be connected to form one long string. The simple answer is to place a via in the silicon separation layer, but aligning each tier above the prior one with great precision isn’t obvious — especially since the silicon layer will block the columns below from visibility.
From a geopolitical standpoint, export rules restrict stacks of more than 128 layers. So countries subject to those restrictions get an end-around by simply stacking 128-layer modules. If YMTC, for example, which was first to release string-stacked products, is to achieve 1,000 layers, it’s likely to do so using 10 stacks of 100 layers each.
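The tier arithmetic behind these examples is simple enough to sketch. The helper below is hypothetical and only restates the article's numbers: it assumes every tier has the same layer count and, as described above, that each tier needs its own lithography pass.

```python
def tiers_needed(target_layers: int, layers_per_tier: int) -> int:
    """Smallest number of equal tiers whose combined layers reach the target."""
    # Ceiling division using integers only.
    return -(-target_layers // layers_per_tier)


# Module sizes taken from the examples in the text.
for per_tier in (250, 128, 100):
    tiers = tiers_needed(1000, per_tier)
    total = tiers * per_tier
    print(f"{per_tier}-layer modules: {tiers} tiers, {total} layers, "
          f"{tiers} lithography passes")
```

With 128-layer modules, eight tiers are enough to pass 1,000 layers (1,024 in total), compared with four tiers of 250 layers or ten tiers of 100. Smaller modules ease the HAR etch but multiply the number of tiers that must be aligned and planarized.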
A couple of years to sort this out
NAND flash improvements involve many moving parts. Efforts to improve HAR processing will continue, but they’re not where the big gains are. On paper, PLC technology provides an immediate 25% capacity boost over QLC by moving from four bits per cell to five. Cell architecture changes and reduced pitch can help yet more.
The biggest changes are the major architectural shifts moving to a two-wafer solution and stacking strings. They can come in addition to the other capacity bumps. Products with both technologies are available today, although not at 1,000 layers. Reduced CBA costs are necessary for proliferation, and work is necessary to scale the number of stack tiers.
Exactly what the mainstream configurations will look like isn’t clear yet, but one way or another, much larger NAND flash chips are coming to slake the unending thirst the industry has for storage.