Challenges continue to mount, but most of them appear surmountable with enough investment.
Memory vendors are racing to add more layers to 3D NAND in a competitive market driven by the explosion in data and the need for higher-capacity solid-state drives with faster access times.
Micron already is filling orders for 232-layer NAND and, not to be outdone, SK Hynix announced that it will begin volume manufacturing of 238-layer 512Gb triple-level cell (TLC) 4D NAND in the first half of next year. Perhaps even more significant, chipmakers say privately that they will leverage the industry's learning about stacking NAND for 3D-ICs, which are currently in development.
“Moore’s Law for processors has arguably been lagging the last few years, but is alive and well for NAND flash,” said Ben Whitehead, technical product manager at Siemens EDA. “It’s a good thing, because modern compute and networking have an insatiable appetite for fast storage.”
SK Hynix introduced its 4D nomenclature in 2018 with 96-layer NAND. Despite the name, the company did not create its product in four-dimensional space or mimic a tesseract. But the term isn't entirely a marketing gimmick, either. It's a tradename for a variant of the 3D architecture.
“For DRAM, it took something like 10 or 15 years of R&D to come to fruition, but for 3D NAND, the development was extremely fast. It’s astonishing when you think of the usual pace of development,” said Xi-Wei Lin, director for R&D at Synopsys. “Besides the technology itself, it’s a killer app. Apple was first to put in flash memory to store data. Today, we buy iPhones still based on how much memory, and it’s all flash. From there, big data, AI, and also analytics require high-performance computation. Flash memory is filling this critical gap in latency between the hard disk drive and RAM. You can see the applications, especially in the data center, analytics, and gaming, because of the power, form factor, and the density cost.”
Evolution and revolution
Looking back, 2D NAND had a planar architecture, with the floating gate (FG) cells and the peripheral circuitry sitting next to each other. In 2007, as 2D NAND was reaching its scaling limit, Toshiba proposed a 3D NAND structure. [1]
Samsung was first to market with what it called “V-NAND” in 2013.
The 3D design introduced alternating layers of polysilicon and silicon dioxide and swapped the floating gate for charge trap flash (CTF). The distinctions are both technical and economic. FGs store charge in a conducting layer, while CTFs “trap” charge within a dielectric layer. The CTF design quickly became the preferred approach because of its manufacturing cost reductions, though it is not the only one in use.
“Although all manufacturers moved to charge trap cell architectures, I expect that the traditional floating gate cell will still play a non-negligible role in the future, especially for capacity- or retention-sensitive use cases,” noted IBM researcher Roman Pletka.
Nevertheless, despite the innovation of skyscraper-like stacking, first-generation 3D NAND designs kept the peripheral circuitry off to the side, according to SK Hynix.
Eventually, 3D NAND vendors moved the peripheral circuitry under the CTF cells, an arrangement SK Hynix calls Periphery Under Cell (PUC). On one hand, it’s a lot shorter and cooler to say “4D NAND” than CTF/PUC NAND. On the other, ultimately this is another variation of 3D NAND, with a smaller die area per cell. Similar small-footprint designs go by other tradenames, such as CMOS under Array (CuA) from Micron.
Fig. 1: SK Hynix’s explanation for 4D NAND. Source: SK Hynix global newsroom.
Fig. 2: Peripheral circuitry is an underlayer in 4D NAND. Source: SK Hynix global newsroom.
Micron itself scored bragging rights in late July 2022 with the announcement of its 232-layer NAND, which is now in production. In its news release, Micron called the device a watershed moment for storage innovation and the first proof of the capability to scale 3D NAND to more than 200 layers in production.
“The main thing adding those layers does is add capacity, because everyone is looking for more capacity in their SSDs,” said Marc Greenberg, group director, product marketing at Cadence. “Adding more layers basically means that you’ve got more gigabytes that you can store in a single package, on a single assembly of multilayer 3D NAND. It’s a capacity play, to add all those layers and the technology behind it.”
Micron also claims the industry’s fastest NAND I/O speed ‒ 2.4 Gbps ‒ along with up to 100% higher write bandwidth and more than 75% higher read bandwidth per die than the prior generation. In addition, the 232-layer device is a six-plane TLC NAND, which Micron said is the most planes per die of any TLC flash, with independent read capability in each plane.
According to industry analysts, this may be the most impressive part of the announcement. Because of the six planes, the chip can behave as if it were six different chips.
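As a rough illustration of why that matters, consider ideal random reads. With one plane, requests queue behind one another; with six independent planes, reads to different planes overlap. The Python sketch below is a back-of-the-envelope model, and the 50µs page-read time is a hypothetical figure chosen for illustration, not a Micron specification.

    # Minimal model: N independent planes behave like N dies for random reads.
    def random_read_iops(planes: int, t_read_us: float) -> float:
        """Ideal random-read IOPS, assuming requests are spread evenly
        across planes and each plane services one read at a time."""
        return planes * (1_000_000 / t_read_us)

    single = random_read_iops(1, 50.0)  # ~20,000 IOPS
    six = random_read_iops(6, 50.0)     # ~120,000 IOPS, like six separate chips
    print(f"1 plane: {single:,.0f} IOPS; 6 planes: {six:,.0f} IOPS")

In practice, plane-address conflicts and controller overhead keep the gain below the ideal 6x, but the scaling direction is the point.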
Fig. 3: Micron’s 232-Layer NAND. Source: Micron
China’s Yangtze Memory Technologies Co. (YMTC) also announced a 232-layer 3D NAND module. When that will enter volume production is unknown.
Manufacturing: Advantages and challenges
At last year’s IEEE IEDM forum, Samsung’s Kinam Kim gave a keynote in which he predicted 1,000-layer flash by 2030. That might sound head-spinning, but it’s not total science fiction. “That’s already slowed with respect to what has been the historical trend line of NAND flash,” observed Maarten Rosmeulen, program director, storage memory at Imec. “If you look at the others, like Micron or Western Digital, what they put forward in their public announcements, they are even slower than that. There is also a little discrepancy between the different manufacturers — it seems like they’re stretching out the roadmap, letting it slow down. We believe this is because of the very high investments that are needed to keep the space going.”
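In round numbers, taking Micron’s 232 layers in 2022 as the baseline (an assumption made here for illustration), a 1,000-layer device by 2030 implies a compound layer-count growth of

    \left(\frac{1000}{232}\right)^{1/8} \approx 1.20

or roughly 20% more layers per year, consistent with Rosmeulen’s point that the roadmap has slowed relative to the historical trend line.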
Still, the competitive stakes are high enough that those investments are inevitable. “The main way forward, the main multiplier, is adding more layers to the stack,” Rosmeulen said. “There is very little room to do an XY shrink and make the memory hole smaller. That’s very difficult to do. Maybe they will squeeze a few percent here or there, put the holes closer together, have fewer slits in between the holes and these kinds of things. But that is not where the big gains are. The density can only go forward significantly at the current pace if you can keep on stacking more layers on top of each other.”
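Rosmeulen’s arithmetic can be captured in a first-order density model, a sketch that ignores staircase, slit, and peripheral overhead:

    \text{bit density} \approx \frac{N_{\text{layers}} \cdot b}{p_x \, p_y}

where N_layers is the layer count, b is the bits stored per cell (3 for TLC), and p_x and p_y are the memory-hole pitches. With the pitch terms shrinking only a few percent per generation, the layer count is the one factor with real headroom.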
Fig. 4: 3D Steps in NAND Manufacture. Source: Objective Analysis
Further stacking seems reasonable, except for the unavoidable problem at the heart of the whole process.
“The main challenges are in etch, because you have to etch very deep holes with a very high aspect ratio,” said Rosmeulen. “If you look at the previous generation with 128 layers, this was a hole about 6, 7, or 8 micrometers deep with a diameter of only about 120 nanometers, an extremely high aspect ratio — or maybe a little bit higher, but not that much. There are advances in the etch technology to etch deeper holes in one go, but it won’t go faster. You can’t increase the speed of the etch. So if the process flow gets dominated by deposition and etch, and those process steps don’t increase in cost efficiency, then adding more layers is no longer as efficient at reducing the cost.”
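Putting numbers on Rosmeulen’s figures, a hole 6 to 8 micrometers deep at a 120nm diameter works out to an aspect ratio of roughly

    \mathrm{AR} = \frac{\text{depth}}{\text{diameter}} = \frac{6{,}000\ \text{nm}}{120\ \text{nm}} \approx 50 \quad \text{to} \quad \frac{8{,}000\ \text{nm}}{120\ \text{nm}} \approx 67

and every added deck deepens the hole, while the diameter has almost no room to grow.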
Etch is only one of multiple steps, too. “Besides the etch, you also need to fill this hole with a very thin dielectric layer, uniformly up and down,” said Synopsys’ Lin. “Typically, depositing a layer of a few nanometers is not easy because of the chemistry of the wafer. Here, they have to go all the way down to be able to fill. There are atomic layer deposition methods, but it’s still challenging. Another big challenge is stress. If you build up so many layers that go through some etch/deposition/cleaning/thermal cycle, that can cause stress locally and globally. Locally, because after you drill a hole, you need to cut a very deep trench through the full stack. It becomes a really high-aspect-ratio skyscraper, which is wobbly. And if you start going through some washing or other processes, a lot of things can happen to cause two skyscrapers to collapse against each other. So then you’ve lost the yield. And by putting so many materials on top of each other, and cutting different patterns, this can create global stress and cause a wafer to warp, which will make it impossible to handle in the fab, because a wafer has to be flat. And that’s just for starters. Remember that the etch is going through layers of different materials.”
Samsung’s solution was to create extremely thin layers, said Jim Handy, principal analyst at Objective Analysis. “That’s been useful to the industry as a whole, because everybody uses pretty much the same tools to create these things.”
Making it work better
There’s also a functional challenge inherent in the fundamental concept of flash. “There’s increased reliance on more and more powerful error correction algorithms to work with those devices,” said Cadence’s Greenberg.
The issue is that there’s not a lot of intelligence built into NAND flash devices. “Typically, in an SSD, that happens on the controller side,” Greenberg explained. “The controller is sending commands to the NAND flash device, and the NAND flash device responds, but it doesn’t have a lot of intelligence about that. It’s just responding to a request, say for a block of data at a particular address, and the NAND flash device will simply respond with that block of data. But on the controller side, you have to first do error correction on the data being received, then determine if there’s an unacceptable number of errors in that block, and then make a decision about how to remap that block out of the address space and put a different block in its place. All of those decisions happen over on the controller side.”
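Here is a minimal sketch of that division of labor, in Python. The class names, the ECC decode stub, and the retirement threshold are all illustrative, not any real controller’s or NAND vendor’s API.

    RETIRE_THRESHOLD = 40  # hypothetical correctable-bit-error budget per block

    class NandDevice:
        """The 'dumb' side: stores and returns raw pages, no error handling."""
        def __init__(self):
            self.blocks = {}
        def read_raw(self, phys):
            return self.blocks.get(phys, b"\x00" * 4096)
        def program(self, phys, data):
            self.blocks[phys] = data

    class Controller:
        """The 'smart' side: ECC decode, error counting, block retirement."""
        def __init__(self, nand, mapping, spare_blocks):
            self.nand = nand
            self.map = dict(mapping)      # logical block -> physical block (the FTL)
            self.spares = list(spare_blocks)
            self.retired = set()
        def decode_ecc(self, raw):
            # Stand-in for a real LDPC/BCH decoder; returns (data, corrected_bits).
            return raw, 0
        def read(self, lba):
            phys = self.map[lba]
            data, errs = self.decode_ecc(self.nand.read_raw(phys))
            if errs > RETIRE_THRESHOLD:   # too many corrections: block is wearing out
                spare = self.spares.pop()
                self.nand.program(spare, data)  # rescue the data to a fresh block
                self.map[lba] = spare           # remap it out of the address space
                self.retired.add(phys)
            return data

    ctrl = Controller(NandDevice(), mapping={0: 7}, spare_blocks=[100, 101])
    ctrl.nand.program(7, b"hello")
    print(ctrl.read(0))  # b'hello'; retirement only triggers on high error counts

The NAND device never sees any of this logic. It simply answers reads and programs, which is exactly the split Greenberg describes.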
Still, a world built of nanoscale skyscrapers puts new emphasis on components such as ONFI controllers and ONFI PHYs, and presents new challenges for designers.
“The number of layers memory fabs can produce dramatically complicates design verification problems for controllers that interface with those memories, and they might not be so obvious. The SSD controller has to deal with even more channels to the memory. Connecting many pipes with an increasingly fast (but never fast enough) host interface creates bottlenecks in very unexpected places,” said Siemens’ Whitehead. “Another design verification challenge is power. It has long been a lower priority for most storage controllers, but it has now shifted to being a critical feature. Moving to smaller geometry nodes helps some, but is expensive. Business models are intolerant of re-spins, not to mention the supply chain difficulty of getting in the queue. Time-to-market delays get a lot of visibility with upper management. There are even more growth drivers for storage, which require us to rethink how we verify designs. AI accelerators require much larger storage controllers, which may quickly consume your emulation and prototyping capabilities. Edge intelligence requires orders of magnitude more complex design verification. In-memory computing, like CSD, requires testing new processor combinations that mix RTOS and HTOS with previously unseen workloads.”
This is one of the reasons there is so much focus on verification IP.
“The automation with this IP can rapidly generate testbenches to get design and verification teams up and running in minutes,” said Joe Hupcey, ICVS product manager for Siemens Digital Industries Software. “This level of productivity enables architectural exploration of the whole design to give early confidence in the chosen tradeoffs. In parallel, it also sets up the framework for automated tracking of metrics — like code, functional, and scenario coverage — to enable teams to measure their progress and have the data needed to make a sign-off decision. And finally, building on our expertise with CXL/PCIe protocols, we see emerging standards like Universal Chiplet Interconnect Express (UCIe) playing a critical role in enabling teams to collaborate to rapidly design and verify these massively scalable memory modules.”
Additionally, Imec is exploring potential new structures for 3D NAND. It has demonstrated what it calls “trench architecture,” a design variant in which the memory cells are part of the sidewall of a trench, with two transistors at opposite ends of the trench. Jan Van Houdt, program director of ferroelectrics at Imec, explained its value: “The 3D trench architecture has the potential of double density as compared to the currently used gate-all-around (or cylindrical) architecture.”
However, he went on to point out a few drawbacks. “There are two challenging high-aspect-ratio etch steps instead of one, as well as a lower electric field in the tunnel oxide in the case of flash. The second drawback is not there when using ferroelectric FETs, which makes the trench version more appealing for ferro than for flash.” The design is still in the prototype phase.
Conclusion
In 2016, experts noted that due to technical issues, 3D NAND might run out of steam at or near 300 layers. That forecast appears to have been supplanted by today’s cautious optimism.
“[After SK Hynix’s 238 layers] I expect to see an increase in the number of layers over the next few years at roughly the same speed,” said IBM’s Pletka. “However, increasing the number of layers is challenged from the technology point of view due to the high-aspect-ratio etching process, but also by CapEx, because the time to manufacture a chip increases with the layer count. This is why we will see new scaling directions: making thinner layers, lateral scaling such as denser placement of the vertical holes, the use of more efficient layouts such as shared bitlines, and logical scaling (e.g., using split-gate architectures or storing more bits per cell). With these technologies, it is expected that the storage density of NAND flash will keep growing at a similar rate for at least the next 5 to 10 years.”
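The bits-per-cell lever in that list is easy to quantify. Moving from TLC (3 bits per cell) to QLC (4 bits per cell), for example, raises density by

    \frac{4}{3} \approx 1.33

a one-time gain of about 33% with no additional layers, traded against tighter voltage margins and lower endurance.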
Others agree. “There’s no physical limit where people can say we can’t go past this number of layers,” said Objective Analysis’ Handy. “In the world of semiconductors, there are always people saying we can’t do this. We can’t do lithography below 20 nanometers. Now, they’re looking at 1 nanometer. Samsung talked about 1,000 layers. It could be that in 20 years we’ll be laughing that we once thought that was a lot.”
Reference
[1] H. Tanaka et al., “Bit Cost Scalable Technology with Punch and Plug Process for Ultra High Density Flash Memory,” 2007 Symposium on VLSI Technology, Digest of Technical Papers, 2007.
Related stories
3D NAND’s Vertical Scaling Race
More competition, business uncertainty, and much more difficult manufacturing processes.
How To Make 3D NAND
Foundries progress with complex combination of high-aspect ratio etch, metal deposition and string stacking.
3D NAND: Scenarios For Scaling & Stacking
A new research paper titled “Impact of Stacking-Up and Scaling-Down Bit Cells in 3D NAND on Their Threshold Voltages” was published by researchers at Sungkyunkwan University and Korea University.