Gate-all-around is set to replace finFET, but it brings its own set of challenges and unknowns.
While only 12 years old, finFETs are reaching the end of the line. They are being supplanted by gate-all-around (GAA), starting at 3nm [1], which is expected to have a significant impact on how chips are designed.
GAAs come in two main flavors today — nanosheets and nanowires. There is much confusion about nanosheets, and about the difference between nanosheets and nanowires. The industry still does not know much about these devices, or how significant some of the issues will be long-term. As with any new device, the first generation is a learning vehicle, and improvements are made over time.
Why are we making this change? “If finFET pitch could continue scaling, people would have stayed with finFET,” says Julien Ryckaert, vice president of R&D at imec. “The problem is finFET cannot scale simply because you need to plug the gates, work function stack, in between two fins. By the nature of how these devices are constructed, you’re forced to separate two fins by 15 to 20 nanometers. So you have this cliff. Because of that quantization, if you keep scaling your standard cell by 1 nanometer, you reduce your active area by 1 nanometer, and this can induce a whole fin to disappear. That’s the moment where people said, ‘We need to find a solution.'”
Fig. 1: Planar transistors vs. finFETs vs. gate-all-around. Source: Lam Research
Gate-all-around (GAA) is similar to finFET. “FinFETs turned the planar transistor on its side (see figure 1), so that the fin height became the width of the equivalent planar transistor,” says Robert Mears, CTO for Atomera. “Since processing constraints fixed the fin height, the transistor width could only be varied in discrete amounts by using additional fins. GAA returns to a planar geometry, but now with vertically stacked planar nanosheets. In principle, therefore, the width can be continuously varied.”
That is unlikely to happen. “There will be more flexibility in adjusting the active width because it is a planar structure, and you could theoretically vary the sheet width continuously,” says imec’s Ryckaert. “However, it is highly likely that foundries will restrict the designer’s ability to play with arbitrary nanosheet widths and they will force restrictions.”
That is most likely because of the time and difficulty creating models. “Every device size must be individually characterized, qualified, and modeled, increasing the costs of developing the PDK,” says Atomera’s Mears. “At the library level, we can expect better optimization of logic and SRAM using width as an additional variable to optimize power-performance tradeoffs.”
Variability drives GAA
But the biggest driver for the move to GAA is variability, which is a key factor in yield and performance.
“Say you have technology A (see figure 2), where you have a certain distribution of transistor strength, as measured by the drive current of the transistor,” says Victor Moroz, fellow in the TCAD product group of Synopsys. “There’s some nominal behavior and some distribution. A billion transistors on the chip cannot be the same. Some are slightly off. Usually, it’s something like a Gaussian distribution. What’s important to the circuit designer is not the nominal behavior, but the process corners, which is something like nominal minus three sigma. Imagine you have another technology B, which has better nominal performance, but has wider variability. If it’s considerably wider, it might be that the designers are forced to design to this process corner, and then having better nominal performance is useless. GAA technology is a way to control or maybe even reduce variability.”
Fig. 2: The impact of variability. Source: Synopsys
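To see why the corner matters more than the nominal value, consider a minimal numeric sketch of the two hypothetical technologies Moroz describes. The drive-current numbers below are purely illustrative, not foundry data.

```python
# Illustrative only: compare the 3-sigma slow corner of two hypothetical
# technologies. Units are arbitrary drive-current units, not foundry data.

def slow_corner(nominal, sigma, n_sigma=3):
    """Worst-case (slow) corner: nominal drive minus n_sigma standard deviations."""
    return nominal - n_sigma * sigma

tech_a = {"nominal": 100.0, "sigma": 3.0}   # modest nominal, tight distribution
tech_b = {"nominal": 110.0, "sigma": 8.0}   # better nominal, wider distribution

for name, t in (("A", tech_a), ("B", tech_b)):
    print(f"Tech {name}: nominal={t['nominal']:.0f}, "
          f"3-sigma corner={slow_corner(t['nominal'], t['sigma']):.0f}")

# Tech A: nominal=100, 3-sigma corner=91
# Tech B: nominal=110, 3-sigma corner=86 -> designers must design to the worse corner
```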
As finFETs get smaller, variability increases. “When a finFET goes to one fin, variability can become extremely problematic,” says Ryckaert. “There are good signs that the mechanisms responsible for variability may be better controlled in a nanosheet. One of the big issues with finFET is the fin profile, which can induce quite large variability at the foot of the fin. With nanosheets, because you’re starting from a pre-defined superlattice with epitaxial growth, these stacks are controlled by atoms. The thickness of the nanosheet is controlled to the atom, so your sheet thickness, which is a very important source of variation, will have better control.”
Nanosheet versus nanowire
These terms are often used almost interchangeably, but they are not the same thing. “A nanowire was an idea of having full control of the channel, by having the gate wrapping around a circular silicon channel,” says Ryckaert. “That’s where you would get the best electrostatics, the best channel control.”
But it is a tradeoff. “While the nanowire does indeed improve short-channel control, it degrades drive current due to its small geometry, typically of the order of 5nm by 5nm,” says Mears. “The nanosheet structure is part-way between a finFET and a nanowire. The height of the sheet is again about 5nm, but the width is much larger and can be varied continuously. Gate electrostatic control is better than the finFET but worse than a nanowire, because while the nanosheet’s gate does surround all four sides (hence the term “gate-all-around”), its larger width causes less gate control on the edge. On the other hand, the drive current of the nanosheet is much improved compared to both. Current GAA structures should be described as nanosheets rather than nanowires.”
SRAMs push for the compromise. “The nanosheet thickness is something like 5nm and the width is something like 20 or 30 nm,” says Synopsys’ Moroz. “That would be typical for logic. But for SRAM there is no room to have a wide channel, so for SRAM the channel width is going to be 10 nm or less, which would pretty much be nanowire.”
Now you have to deal with the consequences. “The nanowire is better for electrostatics, but the perimeter of that circle is super small,” says Ryckaert. “You need to build that entire gate, and this big source-drain around it, which is going to introduce as many parasitics as you would have in a slab, but for a very poor drive. You’re just going to have a lot of parasitics for a very small current. The nanosheet is a very bad idea for SRAM simply because of geometries. The footprint of the fin is five nanometers. The nanosheet forces the width to be 15 nanometer or 20 nanometer, so that’s just real estate you’re consuming, meaning your SRAM can’t scale with the nanosheet.”
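A rough way to quantify this tradeoff is to compare the gate perimeter, a first-order proxy for effective channel width and hence drive, of the cross-sections quoted above. The dimensions are the illustrative figures from the quotes, not any specific process.

```python
# Rough proxy: effective channel width ~ gate perimeter of the channel cross-section.
# Dimensions (nm) are the illustrative values from the text, not a specific process.

def perimeter(width_nm, height_nm):
    """Gate-all-around perimeter of a rectangular channel cross-section."""
    return 2 * (width_nm + height_nm)

nanowire = perimeter(5, 5)     # ~5nm x 5nm wire
sram_ns  = perimeter(10, 5)    # narrow sheet for SRAM (~10nm wide)
logic_ns = perimeter(30, 5)    # wide sheet for logic (~20-30nm wide)

print(f"nanowire:        {nanowire} nm of gated width")   # 20 nm
print(f"SRAM nanosheet:  {sram_ns} nm of gated width")    # 30 nm
print(f"logic nanosheet: {logic_ns} nm of gated width")   # 70 nm
# The gate stack and source/drain parasitics are similar in all three cases,
# so the nanowire pays roughly the same parasitic cost for far less drive.
```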
Variability in SRAM causes problems, too. “For logic, there is a certain depth of a circuit,” says Moroz. “Imagine that your transistors along that path are randomly varying, but because you have maybe 15 stages, there is some self-averaging going on. For SRAM, all you have is two inverters next to each other. There is a total of two NMOS and two PMOS transistors, and if they are mismatched, that’s a problem right there.”
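The self-averaging Moroz describes is purely statistical: independent random variations along a 15-stage path partially cancel, while the four transistors in an SRAM cell get no such averaging. A quick Monte Carlo sketch with arbitrary numbers illustrates the roughly 1/sqrt(N) effect.

```python
# Quick Monte Carlo: relative spread of a 15-stage logic path vs. a single device.
# Arbitrary illustrative numbers; independent Gaussian stage delays are assumed.
import random, statistics

random.seed(0)
STAGES, TRIALS = 15, 100_000
MEAN, SIGMA = 10.0, 1.0   # per-stage delay: 10 +/- 1 (arbitrary units)

path_delays = [sum(random.gauss(MEAN, SIGMA) for _ in range(STAGES))
               for _ in range(TRIALS)]

per_stage_rel = SIGMA / MEAN
path_rel = statistics.stdev(path_delays) / statistics.mean(path_delays)

print(f"per-stage relative sigma: {per_stage_rel:.1%}")   # 10.0%
print(f"15-stage path relative sigma: {path_rel:.1%}")    # ~2.6%, roughly 10%/sqrt(15)
# An SRAM cell has no such averaging: mismatch between its two NMOS and two
# PMOS devices shows up directly as a read/write margin problem.
```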
There are other problems, as well. “Dopant variability can cause significant variation in threshold voltage,” adds Mears. “Random dopant fluctuation (RDF) variability can cause significant differences between devices — even matched devices — which leads to lower SRAM performance and yield, as well as adding worst-case guard bands to the timing models for a logic device.”
How many sheets?
One additional variable in the manufacturing of GAA is the number of nanosheets. “PPAC (power, performance, area/cost) constraints will push for more layers, especially as nanosheets continue to scale,” says Mears. “For example, assuming everything else remains constant, going from 3 nanosheet layers to 4 increases performance by nearly 33%, yet die size should remain the same and wafer processing costs should only grow fractionally. GAA economics rely on stacking multiple sheets for effective density, so the pressure is definitely on to increase the number of layers.”
But this is not completely variable. “It’s very hard to believe it’s going to be limited to two, and above five will also be very difficult,” says Ryckaert. “It comes down to simple math. Just calculating capacitances and channel width will give you 90% of the answer. You also need to calculate how much surface between the source-drain and the gate you need to encapsulate around a certain silicon area. It’s the perimeter that counts; maximizing the drive and minimizing capacitance is simply the surface-to-perimeter ratio. If you compare a three-fin finFET device, there is no nanosheet structure that will beat it. But because of the quantized nature of finFETs, a one nanometer loss in cell height means one fin is gone. The nanosheet gives you the nanometer scaling that you needed for your logic scaling. Then the nanosheet will start shining compared to the finFET. That happens around three to four sheets. Five sheets would not work simply because of the resistance of the source-drain, and the resistance of the structure. You realize that the fifth sheet is just enough to drive the parasitics that you added to make the structure taller. You’re just consuming current in your own structure.”
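The sheet-count arithmetic can be sketched as a simple effective-width calculation. The sheet and fin dimensions below are illustrative assumptions, and the model deliberately ignores the source/drain resistance penalty that, as Ryckaert notes, caps the useful stack height.

```python
# Back-of-envelope effective width per footprint for stacked nanosheets vs. a
# 3-fin finFET. Dimensions are illustrative assumptions (nm); the model ignores
# the source/drain resistance that ultimately limits the stack height.

SHEET_W, SHEET_T = 30, 5   # assumed nanosheet width and thickness
FIN_H, FIN_W = 50, 5       # assumed fin height and width

def nanosheet_weff(n_sheets):
    return n_sheets * 2 * (SHEET_W + SHEET_T)

finfet_3fin = 3 * (2 * FIN_H + FIN_W)   # 315 nm of gated width

for n in range(2, 6):
    print(f"{n} sheets: W_eff = {nanosheet_weff(n)} nm")
# 2 sheets: 140 nm, 3 sheets: 210 nm, 4 sheets: 280 nm, 5 sheets: 350 nm

gain = nanosheet_weff(4) / nanosheet_weff(3) - 1
print(f"3 -> 4 sheets: {gain:.0%} more drive in the same footprint")   # ~33%
```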
It also makes little sense to vary this within a chip. “It’s not easy to vary the number of layers on the same chip,” says Moroz. “Once you decide on a certain number, that will likely apply across that chip. For high performance computing, you’re better off with four layers. For mobile you’re better off with three.”
Performance
With each node, there is a desire to reduce voltage and power. “Pressure is always on to reduce the voltage supply, and hence power, but Vt is constrained,” says Mears. “It can’t be lowered much further, because it is set by the Ioff specification and the finite sub-threshold slope (SS), which cannot be less than 60mV per decade due to thermodynamics (kT/q). There is ongoing research into novel circuit elements that would further reduce SS, such as ‘negative capacitance’ from ferroelectric gate dielectrics, but these will not hit volume manufacturing soon. Another constraint on Vdd is SRAM Vmin, which sets the lowest possible supply voltage for a given error rate. Since the embedded SRAMs are typically the first blocks to fail as voltages are lowered, Vmin often sets the minimum supply voltage.”
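The 60mV-per-decade floor follows directly from the thermal voltage, assuming an ideal (unity) subthreshold slope factor at room temperature:

```python
# The ~60 mV/decade limit is the thermal voltage times ln(10), assuming an
# ideal subthreshold slope factor of 1 at room temperature.
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K
q   = 1.602176634e-19  # elementary charge, C
T   = 300.0            # room temperature, K

ss_limit = (k_B * T / q) * math.log(10) * 1000   # mV per decade
print(f"Ideal subthreshold slope at {T:.0f} K: {ss_limit:.1f} mV/decade")  # ~59.5
```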
There will be some improvement in power. “Each subsequent technology for the past decade, and moving forward, will give you something like 20% lower switching power consumption at the same performance,” says Moroz. “Leakage is affected by variability because for leakage what’s more important is the fast corner, where the transistors are leaky. So having tighter variability helps with that.”
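Switching power scales roughly as CV²f, so a ~20% per-node reduction at the same frequency can come from fairly modest capacitance and voltage trims. The split below is one illustrative possibility, not a claim about any particular node.

```python
# Dynamic power P ~ alpha * C * V^2 * f. Purely illustrative split of where a
# ~20% per-node reduction at constant frequency might come from.
def relative_power(c_scale, v_scale, f_scale=1.0):
    return c_scale * v_scale**2 * f_scale

p = relative_power(c_scale=0.90, v_scale=0.94)   # ~10% less C, ~6% lower Vdd
print(f"relative switching power: {p:.2f} ({1 - p:.0%} reduction)")  # ~0.80, ~20%
```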
But there are unknown aspects about power. “One source of heat is self-heating, or Joule heat,” says Marc Swinnen, director of product marketing at Ansys. “With GAA you have multiple nanosheets in these gates, and they are surrounded by insulator, which is a poor thermal conductor. Device self-heat will be different, but we don’t yet have enough information to know how impactful it will be. We will eventually get these numbers from the foundry. A local source of heat can cause thermal spikes, and that can affect electromigration, which is exponentially sensitive to temperature. If locally a few transistors tend to get hotter, then there will be a different electromigration profile in the surrounding metal compared to the chip average. You cannot afford to just use the averages.”
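The exponential temperature sensitivity Swinnen mentions is commonly modeled with Black's equation for electromigration lifetime. The activation energy and temperatures below are generic textbook-style assumptions, not values for any specific process.

```python
# Black's equation: MTTF ~ A * J^(-n) * exp(Ea / (k_B * T)).
# At constant current density, a local hotspot scales lifetime by
# exp(Ea/k_B * (1/T_hot - 1/T_nom)). Ea and the temperatures are generic
# assumptions, not values for any specific process.
import math

k_B_eV = 8.617333e-5   # Boltzmann constant, eV/K
Ea = 0.9               # assumed activation energy for Cu interconnect, eV
T_nom, T_hot = 105 + 273.15, 120 + 273.15   # chip average vs. local hotspot, K

lifetime_ratio = math.exp(Ea / k_B_eV * (1 / T_hot - 1 / T_nom))
print(f"A {T_hot - T_nom:.0f} K hotter spot has ~{1 / lifetime_ratio:.1f}x shorter EM lifetime")
# ~2.9x shorter, which is why local hotspots need their own EM analysis.
```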
What comes next?
It is clear that as devices shrink, change will become the norm. “We expect to see nanosheets used for at least two nodes, but after that it’s going to get very tricky to scale the nanosheet structure,” says Ryckaert. “We have proposed the forksheet, which is an adaptation of the nanosheet concept. It has scaling properties that will enable another two nodes. Then there’s the CFET (complementary FET), which is inspired by the nanosheet, but in a stacked configuration (see figure 3).”
GAA may have a similar lifetime to finFET. “Most likely it’s going to be around for 10 years,” says Moroz. “But around 2030, I expect the industry to switch to stacked transistors where you have two GAA transistors stacked on top of each other. Some people call it CFET, complementary FET, or stacked transistors.”
Fig. 3: Logic technology roadmap. Source: Synopsys
That is when it becomes a little tougher. “After CFET we are done with 2D integrated circuits,” adds Moroz. “We expect transistor density to stop at around 0.5 billion transistors per square millimeter for logic, and for SRAM that would be 1 billion per square millimeter. And then we are stuck, because while you can squeeze transistors as much as you want, everything is going to be limited by the wires connecting transistors together. The only way forward would be stacking chiplets.”
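For scale, 0.5 billion transistors per square millimeter is 500 transistors per square micron, or an average footprint of roughly 2,000 nm² per transistor, about a 45nm x 45nm patch. The arithmetic is a simple unit conversion:

```python
# Unit check on the quoted ceiling: 0.5 billion transistors / mm^2.
density_per_mm2 = 0.5e9
density_per_um2 = density_per_mm2 / 1e6   # 1 mm^2 = 1e6 um^2
footprint_nm2 = 1e6 / density_per_um2     # 1 um^2 = 1e6 nm^2

print(f"{density_per_um2:.0f} transistors per um^2")                   # 500
print(f"~{footprint_nm2:.0f} nm^2 average footprint per transistor")   # ~2000
print(f"~{footprint_nm2 ** 0.5:.0f} nm x {footprint_nm2 ** 0.5:.0f} nm on a square grid")  # ~45 x 45
```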
Reference
1. Samsung announced it will introduce GAA FETs at 3nm. Intel and TSMC plan to introduce them at 2nm.