The History Of CMOS

A look at technology transitions from NMOS to gate-all-around.

Since CMOS has been around for about 50 years, a comprehensive history would be a book. This blog focuses on what I consider the major transitions.

NMOS

Before CMOS, there was NMOS (also PMOS, but I have no direct experience with that). An NMOS gate consisted of a network of N-transistors between the output and Vss, and a resistor (actually a transistor with an implant) between the output and Vdd. If you are used to CMOS, that might seem like a weird statement since all transistors implement the logic in CMOS, but in NMOS the pull-up transistor was simply used as a resistor. If the logic enabled current to flow to ground, that would pull the output level down against the resistor. If the logic blocked the current, the resistor/transistor would pull the output up towards Vdd.

There were two big problems with this approach. First, there is a direct path from Vdd to Vss whenever the network of NMOS transistors allows current to pass, so the gate draws a lot of static current. This was not fatal when only a few transistors could fit on a chip. The other issue was that the switching speed was limited by the resistor. When the network of NMOS transistors blocked current, the output was pulled up only slowly through the resistor.

The solution: CMOS. Replace the pull-up resistor/transistor with a complementary network of P-transistors. When the network of N-transistors allowed the current to pass, the network of P-transistors did not, so the output would be quickly pulled down to Vss. When the network of N-transistors blocked the current, the network of P-transistors would let it pass and the output would quickly be pulled up to Vdd. The C in CMOS stands for “complementary” since the networks of P- and N-transistors were complementary (strictly speaking, duals of each other in the graph-theory sense). When one network had transistors in parallel, the other would have them in series (and vice-versa).
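To make the series/parallel relationship concrete, here is a minimal sketch in Python. It is a toy switch-level model, not how any real tool represents a gate: the tuple encoding and the function names (`dual`, `conducts`, `cmos_output`) are my own assumptions for illustration. It builds the pull-up network by swapping series and parallel in the pull-down network, then checks that exactly one of the two conducts for every input combination.

```python
from itertools import product

# A network is a nested tuple (illustrative encoding, not from any EDA tool):
#   ("in", name)                 a single transistor gated by input `name`
#   ("series", a, b, ...)        sub-networks in series
#   ("parallel", a, b, ...)      sub-networks in parallel

def dual(net):
    """Swap series and parallel to get the complementary network."""
    kind = net[0]
    if kind == "in":
        return net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped,) + tuple(dual(child) for child in net[1:])

def conducts(net, inputs, on_when_high):
    """True if the network passes current for the given input values.

    N-transistors conduct when their gate is high (on_when_high=True),
    P-transistors when it is low (on_when_high=False).
    """
    kind = net[0]
    if kind == "in":
        return inputs[net[1]] == on_when_high
    results = [conducts(child, inputs, on_when_high) for child in net[1:]]
    return all(results) if kind == "series" else any(results)

def cmos_output(pull_down, inputs):
    """Logic level at the output of a gate with the given N pull-down network."""
    pull_up = dual(pull_down)
    down = conducts(pull_down, inputs, on_when_high=True)
    up = conducts(pull_up, inputs, on_when_high=False)
    # Exactly one network conducts, so there is never a Vdd-to-Vss path.
    assert down != up
    return 0 if down else 1

# Two N-transistors in series between the output and Vss: a 2-input NAND.
nand_pull_down = ("series", ("in", "a"), ("in", "b"))

for a, b in product([False, True], repeat=2):
    print(int(a), int(b), cmos_output(nand_pull_down, {"a": a, "b": b}))
```

The loop prints the NAND truth table: the output is 0 only when both inputs are 1. The internal assertion never fires, which is the point of the complementary arrangement: there is no input combination that leaves both networks conducting at once.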

Early days of CMOS

I don’t want to underestimate the difficulties of moving from one process node to the next, but this was the period of what imec calls “happy scaling.” There were actually two scalings going on. One was simply scaling the dimensions of the structures on the chip: the transistors, the vias, the interconnect, and so on. The other was Dennard scaling, formulated by Robert Dennard (trivia fact: he also invented the DRAM). Dennard scaling allowed the power density to remain constant even as the performance of the circuit increased. This was done by lowering the power supply voltage. It depended on most of the capacitance on any output coming from the transistors being driven (as opposed to capacitance from the interconnect). So at each process generation, the linear dimensions would decrease by 30%, meaning the area of the design decreased by about 50% (because 0.7×0.7 is basically 0.5), the voltage was reduced by 30%, and the switching time decreased by 30%. So each node was 50% smaller, and 30% faster at constant power density. Happy scaling, indeed.
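The arithmetic is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes dynamic power per transistor goes as C × V² × f (the standard switching-power relation), that capacitance scales with the linear dimension, and that clock frequency is the inverse of switching time. The function name and the `voltage_scale` parameter are mine, purely for illustration.

```python
def dennard_step(scale=0.7, voltage_scale=None):
    """Relative changes for one process generation.

    scale          factor applied to linear dimensions (0.7 = 30% smaller)
    voltage_scale  factor applied to the supply voltage; defaults to `scale`,
                   which is the ideal Dennard case
    """
    if voltage_scale is None:
        voltage_scale = scale
    area = scale * scale              # shrinks in both x and y
    capacitance = scale               # assumed to track the linear dimension
    frequency = 1 / scale             # switching time shrinks by `scale`
    power = capacitance * voltage_scale ** 2 * frequency  # C * V^2 * f
    return {
        "area": area,
        "frequency": frequency,
        "power_per_transistor": power,
        "power_density": power / area,
    }

ideal = dennard_step()
print(ideal)

# Same shrink, but the supply voltage cannot be lowered at all.
stuck = dennard_step(voltage_scale=1.0)
print(stuck)
```

In the ideal case, power per transistor and area both fall by about half, so the power density ratio comes out at 1. If the voltage does not scale at all, power per transistor stays flat while the area halves, so the power density roughly doubles with every generation.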

This went on from the mid-1980s until the early 2000s when we reached “the end of Dennard scaling.” The notion that scaling the transistors would make everything work and we could basically ignore interconnect came to an end. More and more of the capacitance was in the interconnect, and interconnect resistance became significant. For various technology-related reasons, it became impossible to scale the power supply voltage as much as Dennard scaling required, meaning that the power density did not remain constant. It exploded. Pat Gelsinger, later Intel’s CEO but back then its CTO, was famous for pointing out that the power density would soon be the equivalent of a rocket nozzle.

Since we could no longer keep the power density under control, we could no longer increase the clock frequencies as we had been able to do for the previous couple of decades. So microprocessor clock frequencies topped out at about 3GHz. Microprocessor vendors delivered increased compute power by delivering multi-core processors. The semiconductor companies assumed that software people would find a way to use these cores, but in fact, making a big fast processor out of lots of slower, smaller, and cheaper processors had been a failed research exercise for forty years. Outside of a few “embarrassingly parallel” problems, it is hard to decompose a single-thread program into multiple parallel threads that can each run on its own core.

Hi-k metal gate

At this point, it is about 2007, and the current node is 65nm. For various crystallographic reasons, polysilicon gates were no longer effective. Although the M in MOS, and so in CMOS, stands for metal, we had not used metal gates for over thirty years. But we transitioned to what was known as “Hi-k metal gate.” The gate went back to being metal, and the gate dielectric was based on an element that I don’t think many of us had heard of: hafnium. The “k” is the dielectric constant of the gate dielectric material, and with Hi-k, this could be made thicker without slowing the performance. To keep the self-aligned aspect of the gate that everyone was used to from polysilicon gates, the fabrication actually started with a sacrificial gate that was eventually removed and replaced with the metal gate (this is gate-last, and there was a gate-first version of the process too). This approach was used for many process generations from 45nm onward.

FinFET

The planar transistor approach started to have problems with excessive leakage, even with Hi-k metal gate. I’ve read one description that said the transistors were bright and dim rather than on and off. The solution was to change the transistor completely. Intel led the way at 22nm (under the name Tri-Gate), and the foundries followed at 14/16nm. This was the finFET, so called because the transistor source-drain structure stuck up from the wafer like a shark’s fin. The gate was then laid over the top of this so that it wrapped the channel on three sides. This meant that there were no sneak channels that were far from the gate and so poorly controlled by it. Another approach, pioneered by STMicroelectronics and licensed by GlobalFoundries, was known as FD-SOI. That achieves control by creating the channel on top of a thin insulator (buried oxide or BOX), thus cutting off any sneak channels far from the gate.

Another major challenge loomed, though. Lithography was running out of steam. The industry had gone through two major lithography transitions: reducing the wavelength of the lasers used in the steppers, and replacing the air between the lens of the stepper and the wafer with water, known as immersion lithography. But both approaches reached a limit at the final stage of 193i, namely light with a wavelength of 193nm using immersion. The attempts to go to a shorter wavelength were not successful, so the industry was (and for many steps in manufacturing still is) stuck at 193i.

Multiple patterning

At 20nm the minimum pitch was 80nm, and this was the absolute limit on what we could create with a single exposure of 193i light. To go further, double patterning was required. Half the elements in the design were put on one reticle, and half on a second reticle. Both were exposed onto the same wafer, allowing a pitch of less than 80nm to be manufactured. This was a dramatic change for EDA tools since they had to do the partitioning of the design into the two masks, known as coloring, a term borrowed from graph theory. The simplest approach was known as LELE (litho-etch-litho-etch), but as new process nodes came along, more accurate (but more expensive) approaches were required, known as SADP (self-aligned double patterning) and, later still, SAQP (Q for quadruple).
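To give a feel for what coloring means here, the sketch below treats each shape on a layer as a node and draws an edge between any two shapes that are too close to be printed in the same exposure. Splitting the layer into two masks is then graph two-coloring, which a breadth-first search can do. This is only a toy illustration under my own assumptions: the function name `two_color` and the idea of handing it a ready-made list of conflicting pairs are hypothetical, and production decomposition tools deal with far more (stitching, density balancing, and so on).

```python
from collections import deque

def two_color(conflicts, num_shapes):
    """Assign each shape to mask 0 or mask 1.

    conflicts   list of (a, b) pairs of shape indices that are too close
                to share a mask
    num_shapes  total number of shapes, indexed 0..num_shapes-1

    Returns a list of mask indices, or None if no two-mask split exists
    (the conflict graph contains an odd cycle).
    """
    neighbors = [[] for _ in range(num_shapes)]
    for a, b in conflicts:
        neighbors[a].append(b)
        neighbors[b].append(a)

    mask = [None] * num_shapes
    for start in range(num_shapes):
        if mask[start] is not None:
            continue
        mask[start] = 0
        queue = deque([start])
        while queue:
            shape = queue.popleft()
            for other in neighbors[shape]:
                if mask[other] is None:
                    mask[other] = 1 - mask[shape]   # opposite mask
                    queue.append(other)
                elif mask[other] == mask[shape]:
                    return None                     # coloring conflict
    return mask

# Four parallel lines, each too close to its immediate neighbor.
print(two_color([(0, 1), (1, 2), (2, 3)], 4))   # [0, 1, 0, 1]

# Three shapes that are all mutually too close: two masks are not enough.
print(two_color([(0, 1), (1, 2), (0, 2)], 3))   # None
```

The second case is the kind of layout that either has to be forbidden by the design rules or needs a third mask.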

EUV

The great hope to save us from more and more masks, more and more process steps, and more and more cost was a technology being developed in the Netherlands by ASML, known as EUV. But the development was going very slowly, and there was even doubt as to whether it would ever work. EUV stands for “extreme ultraviolet” and uses a wavelength of 13.5nm (so not just less than 193nm but a lot less). There were several challenges to be overcome. First, the light source needed to have enough power, or it would not be able to expose enough wafers to work for volume manufacturing. Second, everything absorbs EUV, so the light path had to be completely in a vacuum. And when I said everything absorbs EUV, that means lenses, too, so the scanners needed to use reflective optics. In fact, they needed to use Bragg mirrors, nothing like the mirror in your bathroom or in a telescope. Each of these mirrors reflected only about 70% of the light, and since the light bounces off a whole series of them, the losses multiply. As a result, very little of the light generated made it to the photoresist on the wafer.
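The compounding is worth seeing in numbers. The snippet below just raises the 70% reflectivity to the power of the number of mirrors in the light path. I have not given a mirror count above, so the counts in the loop are purely illustrative assumptions and not the specification of any real scanner.

```python
def fraction_reaching_wafer(reflectivity, num_mirrors):
    """Fraction of light left after bouncing off `num_mirrors` mirrors in a row."""
    return reflectivity ** num_mirrors

# Mirror counts here are assumptions chosen only to show the trend.
for num_mirrors in (1, 4, 8, 10, 12):
    fraction = fraction_reaching_wafer(0.70, num_mirrors)
    print(f"{num_mirrors:2d} mirrors: {fraction:.1%} of the light remains")
```

With ten mirrors at 70% each, less than 3% of the light survives, which is why source power was such a critical problem.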

EUV was finally introduced in volume manufacturing at the second generation of 7nm (the first generation having used multi-patterning so it was not completely dependent on EUV working), and then 5nm and all subsequent process generations.

GAA

However, finFETs were running out of steam, too. Surrounding the channel on only three sides was not enough. To surround it on all four sides, known generically as gate-all-around (GAA) and by various proprietary names from each manufacturer, meant dividing the channel up into a number of small channels (usually three) and running them as wires through the middle of the gate. It turned out that an elliptical shape was better than a circular one, and that ended up being what everyone uses today. This type of transistor was introduced at either 3nm or 2nm, depending on the manufacturer.

This is where we are today (at the leading edge).

The future

More and more of the interconnect is taken up with the power delivery network (PDN), and more and more of the resources required to connect to standard cells are blocked by the PDN. One solution is a backside power delivery network (or BPDN). Instead of using the interconnect stack to deliver power, ground, and perhaps clock, the PDN is built on the backside of the wafer and connected to the frontside with through-silicon vias (TSVs). This is optional, but seems to be getting introduced at 2nm or 3nm.

One big opportunity for a one-time big increase in scaling is the CFET, or complementary FET (nothing to do with the C in CMOS, although both stand for complementary). Instead of manufacturing the P-transistors and N-transistors side by side on the wafer, they are stacked, with the N-transistors on top of the P-transistors, so a pair takes up about the same space as a single transistor. This gives a gain of 1.5X to 2X in density.

So, it looks to me like the end of the silicon roadmap will be GAA + CFET + backside PDN.


