Applied Materials’ VP looks at what’s next for semiconductor manufacturing and the impact of variation, new materials and different architectures.
Sanjay Natarajan, corporate vice president at Applied Materials with responsibility for transistor, interconnect and memory solutions, sat down with Semiconductor Engineering to talk about variation, Moore’s Law, the impact of new materials such as cobalt, and different memory architectures and approaches. What follows are excerpts of that conversation.
SE: Reliability is becoming more of an issue with safety-critical AI technology. Is the goal to improve the reliability of individual components, which is the classic divide-and-conquer strategy, or is it to understand what goes wrong in a system?
Natarajan: It’s both of those. Today, traditional von Neumann-based AI is the common approach to improving reliability. The goal is good reliability with a digital device and digital chip architecture while still scaling. But this approach breaks down when gaining more transistors or power efficiency every couple of years is no longer achievable. The industry needs to maintain good reliability while making things smaller and cooler, and this presents its own set of challenges. Pursuing a brain-inspired, neuromorphic type of approach is even more challenging because it’s no longer possible to hide variations behind the digital framework. For example, no one will notice a variation shorter than the clock period, where transistor ‘A’ might switch faster than transistor ‘B’, as long as both finish switching within one clock period. The digital world basically conceals that variation. Analog is more energy efficient, but since the variation can’t be hidden, it must be eliminated or minimized. This is where Integrated Materials Solutions come into play. In the traditional model for fabricating a chip, each step is a different station. There are separate tools to do lithography, etch, deposition, and measurements. A lot of co-optimization and integration capability is lost when manufacturing is done that way.
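The point about the clock concealing variation can be illustrated with a minimal sketch. It is a toy model, not a circuit simulation: the function names, the delay values in picoseconds, and the per-device gain factors are all assumptions made up for illustration.

```python
def latched_values(switch_delays_ps, clock_period_ps, logic_value=1):
    # Digital (toy model): a register samples at the clock edge, so every
    # device that finishes within the period yields the same bit.
    # None marks a timing violation (the device finished too late).
    return [logic_value if d <= clock_period_ps else None
            for d in switch_delays_ps]


def analog_values(ideal_output, device_gains):
    # Analog (toy model): the output is used directly, so each device's
    # deviation from nominal shows up in the result.
    return [ideal_output * g for g in device_gains]


if __name__ == "__main__":
    # Three devices with different switching speeds, all within a 100 ps clock.
    print(latched_values([80, 95, 100], clock_period_ps=100))
    # Three devices whose gain varies by +/-5 percent around nominal.
    print(analog_values(1.0, [0.95, 1.0, 1.05]))
```

In the digital case the three devices are indistinguishable after the clock edge, while in the analog case the spread between devices carries straight through to the output.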
SE: What happens when you take a more concurrent approach?
Natarajan: Simultaneous measurement and deposition can reduce variation because deposition stops when the target point is reached. If a chamber is running slowly today and tomorrow it’s running faster, it doesn’t matter. The approach that has the most traction is putting an endpoint system into a process tool or process chamber. Going a step further can involve depositing a material and then treating it. The problem is that the moment a wafer is taken out of a chamber or system, the properties of the materials change. Manufacturers are confronting this with cobalt, which is extremely reactive. From a materials scientist’s standpoint, this is a headache.
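The difference between a fixed-time recipe and endpoint control can be sketched as follows. This is a simplified illustration under stated assumptions: the deposition rates, the 10 nm target, the sampling interval, and the idea of a perfect in-situ thickness reading are all hypothetical and do not describe any particular tool.

```python
def timed_deposition(rate_nm_per_s, recipe_time_s):
    # Open loop: fixed time, so the final thickness follows the chamber rate.
    return rate_nm_per_s * recipe_time_s


def endpoint_deposition(rate_nm_per_s, target_nm, sample_s=0.01):
    # Closed loop: read the (hypothetical) in-situ thickness sensor every
    # sample_s seconds and stop as soon as the target is reached.
    thickness_nm = 0.0
    while thickness_nm < target_nm:
        thickness_nm += rate_nm_per_s * sample_s
    return thickness_nm


if __name__ == "__main__":
    target_nm = 10.0
    for rate in (0.8, 1.0, 1.2):  # slow, nominal, and fast chamber (nm/s)
        timed = timed_deposition(rate, recipe_time_s=10.0)
        endpoint = endpoint_deposition(rate, target_nm)
        print(f"rate={rate}: timed={timed:.2f} nm, endpoint={endpoint:.2f} nm")
```

With the fixed-time recipe, the film thickness drifts with the chamber rate. With the endpoint loop, the result stays within one sampling step of the target whether the chamber runs slow or fast.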
SE: This is environmental variation, right?
Natarajan: Yes. Today, a common goal for chipmakers is to deposit and polish cobalt. We make the deposition tool and the polisher, but we don’t control what happens between the tools. That’s our customer’s space. But we know cobalt is extremely sensitive to what happens between those two tools. Different FOUP (front-opening unified pod) environments, and even the transfer time of the wafer from the deposition tool to the polisher, can produce a changed state. This affects how to do the polishing, which in turn translates into variation. Deposition plus polish has a variation very much dictated by the whitespace in the middle.
SE: Scaling can be accomplished by approaches other than just shrinking features. Is there enough benefit from packaging and new architectural approaches to replace traditional scaling?
Natarajan: The answer is that it’s not an either/or. It’s always been Moore’s Law plus packaging plus heterogeneous integration. Moore’s Law is such a powerful lever that we haven’t had to exercise other levers as much. Anything that delivered that much improvement in performance per watt and transistors per chip was sufficient. For 40 years we have done Moore’s Law and dabbled around the edges. There are lots of packaging options and some of them are very innovative. We went from land grid arrays to pin grid arrays, and packaging advances in the smartphone space have been going on for a while. For the smartphone, it’s not just about the apps processor or Moore’s Law, because it’s not only about performance. It is also about taking DRAM and stacking it on an apps processor so it fits into the product’s form factor. Packaging technology has been marching side-by-side with the horsepower gains of Moore’s Law. Now, classic Moore’s Law scaling is slowing down.
SE: We’ve heard about expectations of a 20% improvement in power/performance over the next couple of nodes. Is that possible?
Natarajan: It’s possible, but we’ve gone from a two-year cadence to a five-year cadence. When there aren’t fast enough improvements in the technology, there’s little reason to upgrade. For example, VR is great technology but it’s in a holding pattern, waiting for Moore’s Law to deliver a compelling enough experience to drive the market. With a real VR system, I can visit family halfway around the world and that’s an experience worth having and upgrading for.
SE: Are there alternatives to getting that performance? Can you re-architect a chip with more customization to achieve the same thing?
Natarajan: Yes, but it’s a one-time trick. CPUs have long provided a rate of improvement good enough that it wasn’t worth going in another direction. When Moore’s Law started to slow down, the change was made to GPUs. That was a first step in the direction of AI. Now the industry is saying GPUs are not fast enough, so many companies are designing their own chips. The execution engine is no longer CPU-generic or sort of GPU-generic. It’s now about multiplying ‘this size matrix by this size vector,’ or inverting this matrix, to target AI workloads, so there are transistor designs that do only specific, one-function jobs.
SE: But the starting point has changed, as well, because now the starting point isn’t the hardware or the software. It’s the data.
Natarajan: Yes, but for any one application there is one ideal architecture. That’s not going to evolve going forward. As an example, there is AI technology that will recognize faces in the photos on your smartphone and find all sorts of related photos. And this application is running on proprietary hardware purpose-built for that function. That hardware will never drive a car, but a different AI purpose-built chip will drive a car. So going from CPU to GPU, to purpose-built for a given application, is going to be a one-time improvement. There is no version of Moore’s Law to improve that.
SE: How many more revs do we have? Do we now have to go back and think beyond the chip?
Natarajan: The physics of Moore’s Law is so fundamental that these one-off ideas don’t add up to that. We have to do this out of necessity. One of the areas where Applied is focused is packaging technology, because it offers a reasonable extension for the fundamentals of Moore’s Law. It’s one method to keep things evolving.
SE: So is this still about moving electrons, or is it something else?
Natarajan: It’s a little of everything. There’s the traditional Moore’s Law of scaling CMOS. Integrated Materials Solutions can enable traditional scaling—wires that carry electrons, holes and electrons being the unit of information and transistors being the switch. That mainstream approach will continue and must improve. There’s a body of work with Integrated Materials Solutions that gets scaling back on track in some form. There’s also a body of work that falls into the category of, ‘What else can we do?’ An answer is packaging and materials innovation from Applied, and chip architecture from others. Packaging can pick up some of the scaling load. This can be done by heterogeneous integration—using advanced technology for the most critical parts of a chip, together with high-yielding, relatively low-performing technology for other components. You integrate all of the parts together with high-speed packaging.
SE: How about different memory architectures?
Natarajan: There is also a desire to bring memory closer to the computation, using techniques like analog switches or in-memory compute. We have content-addressable memory in our brains: when you think about a memory, the entire memory comes up. With the von Neumann architecture there is addressable memory, which only knows where to go to find eight bits and retrieve them. Rather than retrieving bits or simulating something, the brain will build a memory.
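The contrast between address-based and content-addressable lookup can be sketched in a few lines. This is only a software analogy under stated assumptions: the function names and the stored strings are invented for illustration, and real content-addressable or in-memory-compute hardware works very differently from a Python list search.

```python
def read_by_address(ram, address):
    # Address-based: the caller must already know where the data lives,
    # and gets back one byte (eight bits) from that location.
    return ram[address]


def recall_by_content(memories, cue):
    # Content-addressable (analogy): a partial cue retrieves every whole
    # stored item that contains it, with no address involved.
    return [m for m in memories if cue in m]


if __name__ == "__main__":
    ram = bytearray(b"semiconductor")
    print(read_by_address(ram, 0))  # integer code of the byte at address 0

    memories = [
        "first day at the fab",
        "cobalt polish experiment",
        "family visit in VR",
    ]
    print(recall_by_content(memories, "cobalt"))
```

The first lookup returns a single byte and is meaningless without the address. The second returns the whole stored item from a fragment of its content, which is the behavior the brain analogy describes.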
SE: If you’re doing memory and computing in context, what happens to some of the classic speed bumps we’ve discussed and encountered, such as variation? Variation may be a problem in some cases but not in others, but you don’t necessarily know unless you have all of the data at your fingertips.
Natarajan: Our understanding of how variation fits into this paradigm is very limited today. That was the beauty of the von Neumann system and why it was so successful. Variation is understandable and testable. With EDA and test tools, we can guarantee correctness even if there are 3 billion switches and hundreds of megabytes of memory. But in the analog space, we don’t yet understand how to guarantee correctness, let alone map variation onto guarantees of correctness. While there are certain tasks where mistakes are tolerable, such as a smartphone identifying someone incorrectly in a photo, it’s not acceptable for an autonomous car to drive erratically because of a variation problem.
SE: How much of a focus is there on engineering materials to work together as a system?
Natarajan: As a materials engineering company, this is very exciting. What if a whole switch can be embedded inside of the material? If the material itself has a transfer function, there’s no need to build a transistor out of 50 different things that collectively resemble a switch. The material itself behaves as a switch, and manipulating its properties can be done based on material properties we know how to control.
SE: That completely changes the perception of reliability and performance because now you’re dealing with a system.
Natarajan: And it complicates things. MRAM is a good example of this. At its simplest, MRAM is a stack of materials that in certain conditions act as an embedded non-volatile memory. If a core magnesium oxide in the MRAM is two atoms thicker or thinner, it fails and doesn’t work as a memory. Made just right all the time, it works quite well.
SE: Phase-change memory is even more difficult, right, because now you’re dealing with chemical reactions?
Natarajan: Yes, and this is in our wheelhouse as a process equipment company. If there is a need for a layer to be six atoms thick all the time, and we can achieve it, then you can create a class of functionality that is consistent.
SE: At that point you’re either depositing or growing, as opposed to manufacturing this, right?
Natarajan: Yes, but this is where the integration of depositing and measuring simultaneously comes in. For example, if a tool is early in its life after a maintenance activity, it might be hot and depositing fast. If it’s approaching the end of its chamber life, it might be depositing slowly. Either way we can watch every atom go down to ensure consistency.
SE: Still, you also get some insights here because you’re looking at this holistically. What have you found that you didn’t expect?
Natarajan: The potential is even more than I expected because we can fix other things along the way. A natural byproduct of all of this integration is that some problems just go away. Some of this we’re figuring out as we get data. More often than not, the unexpected stuff is on the good side at the moment.
SE: There are two sides to AI. One is how we use it in the real world. The other is how you use it internally to create materials or products. How are you using it?
Natarajan: We’re building AI into our equipment to better handle the billions of bits of data. Tools have sensors and actuators collecting data and reacting to data, and in most systems that happens algorithmically. Emerging AI capabilities will eventually enable an inferential approach where that data can be used to train and infer, so the tool will adjust accordingly based on the field of data detected.
SE: That adds a level of cost effectiveness into the tool, as well, because you understand the failure timetable and can predict it based on previous data, right?
Natarajan: Yes, exactly. You can guard-band it, but you also can see things coming better than with an algorithmic approach. The sheer volume of training data we can get is incredible. The tools generate this data day in and day out, collecting inputs and outputs that matter, and there’s really no need to understand the dynamics of going from input to output because the data is added into a training engine. Then, this can potentially lead to that information being used in the feed-forward mode to control thickness or the phase of a material, or when to stop the tool. This allows controlling the tool as well as the quality and variation on the wafer.
SE: How much of that data can be carried over from 10nm to 5nm or 3nm?
Natarajan: My experience with this is that it essentially has to be re-learned. But it’s not a big issue because so much data is generated so fast that starting over won’t take more than a few days. What’s important is to re-learn what’s unique to each process.
20% per couple of nodes? Five years per node? That’s a return of 2% a year on investments of tens of billions, when even older semiconductor technologies can now deliver factor-level improvements by changing semiconductor architecture and replacing software. Life support is expensive, and in the case of Moore’s Law it is not economical. And it cannot be sustained, not with China’s massive effort, which is beyond the control of U.S. industry.
Reliability improvement depends on reducing variation from all sources. With scaling at advanced nodes having reached the atomic level, variability as a percentage of CDs has increased and can’t be hidden in digital systems, and even less so in analog systems. Endpoint control mitigates some sources of variability (e.g., run-to-run or chamber-to-chamber) but is less effective with others (site-to-site). The contribution from site-to-site variation has increased for 300mm wafers and has been a growing pain over the last several nodes, especially for 3D architectures like finFETs. ALD and ALE are great innovations that allow dialing in the number of added or removed atomic layers, but they aren’t completely variation-free. Will AI become available soon enough to improve our understanding of variability’s impact on reliability and contribute to AI system fabrication?