An Inside Look At Testing’s Leading Edge

FormFactor’s CEO peels back the covers on AI, 5G and HBM test issues.


Mike Slessor, president and CEO of FormFactor, sat down with Semiconductor Engineering to discuss testing of AI and 5G chips, and why getting power into a chip for testing is becoming more difficult at each new node.

SE: How does test change with AI chips, where you’ve got massive numbers of accelerators and processors developed at 7 and 5nm?

Slessor: A lot of the AI stuff that we’ve been involved with is definitely at the advanced nodes. The massive parallelism of these devices is driving people to use the most advanced silicon nodes, especially for the logic part of it. We’ve seen some projects that are really mind-boggling in terms of die size and transistor count. For a probe card manufacturer, that drives a very large number of probes. To get all of the power in, there are some thermal control issues associated with test. But that whole test interface for logic chips is now approaching the numbers and scale that we saw in memory only a few years ago. State-of-the-art memory probe cards a few years ago had maybe 50,000 or 60,000 probes per card. We’re actually shipping probe cards with that number of probes to test some of these AI engines, as well. The scale of these things is astounding.

SE: These are reticle size plus stitching to make them even larger, right?

Slessor: Yes, these are some very, very large die. When you think about the density and the sheer power you need to get into the chip, and then the number of interfaces it has to the outside world — these are very complicated interfaces, and therefore very complicated tests and test tooling. FormFactor made some really big investments in MEMS probe technology, which is really the only cost-effective way to get to the kind of densities and performance levels that people need to test these kinds of chips. There are technical challenges, but it’s a really interesting field to be in because you’re working with some of the leading customers in the world to push this forward.

SE: But now, instead of dealing with just the chipmakers, you’re dealing with the companies making the chips, the companies building the systems those chips go into, and the people who are designing everything and writing the algorithms, right?

Slessor: For the test part of the business, with any given engagement you’ve got a very interesting customer set. It’s usually multiple entities at different geographic locations that you are working with at the same time. There is the fabless design house, the foundry, and depending on where the test is being done, there can be a test house, too. And a lot of the test program development does spill over into algorithm development and how the end system is going to be deployed. There are a lot of engagements where you don’t have one customer. You have three customers for a single design. And everybody has to work together and communicate well for these things to work.

SE: Is it more iterative than it has been in the past, almost like what an IDM can do?

Slessor: Yes, and that’s true not just for AI. It’s happening across much of the high end of the industry. For most chips, we see multiple design spins early in their lifecycle to change some aspect of either the architecture or the silicon itself. That has a significant impact on the probe card and various test interfaces. AI is one example where we’re seeing a lot of iteration, and that’s true of any place where you’re on a rapid innovation curve. But for other applications, whether it be RF, apps processors or microprocessors, there’s still a tremendous amount of iteration early in design cycles.

SE: With RF, what’s happening with millimeter wave? What are you testing for, given that signals don’t carry very far and are easily interrupted by moving objects or even weather?

Slessor: Millimeter wave means different things to different people. For example, we’ve done auto radar projects up to 77GHz. No one is thinking of 5G carrier frequencies there, but when you get up into the high-20, low-30GHz carrier bands, some of the issues you mentioned do start to come to the forefront. A lot of what we’re testing involves these antenna-in-package modules that do a degree of beam steering. They’re essentially phased arrays with multiple inputs and multiple outputs that try to steer the beam around different obstacles that will absorb signals at those frequencies, whether it be your hand, your head, or something else in the signal chain. A lot of work has been done here, and we’re starting to test those kinds of chips in pilot production. Some of the new handsets apparently are capable of operating in some of these 28GHz bands. There is silicon being pushed out and being effectively tested at those frequencies. We’ll see how well it ends up actually working, because if you need a cell tower every 30 feet, and that distance drops to 15 feet when it rains, that’s probably not such a good implementation.
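
As a rough illustration of the beam steering Slessor describes, the sketch below computes the progressive phase shift across a uniform linear array at a 28GHz carrier. Every value here (element count, spacing, steering angle) is an assumption chosen for the example, not a FormFactor or customer figure.

```python
import math

# Illustrative beam-steering math for a 5G mmWave antenna-in-package module.
# All values are assumptions for the example, not interview specifics.

C = 3.0e8            # speed of light, m/s
FREQ = 28e9          # 28GHz carrier band mentioned above
WAVELENGTH = C / FREQ

def element_phases(n_elements: int, spacing_m: float, steer_deg: float):
    """Progressive phase shift (degrees) for each element of a uniform
    linear array so the main beam points steer_deg off boresight."""
    k = 2 * math.pi / WAVELENGTH                          # wavenumber
    delta = k * spacing_m * math.sin(math.radians(steer_deg))
    return [math.degrees(n * delta) % 360 for n in range(n_elements)]

# 8-element array at half-wavelength spacing, beam steered 30 degrees to
# route around an obstruction (a hand, a head) in the signal path.
phases = element_phases(8, WAVELENGTH / 2, 30.0)
print([f"{p:.1f}" for p in phases])   # 0, 90, 180, 270, 0, 90, ...
```

At half-wavelength spacing, a 30-degree steer works out to a 90-degree phase step per element, which is the kind of per-channel behavior the test tooling has to exercise at frequency.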

SE: What’s going on at the most advanced nodes? We’re hearing about 5 and 3nm development. What challenges does that raise from a test perspective?

Slessor: A lot of those challenges are the same themes we’ve seen as we’ve gone from 20nm down to the first finFET nodes, and then to 10 and 7nm. One of the biggest test challenges customers have is the combination of power density and thermal control — trying to make sure that as they test the chip, the transistors and the overall chip are operating at temperatures that are consistent with regular use cases. You can run a set of test vectors through a chip and produce an immense amount of heat, which then causes some weird physical interactions with the overall test. And so both thermal control, as it relates to throughput and test strategies, and then the amount of current you’re able to get in and out of the chip, are some of the biggest challenges we see at those advanced logic nodes. There is tremendous power density.
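
To make the power-density point concrete, here is a hedged back-of-envelope sketch. The die power, die area, and thermal resistance below are illustrative assumptions, not numbers from the interview.

```python
# Rough sketch of the thermal problem: run an aggressive vector set and the
# die heats up fast unless the test cell pulls the heat out.
# Every number here is an illustrative assumption.

die_power_w = 300.0        # assumed power during a hot test pattern
die_area_mm2 = 800.0       # assumed large AI die, approaching reticle size
theta_jc_c_per_w = 0.08    # assumed junction-to-chuck thermal resistance

power_density = die_power_w / die_area_mm2
temp_rise = die_power_w * theta_jc_c_per_w

print(f"Power density: {power_density:.2f} W/mm^2")        # ~0.38 W/mm^2
print(f"Junction rise above chuck: {temp_rise:.0f} C")     # ~24 C
# To keep transistors at use-case temperatures, the thermal control system
# has to hold the chuck well below the target junction temperature.
```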

SE: And that power density also leads to some incredibly complicated power management strategies involving different power rails, heat being pushed out in certain directions you can’t always predict, always-on circuitry, and load balancing across a chip. How does all of that impact test?

Slessor: That’s why there’s a lot of iteration. Multiple power rails, multiple power domains are the primary thing driving these large probe counts in the probe cards. You don’t just have a density problem where you have to get a tremendous amount of current in and out in local areas of the chip. You also have to do it at different voltages and at different frequencies. In some cases these chips have a dozen different power supply domains that you have to supply, all of them populated in different spatial areas of the chip. For example, they’ll have us build a probe card that maybe tests one or two die early in production. And then as they build some level of confidence on what they really have to test, and how much power they really need for the test programs they’re going to run — and as we start to increase parallelism and move from that 1-die card to an 8- or 16-die card — the probe count per die goes down quite a bit as they start to depopulate and these different power supplies get supported in different ways by the tester. So a lot of it becomes an adaptive process as customers learn how they’re going to test these things and begin to ramp up.
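
A minimal sketch of that probe-count arithmetic, assuming a per-probe current limit and a made-up set of power domains. All values are hypothetical.

```python
import math

# Sketch of how multiple power domains drive probe counts, and why per-die
# counts drop as parallelism goes up. Per-domain currents and the per-probe
# limit are assumptions for illustration.

AMPS_PER_PROBE = 0.5   # assumed safe current per MEMS probe

# Assumed per-domain current demand (amps) for one die under test.
domains = {"core0": 120, "core1": 120, "sram": 40, "io": 15, "pll": 2}

def power_probe_count(domain_amps, margin=2.0):
    """Probes for power plus return paths, with a derating margin."""
    return sum(2 * math.ceil(margin * amps / AMPS_PER_PROBE)
               for amps in domain_amps.values())

print("1-die card, full margin:", power_probe_count(domains))

# Once the test program is characterized, margins shrink and some domains
# are supported differently by the tester, so per-die counts come down
# before scaling out to an 8- or 16-die card.
print("16-die card, tighter margin:",
      16 * power_probe_count(domains, margin=1.2))
```

The margin parameter is the crux of the depopulation Slessor describes: as confidence in the real power needs grows, derating shrinks and per-die probe counts fall even as die-per-card counts rise.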

SE: Is there any indication at this point what happens when we swap from finFETs to gate-all-around FETs?

Slessor: Given the different physical nature of the structure, you’ll see different failure modes. But by and large, test is abstracted from the individual transistors, so I don’t know that you’re going to see a big difference. Obviously, anytime the industry goes through one of these major architectural changes you typically do see a bump up in test intensity. They’ve got to figure out what the failure modes are that come with this new transistor structure and the overall physical architecture. We saw some of that at the first finFET nodes, whether that was 22nm in the microprocessor space or 14/16nm in the foundry space.

SE: In addition, some of the circuitry is always on, there are potential interactions between different structures there that have never existed at this density, and there are much tighter tolerances. How does that affect test?

Slessor: A good example of that is power delivery and/or power impedance specs in the probe card. If you’re trying to deliver all this power into the chip from the test instrumentation, the path through which you do that becomes a lot more important and the specs need to be a lot tighter. Five years ago the power delivery performance of the probe card wasn’t that big a deal. Now it is a major design consideration, and it’s one of the reasons why we end up working so closely with the fabless customers. They understand the different tolerances and specifications that each of those power supplies will have. Some are a little bit looser, some need to be extremely tight. As you design the probe card specific to each individual design, that becomes one of the major performance concerns.
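
One standard way such a power-delivery spec is expressed is as a target impedance for the rail. A minimal sketch of that calculation, with an assumed rail voltage, ripple budget, and transient load step:

```python
# Target-impedance calculation for a power delivery path, of the kind that
# now constrains probe card design. The rail voltage, ripple budget, and
# transient current step are illustrative assumptions.

rail_v = 0.75            # assumed supply voltage
ripple_fraction = 0.03   # assumed allowed ripple: 3% of the rail
transient_step_a = 100.0 # assumed worst-case load current step

z_target_ohms = rail_v * ripple_fraction / transient_step_a
print(f"Target impedance: {z_target_ohms * 1000:.3f} mOhm")   # ~0.225 mOhm
# The probe card's contribution to path impedance has to be a small fraction
# of this budget, which is why it is now a first-order design consideration.
```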

SE: This also is a growing problem with packaging, right? What does that mean for test?

Slessor: Almost all of what we do is either bare die or full wafer, so the package ends up being a downstream piece of this. But you do see a lot of the same implications associated with the power delivery ending up being constraints in the package design. Historically, package substrates were pretty simple structures — a couple of layers of metal, not a lot of really stringent design requirements. That’s changed. Now we’re seeing much more complicated packaging substrates with more layers. Via densities are going way up. You need to manage and tailor the power supply performance. This is becoming a big deal — not just for testing at the wafer level with a probe card, but also in the final package.

SE: Are you involved in HBM testing?

Slessor: We are. Our DRAM probe card business has been positively impacted by that. It’s a good example of where advanced packaging strategies and heterogeneous integration drive a much higher level of test intensity. You can imagine that if you’re going to stack eight DRAM die on top of some sort of base controller die, by the time you get up to the top of that stack you want very high confidence that each component die you’re putting into the stack is pretty close to good, or can be made good through some sort of redundancy or repair. That’s an example where we’ve seen our DRAM probe card business go up rather significantly over the past couple of years.

SE: What are you testing for there?

Slessor: You’re testing to make sure that each of these component die is functionally good, or good enough to be repaired in the final package. And because they’re being fabricated on fairly advanced nodes (at least 1x or 1y nanometer DRAM nodes), the yields are not great. And so it’s a simple functional characterization of making sure that the die that go into the stack are as close to good as they can get. I’m reluctant to use the term ‘known good die’ because it conveys the notion of a perfect thing, and nothing in the semiconductor industry is perfect. There’s a balance of cost versus risk that people constantly play with, and for DRAM there is some level of repairability and redundancy. So you see all of those different knobs being exercised. But HBM for sure has impacted not just the volumes of our DRAM probe card business, but also the spec requirements as they continue to tighten them up.
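
The economics behind that test intensity are easy to sketch: stack yield is the product of the per-die yields, so small per-die differences compound across an 8-high stack. The yields below are illustrative assumptions only.

```python
# Why stacking drives up test intensity: stack yield compounds per-die
# yield. The yield figures and repair uplift are illustrative assumptions.

def stack_yield(per_die_yield: float, dies_in_stack: int) -> float:
    """Probability every die in the stack is good (no stack-level repair)."""
    return per_die_yield ** dies_in_stack

raw = 0.95      # assumed per-die yield straight off the wafer
tested = 0.995  # assumed effective yield after wafer test plus repair

print(f"8-high stack, untested die:     {stack_yield(raw, 8):.1%}")   # ~66%
print(f"8-high stack, tested/repaired:  {stack_yield(tested, 8):.1%}") # ~96%
# The gap between those two numbers is the economic case for testing each
# DRAM die hard before it goes into the stack.
```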


Fig. 1: DRAM probe card. Source: FormFactor

SE: Where does FormFactor fit into the packaging flow?

Slessor: Much of what we do is either early characterization through our engineering systems business, or die disposition in the fab or just coming out of the fab. That disposition determines whether a die should move on and become part of the HBM stack, for example, or be packaged onto a standard logic substrate.

SE: How about new materials? Do they make any difference here? We’ve got things like cobalt and ruthenium and new liners and new films coming in. What sort of impact do they have?

Slessor: That’s probably analogous to the change from finFETs to gate-all-around. A lot of those kinds of changes, whether they be material changes or structural/architectural/geometry changes, tend to drive a step up in test requirements and overall test intensity and volume. Not a lot of specific new requirements come from it, but they do drive up the need to test these devices and characterize them so that people have a library of failure modes and defect modes. That allows them to figure out how they’re going to test them more effectively.

SE: In markets like automotive and industrial, there is a push to test over time rather than just in the fab. How is this filtering down to other markets?

Slessor: For the reliability aspect of some of these newer devices, people are still figuring this out. But there are things analogous to burn-in and accelerated testing, either through temperature or vibration. They are figuring out different ways to accelerate the most likely failure modes in the field, or things that people are concerned about. I’ve had conversations with several customers, but backing up the full functional tests for these devices, both electrical and optical, is where we are focused at this point.


