New approaches for cutting costs and improving reliability for increasingly complex chips.
Experts at the Table: Semiconductor Engineering sat down to discuss the increasing importance of functional test, especially in high-performance computing, with Klaus-Dieter Hilliges, V93000 platform extension manager at Advantest Europe; Robert Cavagnaro, fellow in the Design Engineering Group at Intel (responsible for manufacturing and test strategy of data center products); Nitza Basoco, technology and market strategist in Teradyne’s Semiconductor Test Group; and Robert Ruiz, senior director of product management at Synopsys. What follows are excerpts of that conversation.
L-R: Hilliges, Cavagnaro, Basoco, Ruiz
SE: Why is applying functional test content so challenging today?
Ruiz: There are a couple of factors that make successfully applying functional patterns on the tester a challenge. In fact, I ran into them many years ago when I was an ASIC designer. A lot of it has to do with how functional patterns are created by design engineers. They’re developed on simulators, which don’t impose many constraints. So one challenge is that the designer who creates these functional patterns isn’t aware of the physical limitations of ATE. There are limits on the amount of power drawn, on how many pins can transition simultaneously, and on how many drivers a tester has. There’s a disconnect between the constraints of simulation and the constraints of ATE.
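To make that disconnect concrete, consider the kind of pre-screen a test engineer might run on simulator-generated vectors before committing them to a tester. This is a minimal sketch under assumed numbers; the pin count and simultaneous-switching limit are illustrative, not the spec of any particular ATE.

```c
/* Pre-screen simulator-generated vectors against assumed ATE limits.
 * Input: one '0'/'1' string per line, one character per pin.
 * MAX_PINS and MAX_SIMUL_SSO are illustrative assumptions. */
#include <stdio.h>
#include <string.h>

#define MAX_PINS      64  /* pins driven by the tester (assumed) */
#define MAX_SIMUL_SSO 24  /* assumed limit on simultaneously switching pins */

int main(void) {
    char prev[MAX_PINS + 2], curr[MAX_PINS + 2];
    int cycle = 0;

    if (scanf("%65s", prev) != 1)
        return 0;
    while (scanf("%65s", curr) == 1) {
        cycle++;
        size_t n = strlen(prev) < strlen(curr) ? strlen(prev) : strlen(curr);
        int toggles = 0;
        for (size_t i = 0; i < n; i++)   /* count pins that change state */
            if (prev[i] != curr[i])
                toggles++;
        if (toggles > MAX_SIMUL_SSO)     /* flag vectors the ATE may reject */
            printf("cycle %d: %d pins switch, exceeds assumed limit of %d\n",
                   cycle, toggles, MAX_SIMUL_SSO);
        strcpy(prev, curr);
    }
    return 0;
}
```

A simulator happily produces a vector where every pin toggles at once; a check like this catches it before it reaches the test floor.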
Hilliges: Thirty years ago, when we used functional test, the tester was able to synchronously apply complex waveforms. That type of functional test is long gone, of course, because now it’s about a lot of asynchronous blocks. Then came the era where we essentially just downloaded, through the digital pins, an image of software that could be executed, let it run, and waited until it was done. And then we happily saw a signature passing or not. That was about all the insight we gained during production test. This becomes unrealistic going forward, specifically as the context in which the chip runs becomes increasingly important. If you have an accelerator, it runs the workload. Let’s say a PCIe host has to drive the workload and run it. So we need a new model of test that involves interacting with the chip through its functional interfaces.
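A minimal sketch of the "download an image, let it run, check a signature" model Hilliges describes, seen from the device side. The register addresses, result-buffer layout, CRC-style fold, and golden value are all assumptions for illustration, not any real device map; the point is that the tester only ever observes done plus pass/fail.

```c
/* Device-side sketch of signature-based functional test.
 * Addresses and the golden signature are hypothetical. */
#include <stdint.h>

#define RESULT_BUF   ((volatile uint32_t *)0x20000000u) /* assumed result area */
#define RESULT_WORDS 1024u
#define STATUS_REG   (*(volatile uint32_t *)0x40000000u) /* assumed done/pass reg */
#define GOLDEN_SIG   0xDEADBEEFu  /* expected value, taken from simulation */

/* Fold all test results into one signature word (CRC-32-like mix). */
static uint32_t compute_signature(void) {
    uint32_t sig = 0xFFFFFFFFu;
    for (uint32_t i = 0; i < RESULT_WORDS; i++) {
        sig ^= RESULT_BUF[i];
        for (int b = 0; b < 32; b++)
            sig = (sig >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(sig & 1u));
    }
    return sig;
}

void test_main(void) {
    /* ...downloaded functional workload runs here, filling RESULT_BUF... */
    uint32_t sig = compute_signature();
    /* The tester sees only bit 0 (done) and bit 1 (pass): the
     * "signature passing or not" level of insight described above. */
    STATUS_REG = 0x1u | ((sig == GOLDEN_SIG) ? 0x2u : 0x0u);
}
```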
Cavagnaro: We moved away from classic functional test a long time ago. We’ve used structural-based functional test, which is image-based testing with predefined stored responses on the chip, for almost everything. It’s too difficult to manage thousands of pins. It’s too expensive. It’s not manufacturable in high volume. So everything has moved to structural-based functional test. That presents its own challenges, because now you have reach problems. You can’t reach all the way to the endpoint and back unless you do loopback-based functional tests, which creates a whole different set of problems, because that’s not how the part wants to work on its own. It’s a real challenge, but it’s a required piece of the puzzle, and it’s just going to continue to be a challenge, especially for reach. For digital logic, when you start to cross systems, that’s when you get back into those old conversations of, ‘How do I speak over this protocol to this other thing and have it respond in a way that is actually deterministic?’ It’s a huge challenge for everybody.
SE: What do you mean by reach? Does it relate to crossing functional blocks?
Cavagnaro: When the test is within a core, it’s very easy, because it’s self-contained from the MLC (mid-level cache) to the actual logic under test. But when you’re trying to reach an endpoint, for example, you need to go through a PCI Express link and back. That ‘reach’ is beyond structural-based functional test, unless you add extra stuff on the other side of it to be able to reach that content. There are easy things to reach, and then there are really hard things to reach.
Basoco: We always talk about signal paths, or data paths, because you just don’t know how far that path goes and what you’re going to hit as you move down it. From the functional test challenge perspective, do you have an input and an output within that signal path that are easy to set up, start, and stop, with enough functionality to give you insight into your actual test? Over the last two decades, across the various companies I’ve been in, we’ve kept changing tests, and what’s really helpful or not is usually what you can control within that signal path. When you don’t have that sort of signal creation or signal capture, it becomes really difficult.
Cavagnaro: I would add one additional thing on top of that, which is the electrical component. A large percentage of the time, scan can get you your stuck-at coverage. But with functional test, at the end of the day, you’re also looking for the electrical component. You’re trying to create di/dt events and asynchronous timings, because that’s where scan isn’t that great. But being able to control electricals and timings deterministically on something that isn’t in your immediate control is nearly impossible. So you end up doing these overkill tests, really pushing the part to a point of failure to see what you’re dealing with. It’s super difficult to control that in a predictable manner.
Basoco: Right. You want to put it into a worst-case scenario for using all these various blocks together. And I agree, sometimes you overkill it.
Hilliges: Rob said that stress is hard to control. Some people are using the portable stimulus standard (PSS) to create content. How do you create content? What’s your experience at that higher, system-oriented level, almost at the level of software?
Cavagnaro: The way we create our content is pretty standard. We just wrap assembly machine language with loaders on the front end and unloaders at the endpoints. That part isn’t tricky. The tricky part is when you’re trying to create the electrical stimulus. For example, if you’re trying to create a di/dt event in the middle of your test content, so that you get a ground bounce or a power-plane bounce during speed-path testing, you have to inject content on top of your content and interweave it to generate the di/dt, because the content itself doesn’t do it unless you program it to do it, which kind of doesn’t work. So you end up interweaving content to create this electrically hostile environment, and it’s just a huge challenge from a software perspective.
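As a rough illustration of that interweaving (in C rather than the assembly a real flow would use): alternate quiet windows with high-activity bursts so the current step, and the resulting grid bounce, lands exactly when the speed path is exercised. The burst body and iteration counts here are made up for illustration, not drawn from any production content.

```c
/* Sketch of interleaved di/dt content: quiet window, current-step burst,
 * then the speed-path segment sampled while the grid is still bouncing. */
#include <stdint.h>

static volatile uint64_t sink; /* keeps the compiler from deleting the work */

/* High-activity burst: dense integer churn to step current up sharply. */
static void didt_burst(int iters) {
    uint64_t a = 0x0123456789ABCDEFull, b = ~a;
    for (int i = 0; i < iters; i++) {
        a = a * 6364136223846793005ull + 1442695040888963407ull; /* LCG churn */
        b ^= (a << 13) | (a >> 51);
    }
    sink = a ^ b;
}

/* Idle window: let current drop so the next burst creates a fresh step. */
static void quiet_window(int iters) {
    for (volatile int i = 0; i < iters; i++)
        ; /* near-zero switching activity */
}

/* Placeholder for the real speed-path pattern under test. */
static void speed_path_segment(void) { sink ^= sink + 1; }

void interleaved_content(void) {
    for (int seg = 0; seg < 256; seg++) {
        quiet_window(64);      /* low current first... */
        didt_burst(64);        /* ...then a sharp rise: the grid bounces */
        speed_path_segment();  /* exercise the timing path during the bounce */
    }
}
```

The design point is the ordering: the functional content alone never produces the current step, so the burst has to be woven in around it, which is exactly the software headache described above.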
SE: How does functional test contribute to cost?
Cavagnaro: The cost of building these parts today is so great. For the more advanced SoCs, we’re attaching hundreds and hundreds of dollars of silicon onto a package. It’s expensive to throw away HBM stacks and to throw away good chiplets that are attached. The cost element has put an immense amount of pressure on catching things faster and earlier. That’s the true insanity of modern-day test development. People expect you to be perfect right out of the chute on products and processes that are absolutely not perfect. You end up being the gasket between the real world and what people think the real world should be. It’s a very difficult place to live some days.
Read part two of the discussion:
What’s Missing In Test
Different types of test all work, but catching every real and potential issue remains a challenge.