Identifying nm-sized defects in a substrate, mixing FA with metrology, and the role of ML in production.
Experts at the Table: Semiconductor Engineering sat down to discuss how increasing complexity in semiconductor and packaging technology is driving shifts in failure analysis methods, with Frank Chen, director of applications and product management at Bruker Nano Surfaces & Metrology; Mike McIntyre, director of product management in the Enterprise Business Unit at Onto Innovation; Kamran Hakim, ASIC reliability engineer at Teradyne; Jake Jensen, senior product specialist at Thermo Fisher Scientific; and Paul Kirby, senior marketing manager at Thermo Fisher Scientific. What follows are excerpts of that conversation. Part 2 is here. Part 3 is here.
SE: What challenges exist for failure analysis in advanced packaging?
Chen: Packaging challenges come with scaling. With finer RDL pitches and smaller bump sizes, especially as you go to hybrid bonding, there's going to be a pull toward high-resolution techniques. Providing micron-level resolution is not that much of a challenge. But doing that with high coverage and high sampling is. Device manufacturers have been trying to find one tool that does it all, but this may require two different tool sets. That's very common at the front end, where you have an inspection tool that's very high speed and has good coverage to identify areas for review, alongside the failure analysis tools. Manufacturers are seeing the need because they can't keep up with the high density and the high sampling coverage required to guarantee quality. At some point in advanced packaging they will start to use two different tool sets. We are addressing high-speed inspection with full coverage for assembly processes, and we are adding machine learning because it's needed for that analysis.
SE: It sounds like we’re at this inflection point for advanced packaging to adopt the workflows that have long been used in wafer fabrication.
Kirby: It's not going to be one tool that does everything, by any means, because the range of length scales we're talking about is enormous. And that's one of the challenges we've seen. You're trying to find very small defects, like cracks, but over millimeters of area. There is no one size fits all. And the other aspect is destructive versus non-destructive. We've got a very expensive package that could be at a design house. You don't want to do a whole lot of destructive localization if you can avoid it. You will have to at some point, but postpone that as long as you can in the workflow.
SE: What kind of inspection can be done?
Chen: There is a good in-line X-ray metrology solution for when you're doing flip-chip bonding or thermal compression bonding. As you get to hybrid bonding there's a bit of a gap right now. You need something that's fast enough to do very large coverage. High sampling rates for hybrid bonding are still very uncommon. Right now, we're looking at the copper dishing and making sure the surfaces are clean for hybrid bonding. For RDL, you're starting to see some limits on optical inspection as you get to sub-micron line width and spacing. So again, there's an industry gap. We definitely need to investigate techniques that are fast enough. All the failure analysis tools are there. No one argues that you can do very high resolution at slow speed. It's just a question of what you can do over millimeters of length scale.
Jensen: I agree completely. Advanced packaging is one of the most distinctive spaces in semiconductors. For this kind of failure analysis, we need to span at least six orders of magnitude. We're going from millimeters down to potentially nanometer-scale defects. So for the FA, we might need to make millimeter-scale cross-sections. Then on the imaging side, we need the SEM resolution to localize fine defects and really characterize them in our areas of interest.
Kirby: People are using several approaches, and they all have their benefits and drawbacks. We're seeing a lot more use of lock-in thermography. The z resolution from the thermal signature will give you a good indicator of where the defect is. Then you can make access points (with lasers or FIBs), do cross-sectioning, and look at those interconnects.
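The z localization Kirby describes rests on the thermal diffusion length of the lock-in measurement. A minimal sketch of that first-order relationship, assuming a simple one-dimensional thermal-wave model and illustrative material values (the function names and numbers are assumptions for illustration, not any vendor's depth algorithm):

```python
import math

def thermal_diffusion_length(alpha_m2_per_s: float, lockin_freq_hz: float) -> float:
    """Thermal diffusion length mu = sqrt(alpha / (pi * f)) for lock-in frequency f."""
    return math.sqrt(alpha_m2_per_s / (math.pi * lockin_freq_hz))

def estimate_defect_depth(phase_lag_rad: float, alpha_m2_per_s: float, lockin_freq_hz: float) -> float:
    """First-order 1-D estimate: depth is roughly phase lag (radians) times diffusion length."""
    return phase_lag_rad * thermal_diffusion_length(alpha_m2_per_s, lockin_freq_hz)

# Illustrative values: bulk silicon (alpha ~ 8.8e-5 m^2/s), 5 Hz lock-in, 0.6 rad phase lag.
mu = thermal_diffusion_length(8.8e-5, 5.0)
depth = estimate_defect_depth(0.6, 8.8e-5, 5.0)
print(f"diffusion length ~ {mu * 1e3:.2f} mm, estimated defect depth ~ {depth * 1e3:.2f} mm")
```

Lowering the lock-in frequency lengthens the diffusion length and probes deeper into the package, at the cost of lateral resolution, which is the basic trade-off behind using the phase signature for depth.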
Jensen: On the physical FA side of things, we work with our dual-beam and SEM instruments. Once we've localized the defect, we need to be able to reveal it in a timely manner. But even more importantly, we need to do this at the specific site and in a robust way. These defects may be one-of-a-kind, so they're something we can't lose. Think about just milling into something: if we stop short of the defect, we've lost time, and if we go too far, that's even worse, because we've lost the defect. Correct x, y, z localization is important.
SE: If you just look at the packaging portion of the device you have different layers — bonding from the chip to the substrate, then the RDL level down to another set of bumps. Is there anything about the materials that needs to be considered when you look at the workflows?
Kirby: Definitely. Materials behave very differently. Whether you hit them with lasers of different wavelengths or mill with different ion species, it has to be very carefully thought out to optimize the process and make sure you have the right solution without damaging the various materials you're trying to mill. We have seen customers perform cross-sectional or diagonal milling of TSVs, and then do SEM metrology on those TSVs and packaging structures to find and isolate systemic defects. That's happened in the last couple of years.
McIntyre: I would add that it's taking that narrow form of data over a hundred points, instead of one or two points, and creating reports. That results in a data volume explosion. We now see inspection and metrology layers that are producing north of 10 gigabytes of data for one wafer. That's 200 million bumps with a dozen points of metrology, plus defects, plus imaging. The data volumes are staggering. If we could get that same level of data from a good SEM/TEM, transmitted to an analytic data structure that we can then tie in with the test results, that would be the dream for me.
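Those numbers are easy to sanity-check. A rough back-of-the-envelope sketch of the per-wafer volume, assuming 32-bit values (the byte count and parameter list are illustrative assumptions, not any tool's actual file format):

```python
# Rough per-wafer data volume for bump metrology on an advanced package.
BUMPS_PER_WAFER = 200_000_000   # "200 million bumps"
PARAMS_PER_BUMP = 12            # "a dozen points of metrology" (height, diameter, offset, ...)
BYTES_PER_VALUE = 4             # assume 32-bit floats

metrology_bytes = BUMPS_PER_WAFER * PARAMS_PER_BUMP * BYTES_PER_VALUE
print(f"raw metrology alone: ~{metrology_bytes / 1e9:.1f} GB per wafer")
# ~9.6 GB before defect records and images are added, consistent with
# per-wafer volumes north of 10 gigabytes for these layers.
```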
SE: What are the FA challenges for advanced CMOS processes?
Kirby: We've seen technology inflections at finFET and gate-all-around that drive really large increases in TEM sample volumes. We've estimated that TEM volumes double every five years. This is due to a couple of things. One is that shrinking device features drive the need for better-resolution techniques. The second is that TEM analysis becomes more valuable with the increased prevalence and complexity of 3D gate structures. We're seeing trends that drive more FA samples in both memory and logic segments. With logic you need better resolution because a defect might only be nanometers wide. Once you localize that defect, you make a sample, which allows good failure analysis. But the defect may be obscured by the 3D structures and surrounding metal, so you've got a very complex challenge to isolate a defect in three dimensions, and then correctly analyze and characterize it. In memory, whether DRAM or 3D NAND, the challenge is more about aspect ratios, which are getting very tall. The defects associated with etching, filling, and stacking these structures become more challenging. You need both cross-section analysis and plan-view analysis to see the structures top-down as you go down through the device.
McIntyre: A lot of the conversation is dealing with random defects. A lot of my experience in manufacturing is with systemic issues, for example where you're marginally thin in your metal line or your cross-section is marginally off, and these create the failure mechanism. So it's not a point failure, and as these systems become more and more complex, you end up creating more and more corners in the bounding box of functionality. At the scales we're talking about, it could very well be one monolayer of atoms. That systemic issue creates a difference between what functions and what doesn't, given a certain set of conditions. Then, extrapolating that out, when you start building a chip you expect it to go into a wider variety of specs. The same CPU, for example, goes in a laptop, a server, and a desktop. Those are different environments, and the same physical structure may only work in one of them. From an analytic standpoint, that systematic loss also needs to be understood.
Hakim: One of the things that concerns me is that if you look out 10 years from now, we are planning to go into sub-nanometer technologies. How are we planning in the next 10 years to be able to deal with 0.7 nanometer? Because now you put that into a 3D structure, which exponentially increases the issues surrounding the device.
Kirby: The challenges at the actual analytical level can be met. A TEM image can get you sub-angstrom resolution. Where you get all the challenges is in localizing the defect and making a sample that captures it in three-dimensional space. That's where endpointing and more automation are required, because the number of defects and device types can vary per design. You're trying to make a TEM sample on the order of nanometers wide. This requires a level of automation not previously needed. Automating workflows at that level is really difficult. That's why we need to use machine learning.
SE: Are there differences among large foundries, IDMs, and more specialized device manufacturers in terms of FA?
Hakim: The size of the business matters. If you are a big company and the foundry is producing millions of chips for you, naturally they are going to bend over backwards for you. The amount of ASIC development that we're doing is going to be small. As a result, we are not going to have the clout to have that conversation with the big foundry. But we're going to get to a point, particularly when you're addressing machine learning and deep learning, where as an industry we need to rise above this. We need to allow information to flow and try to address the problem at that level, because the benefit is going to be two-sided. My design, even though it's low-volume, might be such a sensitive design that it's going to pick up things from the foundry process that none of the other designs see. We need to be able to work together. However, I do not believe that understanding is in place at this point.
McIntyre: My takeaway is this is the revenge of the IDMs, because they’re going to have a considerable advantage in that they have the manufacturing history a foundry can’t provide to their customers.
Hakim: The conversation regarding systematic failures points out a data issue. The fab is producing something, and it can see the complete test results. It can correlate the test results with what's happening in the fab. However, if I am an end user and a foundry is manufacturing the design for me, the only thing I have access to is the process control monitor (PCM) data. They are not willing to show what's happening in the fab. So I have the PCM data and my test data, and from those alone I need to make judgments about the root cause of, say, a donut-shaped defect pattern that is showing up on the wafer. Whom am I going to talk to in order to find out exactly what happened in the process?
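Working only from PCM data plus the end user's own test data, as Hakim describes, is essentially a correlation exercise. A minimal sketch of how a first-pass screen for that kind of donut-shaped pattern might look; the file names, column names, and radial binning are hypothetical illustrations, not a standard FA workflow:

```python
import pandas as pd

# Hypothetical inputs: foundry-supplied PCM data per wafer/site, and the end
# user's own per-die test results. Die coordinates are assumed to be centered
# on the wafer so that radius is simply sqrt(x^2 + y^2).
pcm = pd.read_csv("pcm_by_wafer_site.csv")    # wafer_id, site, vth, rs_metal1, ...
test = pd.read_csv("die_test_results.csv")    # wafer_id, die_x_mm, die_y_mm, pass_fail

# Yield versus radius exposes a donut-shaped signature if one exists.
test["radius_mm"] = (test["die_x_mm"] ** 2 + test["die_y_mm"] ** 2) ** 0.5
yield_by_radius = test.groupby(pd.cut(test["radius_mm"], bins=10), observed=False)["pass_fail"].mean()
print(yield_by_radius)

# First-pass screen: which wafer-level PCM parameter moves with wafer-level yield?
wafer_yield = test.groupby("wafer_id")["pass_fail"].mean().rename("yield")
wafer_pcm = pcm.groupby("wafer_id").median(numeric_only=True)
print(wafer_pcm.join(wafer_yield).corr(numeric_only=True)["yield"].sort_values())
```

A correlation hit like this doesn't identify the process step, but it gives the end user something concrete to bring to the kind of foundry conversation Hakim is asking for.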
Chen: Some decisions are volume-driven. But when you add up a few small customers, that ends up being a sizable customer. It helps to have everyone aligned around the message that we need investments. The IDM definitely has an advantage in justifying that investment. For foundries and OSATs, there will be a charge-per-use type of model, but the commercialization can work out so the whole industry can make use of these advancements.