Swimming In Data

What’s happening at the extreme end of data overload, and why it matters so much for big data management.


By Ed Sperling
So many warnings about data overload have been issued over the past decade that people generally have stopped paying attention to them. The numbers are so astronomical that increases tend to lose meaning.

Nowhere is this more evident than in the semiconductor metrology world, where files are measured in gigabytes. And at each new process node, as the number of transistors, features, and measurements increases, so does the amount of data that has to be stored and processed.

At the extreme edge of this barrage of data is critical-dimension scanning electron microscopy (CD-SEM). Its files are measured in tens or hundreds of gigabytes. Just being able to decipher this much data requires a higher level of abstraction. There is no equipment powerful enough to process it within a reasonable time frame, and no way to retrieve that data quickly. In data terminology, it has to be mined just to be useful.

A scanning electron microscope, or SEM, takes measurements by sending out an electron beam, which interacts with electrons in the material being scanned. That sends back signals, which are mapped by the equipment. The more critical dimensions that need to be mapped, the greater the amount of data that needs to be processed and stored.
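A rough sense of scale helps here. The sketch below is a back-of-envelope estimate of raw CD-SEM image volume per wafer; all of the numbers in it (site counts, frame counts, image size, bit depth) are illustrative assumptions, not figures from this article.

```python
# Back-of-envelope estimate of raw CD-SEM image data per wafer.
# Every number below is an illustrative assumption.
sites_per_wafer = 500            # assumed measurement sites per wafer
frames_per_site = 16             # assumed frames averaged per site
pixels_per_frame = 1024 * 1024   # assumed 1k x 1k image
bytes_per_pixel = 2              # assumed 16-bit grayscale

raw_bytes = (sites_per_wafer * frames_per_site
             * pixels_per_frame * bytes_per_pixel)
print(f"~{raw_bytes / 1e9:.1f} GB of raw image data per wafer")
```

With these assumed inputs the raw images alone land in the tens of gigabytes per wafer, before any contours, metadata, or re-measurement data are added, which is why simply storing and retrieving the files becomes the first bottleneck.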

Source of the problem
“Everyone wants to use in-line characterization because you can see what you’re measuring,” said Carol Gustafson, metrology sector manager at IBM. “Other tools can’t show what it looks like. With CD-SEM you get numbers and images. A characterization lab can’t produce the quality of information that we can, but it’s a lot of data. And in the development phase, integrators and engineers like to look at in-chip information because it expands the number of targets you can look at.”

A lot of data is an understatement. Even IBM, which is credited with inventing data mining, won’t tackle this problem without help.

“In an area like this within IBM, at a 300mm wafer fab, with partners going in and out all the time, along with their customers, IT likes to keep a high view of the world,” said Gustafson. “But at 22nm we have more mask levels, and we have to measure more things. That also gives us more opportunity for metrology, and that increases the amount of data because now we’re dealing with the complexity of developing and understanding an OPC cycle or budgeting in litho patterning and yield issues. There are also more locations to look at what’s going on.”

IBM isn’t alone in trying to navigate this sea of data. What’s new, though, is trying to make sense out of the data to improve everything from a chip’s performance and power to yield.

“Since you’re already taking the image, it’s a question of how you take full use of that data,” said John Allgair, senior member of the technical staff and Fab 8 patterning metrology manager at GlobalFoundries. “We’ve been able to do traditional measurements of a feature or a line width. We’ve also seen some progress to 2D information or being able to add a contour around a feature, so you get the entire feature instead of a point. But it would be nice if we also could get height information with that. Right now you can partially get there with signal modeling or by tilting, or a combination of both.”

That’s more data, though. Add in more mask layers and increased density and the amount of data goes up yet again.

Both Applied Materials and Hitachi have been working to provide solutions to the data overload problem in CD-SEM. Of the two, Applied is the only one to bundle its data mining software with a multi-core machine. That machine contains 1 TB of RAM, which is large even by supercomputer standards.

Applied’s new data mining software, called TechEdge Prizm, provides CD-SEM tool/recipe/process issue identification, issue diagnosis, and tool-fleet matching. That delivers the abstraction layer necessary for critical insights within a reasonable period of time. What used to take days or weeks now can be done in seconds, and that’s particularly important because metrology increasingly is moving into the realm of distributions rather than hard numbers.

Consider line-edge roughness (LER), for example, where fixed numbers no longer exist. Rather than just reporting the data, Prizm also provides the context necessary to utilize that data in a meaningful way. “When you’re dealing with line-edge roughness, there is not just one number,” said Paul Llanos, product development manager for Prizm data mining software at Applied. “You can apply different algorithms and get different numbers. Even in production, it requires more work for optimizing measurement strategies.”
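Llanos’ point that different algorithms yield different LER numbers from the same data can be illustrated with a minimal sketch. The simulated edge positions and both definitions below are generic illustrations, not Prizm’s actual algorithms.

```python
import numpy as np

# Minimal sketch: two common ways to quantify line-edge roughness (LER)
# from the same extracted edge. Both the simulated data and the two
# definitions are illustrative assumptions, not any vendor's algorithm.
rng = np.random.default_rng(0)
edge_nm = 20.0 + rng.normal(0.0, 0.8, size=256)  # simulated edge positions (nm)

# Algorithm A: 3-sigma deviation about the mean edge position.
ler_3sigma = 3.0 * edge_nm.std()

# Algorithm B: 3-sigma deviation about a best-fit line
# (removes any linear tilt of the edge before measuring roughness).
x = np.arange(edge_nm.size)
slope, intercept = np.polyfit(x, edge_nm, 1)
ler_detrended = 3.0 * (edge_nm - (slope * x + intercept)).std()

print(f"3-sigma LER:   {ler_3sigma:.3f} nm")
print(f"detrended LER: {ler_detrended:.3f} nm")
```

The two numbers differ because detrending removes part of the variance, which is exactly why an LER report is only meaningful alongside the algorithm and measurement strategy that produced it.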

LER was never an issue at older nodes, because the measurement technology could take sufficiently accurate readings. But as feature sizes continue to shrink, even minor fluctuations in LER can impact performance for a device layer. There is a direct correlation between LER and the resolution of a lithographic image, and, as with 193nm lithography, the entire manufacturing industry has been searching for a better solution. This is easier said than done, however. CD-SEM so far has proved to be the least damaging technology, which makes it the most trusted choice at current process nodes. Other technologies can take measurements more accurately, but frequently at the expense of the features being measured.

“At 20nm the data is not as clean as it was at older nodes,” said Llanos. “For challenging levels such as via-in-trench, you can rebuild the target offline to consider different measurement strategies, but only if the source data exists and can be accessed quickly. This capability was being drowned out by the volume of data. What we’ve done is get smarter about aggregating the data, so you can get a more granular look at the wafer—what makes it run well versus poorly.”
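The aggregation Llanos describes can be sketched in miniature: roll raw per-site measurements up into per-wafer summaries while keeping the raw values available for drill-down. The records, field names, and grouping scheme below are hypothetical illustrations, not Applied’s actual data model.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical per-site CD measurements: (wafer_id, site, cd_nm).
# All values are made up for illustration.
records = [
    ("W01", 1, 21.8), ("W01", 2, 22.4), ("W01", 3, 22.1),
    ("W02", 1, 23.9), ("W02", 2, 24.6), ("W02", 3, 24.2),
]

# Group raw site measurements by wafer.
by_wafer = defaultdict(list)
for wafer, _site, cd_nm in records:
    by_wafer[wafer].append(cd_nm)

# Aggregate to per-wafer statistics; the raw lists in by_wafer remain
# available so a "well" vs. "poorly" running wafer can still be
# examined site by site.
summary = {
    wafer: {"mean_cd": round(mean(cds), 2), "sigma": round(stdev(cds), 2)}
    for wafer, cds in by_wafer.items()
}
print(summary)
```

The design point is that aggregation is a view over the data, not a replacement for it: the summaries make the fleet scannable, while the underlying site data stays reachable for diagnosis.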

This kind of data is being used in other portions of the manufacturing process, as well. “When we first rolled this out, it was only the CD-SEM engineers who had access,” said Dana Tribula, vice president and chief marketing officer for Applied Global Services at Applied Materials. “Now there are more than 100 engineers who use it on a regular basis.”

The genesis of a commercial version of this technology was less about a product than solving problems, though. Applied originally developed the technology as an internal tool, and IBM has been working with a variant of this approach for some time.

“Prizm puts lots of CD-SEM parameters together for us,” said IBM’s Gustafson. “We can look at multiple levels at the same time and multiple levels for the same litho level. We also can look at CD-SEM images that show changing aspects, patterns, SIT quality, and you can look at the parameters and images.”

The future
While this is a major step forward in managing the data, the reality is that resolution is falling short at advanced nodes. Lithography is both the key—and the stumbling block—to continued feature shrinks.

Approaches such as scatterometry are promising, but they have their own set of problems. The future also may entail a combination of what are now highly proprietary technologies. GlobalFoundries presented a paper at SPIE that focused on combining measurements from multiple tools.

“The more tools interact the better,” said Gustafson. “The question is whether the vendors are willing to do that or whether IT will have to make it happen. At least we have industry standards on formatting these days. That’s good, because while the resolution is okay at 22nm, at 14nm certain levels are going to struggle. In particular, the critical levels are going to start to fall apart. And as we continue to shrink, measurements will become more critical and more complicated.”

She’s not alone in seeing that. GlobalFoundries’ Allgair said there are serious questions about what happens beyond 14nm. “CD-SEM is still our most frequently used of any tool out there, but if you look at the papers submitted this year there are concerns about its evolution and the improvements that will be possible.”

But no matter what technology ultimately wins out, the reality is that the amount of data will continue to increase significantly with every new tweak and every new technology hurdle. Just managing that data will be a growing problem. Effectively using it will be even tougher.