Integrating Data From Design, Manufacturing, And The Field

Knowing where circuits came from, and the conditions in which they operate, can help designers optimize devices already in the field.


Chip design is starting to include more options to ensure chips behave reliably in the field, boosting the ability to tweak both hardware and software as chips age.

The basic problem is that as dimensions become smaller, and as more features are added into devices — especially with heterogeneous assemblies of chiplets running some type of AI — the potential for thermally induced structural damage and accelerated electromigration becomes more acute. The way around that is to build some level of resiliency into these systems, leveraging data collected from monitors inside and outside a chip or package combined with silicon lifecycle management technology.

That adds some significant challenges, however, in collecting, managing, and utilizing data throughout a chip’s lifetime, and especially integrating all of it in a way that can be used by multiple different stakeholders across the design through manufacturing flow.

“The primary consideration is being able to identify this element and walk through every stage of manufacturing, where it’s been in its life, as well as what combinations of things are combined,” said Marc Hutner, director of product management, Tessent Yield Learning, at Siemens Digital Industries Software. “You know where it was on a wafer because of its x and y coordinates. You know what package it’s in, and what groups of chiplets are combined with it, as well. Then you can combine all those elements, because all the data associated with each of those chiplets is also associated with that traceability element.”

That data needs to be set in the context of other chips, packages, and wafers, as well — essentially tying potential issues back into the entire design-through-manufacturing flow. “I can start to ask, ‘Is this a trend that I should be worried about?'” Hutner said. “‘Did other things occur in the same way?’”
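In practice, that traceability element amounts to a record keyed by a chip ID and linked to everything known about the die. The following is a minimal sketch of what such a record and Hutner's trend query might look like, with all field names and structures hypothetical rather than drawn from any particular tool:

```python
from dataclasses import dataclass, field

@dataclass
class DieTraceRecord:
    """Illustrative traceability record for one die (all field names hypothetical)."""
    chip_id: str          # electronic chip ID read from the die itself
    lot_id: str           # fab lot the wafer came from
    wafer_id: int         # wafer number within the lot
    x: int                # die x coordinate on the wafer
    y: int                # die y coordinate on the wafer
    package_id: str = ""  # package the die was assembled into
    sibling_chiplets: list[str] = field(default_factory=list)      # other chiplets in the package
    test_results: dict[str, float] = field(default_factory=dict)  # parametric data by test name

def dies_in_same_package(records: list[DieTraceRecord], package_id: str) -> list[DieTraceRecord]:
    """Which chiplets were combined in this package?"""
    return [r for r in records if r.package_id == package_id]

def wafer_neighbors(records: list[DieTraceRecord], lot_id: str, wafer_id: int,
                    x: int, y: int, radius: int = 2) -> list[DieTraceRecord]:
    """Trend query: did neighboring dies on the same wafer behave the same way?"""
    return [r for r in records
            if r.lot_id == lot_id and r.wafer_id == wafer_id
            and abs(r.x - x) <= radius and abs(r.y - y) <= radius]
```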

Where traceability starts
While traceability has been an important issue in manufacturing for years, advanced designs have pushed the need to combine data much further forward. Traceability has crept from RMAs in the field back to the front end of the line, and into the placement of monitors in a design, and it now is starting to permeate much more of the chip design flow. Design teams need to understand issues such as process variation in the fab and plan for it, not just in simulation.

Proper traceability begins before a wafer is even diced, with each die receiving a unique Exclusive Chip Identification (ECID) or bar code ID. As the die undergo testing, data collection already has begun, pinpointing differences in the physical condition of each die and how it performs.

“There is a lot of variation,” said Ashok Alagappa, engineering manager at Ansys. “Semiconductor manufacturing is very sensitive to process variation, meaning there can be variation coming from the process tools. You can have variation coming from the wafer itself, for example, at the edge or in a deposition step in wafer fabrication. When you deposit a thin film of oxide, the thickness won’t be uniform from the center of the wafer to the edge, so if the film at the center of the wafer is thick, then the device performance and electrical parameters will not be the same, even though they are designed with the guard-band in mind and with a three-sigma process variation. A chip from the center still may not be the same as a chip that’s toward the edge of the wafer. Then, every wafer goes through different tool sets across the manufacturing environment, and there can be variation there, as well.”
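The center-to-edge effect Alagappa describes can be checked directly once a die's wafer coordinates are known. The sketch below flags dies that sit in an outer band of the wafer, where film thickness tends to deviate most; the wafer size and band width are illustrative assumptions, not production rules:

```python
import math

def is_edge_die(x_mm: float, y_mm: float,
                wafer_radius_mm: float = 150.0,
                edge_band_mm: float = 20.0) -> bool:
    """Flag dies whose center falls within an outer band of the wafer,
    where deposition thickness (and hence electrical parameters) typically
    deviates most from the center. Band width is an assumed placeholder."""
    r = math.hypot(x_mm, y_mm)  # radial distance from the wafer center
    return r > (wafer_radius_mm - edge_band_mm)

# Example: a die 140mm from center on a 300mm wafer sits in the edge band.
print(is_edge_die(140.0, 0.0))  # True
```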

Ultimately, this can have a big impact on the overall cost of developing a chip or chiplets in a package. “This is an ability to trace the yield problems and find the root cause as soon as possible, because the more failures you’re having during production, the more costly it is,” said Guy Cortez, senior staff marketing manager at Synopsys. “Also, further down the line in manufacturing, if you start to see failures during packaging — after dies are packaged together, the packages can be very expensive — then it becomes even more of a cost problem.”

Tying all of this together is complicated. It can involve everything from electronic IDs to which version of IP was used, and it becomes even more complex when it involves different versions of chiplets. For the time being, aside from HBM, most chiplets in use are either developed in-house or developed by third parties to a particular specification. But as more commercial chiplets enter the market over the next few years, a plan needs to be in place for characterizing them. That characterization data then needs to tie into all of the other data being collected, managed, and leveraged, such as the oxide characteristics at the specific point on the wafer where a die originated, manufacturing conditions in an etch chamber, and workload utilization in the field.

“Normally you start to do some fault analysis and failure analysis in the labs,” said Cortez. “Eventually they might know which die or which issue is causing the problem. As you start to bring in the data from manufacturing, assuming the supply chain is all connected somehow, you can go back and ask for the data on that one particular die. All of a sudden, you might notice this one was on the fringe of fulfilling certain tests, but it was allowed to pass through manufacturing, and it ended up in the end product when it probably should have been binned out in production.”
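The “fringe” case Cortez describes can be screened for with a simple marginality check: a die passes every test, but a measurement lands close enough to a spec limit to warrant a second look before packaging. This is a sketch only, with hypothetical record formats and an arbitrary 5% margin:

```python
def marginal_tests(measurements: dict[str, float],
                   limits: dict[str, tuple[float, float]],
                   margin_fraction: float = 0.05) -> list[str]:
    """Return tests that passed but landed within `margin_fraction` of
    either spec limit, i.e., dies that probably deserve a second look.
    The 5% margin is an illustrative choice, not an industry rule."""
    flagged = []
    for test, value in measurements.items():
        lo, hi = limits[test]
        band = (hi - lo) * margin_fraction
        if lo <= value <= hi and (value < lo + band or value > hi - band):
            flagged.append(test)
    return flagged

# A die that passes a 0.50V upper limit at 0.498V is a marginal pass.
print(marginal_tests({"vth": 0.498}, {"vth": (0.30, 0.50)}))  # ['vth']
```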

The nature of those performance aberrations can be highly specific, which is useful for pinpointing a problem in one device or manufacturing process, as well as providing a short-term fix and a potential long-term workaround in a manufacturing process.

“The very common methodology is to monitor process in terms of what the variation is within a chip, wherever that particular chip is in a certain field environment,” said Ansys’ Alagappa. “Is it operating at a slow corner or a fast corner or a typical condition? Then there are voltage and temperature monitors. Those parameters can be utilized if a particular chip was manufactured at the edge of the wafer and there are process variations from the center to the edge. Maybe a particular chip in a certain application came from the edge of the wafer, and by virtue of the center-to-edge process variation it is operating in a slow condition. Then, based on these measurements and monitors, they can adjust the voltage.”
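What Alagappa describes is essentially an adaptive voltage scaling loop: on-chip monitors classify the die's process corner, and the supply voltage is nudged accordingly. The sketch below illustrates the logic with made-up monitor thresholds and step sizes; real implementations are closed-loop and considerably more conservative:

```python
def classify_corner(ring_osc_freq_mhz: float,
                    slow_below: float = 950.0,
                    fast_above: float = 1050.0) -> str:
    """Classify the process corner from an on-chip ring-oscillator monitor.
    Thresholds are illustrative placeholders, not real silicon values."""
    if ring_osc_freq_mhz < slow_below:
        return "slow"
    if ring_osc_freq_mhz > fast_above:
        return "fast"
    return "typical"

def adjust_voltage(vdd_mv: int, corner: str, step_mv: int = 10) -> int:
    """Nudge the supply: raise voltage on a slow die (e.g., one from the
    wafer edge) to recover speed, lower it on a fast die to save power."""
    if corner == "slow":
        return vdd_mv + step_mv
    if corner == "fast":
        return vdd_mv - step_mv
    return vdd_mv

vdd = adjust_voltage(800, classify_corner(930.0))  # slow die -> 810 mV
```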

In automotive chips, over-the-air analytics can be acquired rapidly, allowing for assessment testing while the device is in use. That enables device-specific data collection and feedback in near-real-time. However, this kind of connected-vehicle analytics is still relatively immature, and at this point it’s difficult to predict how it will advance as cars age in the coming years.

Analyzing and implementing the data
Collecting the necessary data is only the first challenge. Who owns that data, and who has the right to see it, have been contentious questions across the chip industry for decades. And that problem only grows as the amount of data increases in proportion to the complexity of a design and its manufacturing and packaging processes.

On the design, verification, and testing side, the amount of data is exploding, dragging the design world into a data management problem far beyond anything it has dealt with in the past. “As designs get bigger and bigger, documentation is another aspect that engineers really don’t like to do,” said Simon Rance, general manager and business unit leader for process data and management at Keysight. “The end result is a poor job, because they either don’t have enough time to do it, or it’s an afterthought that they have to document the design, the verification plan, as well as the verification process.”

Solutions are being developed to minimize the data management and traceability issues, but the bigger challenge will likely involve integration of data from design all the way into the field and back again. That requires a complex set of skills, and it will likely open up new jobs around identifying patterns and using those patterns to optimize circuits.

“This could be a data analytics engineer’s dream job,” said Alagappa, especially as artificial intelligence-based tools have emerged that can aid in pattern detection. “Based on the inline monitors (thickness data, process data, electrical data, test data), there is a lot of data analytics that goes into place at a system level or field-applications level to see, from the output standpoint, which parameter of interest is causing the chip to behave outside the typical operating conditions. The traceability is not a problem, in my opinion, because it can be done, but it’s obviously more and more data.”
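One simple form of that analysis is to group field returns by a manufacturing attribute and look for a group whose failure rate stands out. A sketch, with hypothetical record fields:

```python
from collections import defaultdict

def failure_rate_by_group(records: list[dict], group_key: str) -> dict:
    """Group field returns by a manufacturing attribute (e.g., lot,
    edge-die flag, etch chamber) and compute the failure rate per group.
    A rate that stands out in one group points toward a root cause.
    The record fields used here are hypothetical."""
    totals = defaultdict(int)
    fails = defaultdict(int)
    for r in records:
        key = r[group_key]
        totals[key] += 1
        fails[key] += 1 if r["field_failure"] else 0
    return {k: fails[k] / totals[k] for k in totals}

records = [
    {"etch_chamber": "A", "field_failure": False},
    {"etch_chamber": "A", "field_failure": False},
    {"etch_chamber": "B", "field_failure": True},
    {"etch_chamber": "B", "field_failure": False},
]
print(failure_rate_by_group(records, "etch_chamber"))  # {'A': 0.0, 'B': 0.5}
```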

Such comprehensive data also could be used to improve the performance of systems already in the field via updates. It’s an approach that is “constantly evolving,” said Alagappa.

Beyond devices already in the field, assembling this kind of data can be a valuable resource when designing future iterations of a device, improving performance and efficiency, as well as extending lifespan.

“There’s always huge promises with silicon lifecycle management from understanding design style, and understanding its interaction with a technology node,” said Hutner. “What are common failure modes? By using certain kinds of memories or certain kinds of transistors you could really feed forward a lot of information by understanding how well it’s working. That’s true not only on the product, if it’s a long-lifetime product like in automotive, but if you had an evolutionary architecture on the same node, you could get that learning feeding back, as well.”

As devices migrate to more advanced nodes, the data can enable tweaks in accordance with the process node, as well, which would be particularly useful in a package of chiplets developed at different nodes.

“It’s even more important as you move into the chiplet era,” Hutner said. “There are some use cases like combining chiplets, knowing which IDs came from which ones. You might do more things. People talk about things like die matching, and you can now double check that those were the right things that were supposed to be die matched at that point. Where we used to get away with things like scribing stuff on the lid of parts, having a chip ID and a traceability ID is the next thing, because you can’t look on the top of an MCM and know where all these chiplets came from. You have to do it all electronically with some traceability element.”
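The electronic double check Hutner mentions reduces to comparing the chiplet IDs read out of an assembled package against the combination the assembly plan called for. A sketch, with hypothetical ID formats:

```python
def verify_die_matching(read_ids: set[str],
                        planned_ids: set[str]) -> tuple[bool, set[str], set[str]]:
    """Compare chiplet IDs read electronically from an assembled MCM
    against the IDs the assembly plan paired together. Returns whether
    they match, plus anything unexpected or missing."""
    unexpected = read_ids - planned_ids
    missing = planned_ids - read_ids
    return (not unexpected and not missing), unexpected, missing

planned = {"LOTA-W3-X12Y07", "LOTB-W1-X04Y15"}  # hypothetical traceability IDs
read = {"LOTA-W3-X12Y07", "LOTB-W1-X04Y16"}     # one chiplet doesn't match
ok, unexpected, missing = verify_die_matching(read, planned)
print(ok, unexpected, missing)  # False {'LOTB-W1-X04Y16'} {'LOTB-W1-X04Y15'}
```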

That, in turn, can be used to improve the performance of a device on multiple levels. “They use data to improve the current generation of product based on the parameters from both the test and the field,” said Alagappa. “It also is being utilized to improve that particular product’s functionality in those specific operating conditions. Whatever the impact of the environment and operating conditions might be for that product, that also is being optimized based on the SLM.”

The data also can be used to identify the root cause of problems at an especially granular level.

“You start to go into these higher level systems, but you still need to be able to trace what’s going on at the lower level,” said Synopsys’ Cortez. “If you start to have issues at that system level, then how can you go back in time and see if you can trace the source of the problem?”

Conclusion
Data management has become a critical element in optimizing performance for devices in the field and for future iterations of those devices and their components. But the next step of tying all of that data together can open up huge new capabilities that span from design to field and back again.

However, that will depend on finding the root causes of failures and understanding what went wrong, potentially in only a handful of cases across millions of devices with different use cases and workloads. This is where AI will become essential, and the impact will be felt at all levels if this approach turns out to be as successful as its proponents envision.

Related Reading
SLM Evolves Into Critical Aspect Of Chip Design And Operation
Silicon lifecycle management applications and techniques are gaining traction as chipmakers figure out how to use them more effectively.


