Design data and metadata are ballooning, and no one is quite sure how long to save them or what to delete.
The continued unbundling of SoCs into multi-die packages is increasing the complexity of those designs and the amount of design data that needs to be managed, stored, sorted, and analyzed.
Simulations and test runs are generating increasing amounts of information. That raises questions about which data needs to be saved and for how long. During the design process, engineers now must wrestle with the data and metadata of each individual chiplet, the interconnects, the package, and the overall system. This can include multiple substrates, interposers, and non-electrical elements.
“It’s approached almost like an extra layer of hierarchy, so the top level is now top level beyond the top level,” said Marc Swinnen, director of product marketing at Ansys. “The block is now the assembly.”
The chiplet ecosystem is currently almost completely vertically integrated, but that could change in the coming years as the industry moves toward an open chiplet ecosystem. Though the ability to mix and match will offer unprecedented flexibility, it also will prove to be yet another data management challenge, with security and IP concerns added to the mix.
Chiplets generate more data
Chiplets are highly specific in terms of functionality, but they can be assembled in unique ways to target specific domains, applications, and workloads. That has opened up a wide range of possibilities, but it also has spurred new data management challenges. More simulations are needed, which in turn creates more data.
“With chiplets, the advantage is you can have all these different capabilities that together can provide some sort of bigger system or bigger value or solve a bigger problem,” said Simon Rance, general manager and business unit leader for process and data management at Keysight. “The more applications, the more data that has to be created — especially when it comes to testing and verifying the systems as to how will it perform under ‘this situation’ or ‘this application.’ And is everything interoperable? That data just keeps exploding.”
The surge includes both data and metadata, which adds an additional layer of complexity. Chiplets need to be characterized, simulated, and tested in the context of a specific design, which is one of the reasons it has been so difficult to launch a commercial chiplet marketplace. One chiplet may behave differently in the context of other chiplets, or under different thermal or noise conditions, and all of that needs to be documented.
“Everybody thinks about libraries, design kits, and all the physical data that gets managed in some data management system, but it’s also all of the related data, the metadata that goes along with it,” said Michael Munsey, vice president of semiconductor industry at Siemens Digital Industries Software. “You have the files that define what it is, but you have the metadata that defines what the state of things are.”
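As a rough sketch of the kind of metadata Munsey describes, the record below captures the state of a chiplet design artifact alongside its physical files. The fields and names are illustrative assumptions, not an industry-standard schema.

```python
# Illustrative sketch only -- field names are assumptions, not an industry standard.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class ChipletMetadata:
    """Metadata describing the state of a chiplet design artifact."""
    name: str                      # e.g., "io_die"
    design_version: str            # version of the physical design files
    process_node: str              # foundry process the chiplet targets
    characterized_corners: list = field(default_factory=list)  # PVT corners covered
    verification_status: str = "unverified"                    # e.g., "signed_off"
    last_updated: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = ChipletMetadata(
    name="io_die",
    design_version="2.3.1",
    process_node="N5",
    characterized_corners=["ss_0p675v_125c", "ff_0p825v_m40c"],
    verification_status="signed_off",
)

# Store the metadata next to the design files it describes.
print(json.dumps(asdict(record), indent=2))
```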
The challenge of organizing data
While having more data than you need is better than having less, EDA tools have not yet reached a point where AI can be relied on to properly sort and analyze the data on its own. Human error can cause issues, as well, such as when engineers do not document their work as thoroughly as is necessary. With SoCs, keeping track of hierarchies can be built into the system, but that’s not always the case with chiplets. The added complexities of all the various components in and connected to the package can be daunting.
“People only save or think about saving information they think will be relevant in the future, but they can’t be prescient and know about everything that they need to save,” said Munsey. “Typically, what happens is there may be a version of the design that gets lost. When you’re running EDA tools, doing power analysis, it’s very easy to overwrite one log or report file from one run to another run. Sometimes people aren’t as diligent about saving everything they need to save. You may find yourself in a situation where the data may be missing, or you may have to re-characterize it, or there are just things you never thought about in the first place.”
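One common way to avoid the overwriting problem Munsey describes is to give every tool run its own timestamped output directory plus a small manifest describing what produced it. The sketch below illustrates that general pattern; the directory layout, names, and manifest fields are assumptions, not part of any particular EDA flow.

```python
# Generic sketch: give each analysis run its own directory so logs and
# reports are never overwritten. Names and fields are illustrative assumptions.
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

def archive_run(tool: str, design: str, log_file: Path, report_file: Path) -> Path:
    """Copy a run's log and report into a uniquely named, timestamped directory."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    run_dir = Path("runs") / f"{design}_{tool}_{stamp}"
    run_dir.mkdir(parents=True, exist_ok=False)

    shutil.copy2(log_file, run_dir / log_file.name)
    shutil.copy2(report_file, run_dir / report_file.name)

    # A small manifest records what produced the data, so it can be found later.
    manifest = {"tool": tool, "design": design, "timestamp": stamp,
                "files": [log_file.name, report_file.name]}
    (run_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return run_dir

# Example: archive the outputs of a power-analysis run.
# archive_run("power_analysis", "chiplet_top", Path("power.log"), Path("power.rpt"))
```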
That risk of missing or incomplete data is particularly acute for multi-vendor chiplets that may be in the market for a decade or more, such as those used in a car. “There’s a whole topic that people have overlooked in the chiplet space, which is reliability and liability,” said John Lupinski, vice president of product engineering at Blue Cheetah. “You may have four companies’ chiplets in a package. So now you’ve deployed it in a car and you have a recall. So maybe three companies say their chip looks fine and the other guy has a problem. So now you’ve got to recall the whole unit. Whose fault is it? Maybe it’s a packaging issue. It’s a very complicated space.”
Having data to back up how a chip was designed, changed, and tested, and in which manufacturing lot it was built, is critical to avoiding these problems. But in automotive, that problem might not surface for a decade, and there’s no guarantee that the companies that made the chiplets will still be independent companies, or in business at all.
AI adds yet another level of complexity because the utilization of on-chip/in-package resources is significantly higher than in many other designs. “We’re putting all these different things inside a package now,” said Sue Hung Fung, principal product line manager for chiplets at Alphawave Semi. “And we’re building things out for AI/ML because we’re seeing this huge driver of lots more data in the data center doing the LLM training for AI and for doing inferencing. Depending on what type of cores we put inside of the compute, that’s the kind of performance we can get out of it. So the type of cores and how many cores would depend on that specific application use case.”
This means data collected from one design may be very different from data collected from another, unlike a mobile phone application processor, which may be exactly the same across hundreds of millions of units.
Data is required at the mechanical manufacturing level, as well. The routing on interposers in chiplet-based packages is unlike the routing typically found in SoCs. As Ansys’ Swinnen noted, the Manhattan-style routing of SoCs often is replaced by “weird octagonal shapes and curved lines to get squeezed in between. As a result, multi-dies have specialized routers built into the tools to handle those redistribution layers. They use the same formats, but essentially they’ve extended them.”
While the difference in interposers doesn’t necessarily create more data, it does require more documentation. Given the shortage of engineers, which makes it easy for them to change jobs, as well as the aging of the hardware engineering workforce, the resulting lack of institutional memory can cause a major hiccup in the workflow.
“Simulations and tests of the routing, and making sure that it is optimal and doing it in the timing requirements that are set, is where data comes into play,” said Keysight’s Rance. “It’s not really a different problem from an SoC. It’s very similar to PCB design in this area. They’ll just add extra wires, and they’ll do side things to get around the problem. But that will open up vulnerabilities. It probably wasn’t there in the initial spec. A data management system can say, ‘You did this. Now let’s take what you just did and put that in the data management system for you so that it’s recorded.’ Engineers will move from company to company, and they’ll move from design team to design team. What if the next chip that’s designed is another iteration of the previous one? And then the new engineers say, ‘Well, why the heck did we have these three wires over here?’”
That difference creates yet more data, even after the EDA process is complete.
“There’s going to be a design bill of materials (BOM), and the BOM is going to have to keep track of what versions of chiplets are going into the design, which version of the substrate, which interposer, as well as non-electrical things like stiffeners and leadframes and things like that,” noted Siemens’ Munsey. “All this has to come together now and be managed and kept track of at a system level.”
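As a hypothetical illustration of the system-level tracking Munsey describes, a design BOM can be modeled as a versioned list of electrical and non-electrical components. The structure and names below are assumptions for illustration, not a defined industry schema.

```python
# Hypothetical sketch of a system-level design BOM; not an industry schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class BomItem:
    component: str     # e.g., "compute_chiplet", "interposer", "stiffener"
    supplier: str      # who provides the component
    version: str       # exact revision that goes into this assembly
    electrical: bool   # non-electrical parts (stiffeners, leadframes) are tracked too

package_bom = [
    BomItem("compute_chiplet", "vendor_a", "3.1.0", electrical=True),
    BomItem("io_chiplet",      "vendor_b", "1.4.2", electrical=True),
    BomItem("substrate",       "vendor_c", "rev_c", electrical=True),
    BomItem("interposer",      "vendor_c", "rev_b", electrical=True),
    BomItem("stiffener",       "vendor_d", "rev_a", electrical=False),
    BomItem("leadframe",       "vendor_d", "rev_a", electrical=False),
]

# Managing the assembly at the system level means tracking every revision together.
for item in package_bom:
    print(f"{item.component:16} {item.supplier:10} {item.version}")
```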
More data means more time
That increase in available data means compute and time become precious resources that must be carefully managed. To this point, Keysight’s Rance said a formal verification run for a large SoC often can take a full day. Any errors must then be examined and corrected, and the process repeats itself. There’s a bit of irony there, in that this time-consuming process, which is the result of having so much information, only generates more data, and teams need to know what to keep and what to discard.
“There’s this iteration loop that happens between design teams and verification teams, and that data explodes very quickly because they’re doing so much ‘what if’ analysis and they can’t get rid of the data because they don’t know if fixing one thing broke another thing,” Rance said. “They need to keep the data retrospectively for some period of time to go back and compare and contrast until they converge on a solid design that’s verified. That’s the other part of what we’re seeing. In the data management explosion there’s now this fear that we don’t know what the right time is to delete data that’s outdated or old or not needed, and so they keep it.”
There is some progress on this front. Methodologies are being developed inside of companies, and tools are being developed to identify which data has not been touched for a period of time and probably can safely be trashed.
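A simplified version of that idea is sketched below: walk a project’s run directories and flag files that have not been modified within a retention window. The path and threshold are assumptions for illustration, and real tools would weigh far more context before deleting anything; this sketch only lists candidates.

```python
# Simplified sketch: flag files untouched beyond a retention window as
# candidates for deletion. Threshold and path are illustrative assumptions.
import time
from pathlib import Path

def stale_files(root: Path, max_age_days: int = 180) -> list[Path]:
    """Return files under 'root' whose last modification is older than the window."""
    cutoff = time.time() - max_age_days * 24 * 60 * 60
    return [p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff]

# Example: list candidates rather than deleting them outright,
# leaving the final call to a human or a review step.
# for path in stale_files(Path("runs"), max_age_days=180):
#     print("candidate for deletion:", path)
```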
Challenges ahead in integration
If deleting unneeded data solves one big problem, there’s another daunting challenge on the horizon. Currently, the chiplet market is almost entirely vertically integrated, but the day is likely to come when heterogeneous 3D-ICs are assembled using chiplets from any number of suppliers in a LEGO-like, plug-and-play environment. Ansys’ Swinnen said this will create a challenge for data management due to IP protection concerns, but he believes those challenges can be overcome by adapting some of the best practices already in place when it comes to SoCs that contain multiple IP cores.
“That is a latent concern because most of the 3D-ICs are actually vertically integrated,” Swinnen noted. “The company makes all the chips themselves so there’s no concern about that from a lot of these big customers. But certainly, when you start looking at a chiplet market, you’re going to have to have the same concerns as in IP today, and this will be addressed similar to the way we address an IP. There will be hierarchical formats that simplify the interface description. There will be encryption. We already deal with some proprietary data. For instance, TSMC allows us to do certain analyses on a chip that requires an understanding of the actual layer-by-layer properties, which they don’t want to make publicly available. We read their technology file in an encrypted format. Inside the tool, we do the analysis and only present the customer with the final results of what these net out to. We don’t ever reveal the internals of the calculation.”
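The general pattern Swinnen describes, reading proprietary data in encrypted form, analyzing it internally, and exposing only the netted-out result, can be shown with a toy sketch. The Fernet cipher from the third-party cryptography package and the resistance numbers below are stand-ins chosen for illustration; they do not represent Ansys’ or TSMC’s actual formats or calculations.

```python
# Toy illustration of analyzing encrypted proprietary data and exposing only
# aggregated results. Fernet and these numbers are stand-ins, not real formats.
import json
from cryptography.fernet import Fernet  # third-party 'cryptography' package

def analyze_encrypted_tech(encrypted_blob: bytes, key: bytes) -> dict:
    """Decrypt layer properties in memory and return only a netted-out summary."""
    tech = json.loads(Fernet(key).decrypt(encrypted_blob))
    total = sum(tech["layer_sheet_resistance"].values())
    return {"total_stack_resistance": total}  # internals are never exposed

# The data owner encrypts the technology description before handing it over.
key = Fernet.generate_key()
proprietary_tech = {"layer_sheet_resistance": {"M1": 0.09, "M2": 0.07, "RDL": 0.02}}
blob = Fernet(key).encrypt(json.dumps(proprietary_tech).encode())

print(analyze_encrypted_tech(blob, key))  # only the aggregate result is revealed
```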
Security adds yet another element to the data management problem. Jim Schultz, product marketing manager at Synopsys, said this is particularly tricky in light of current geopolitical sensitivities, which have led to restrictions on certain advanced nodes across borders.
“There’s definitely a security issue where you don’t want to necessarily share data across different chiplets, as different teams are involved,” Schultz said. “Also, they could be different technology nodes. That’s a big sensitivity. For example, if you’re on a technology node that is restricted, certain countries only have access to certain nodes. Even within a company, it’s still a sensitive node, so you probably wouldn’t want to share that. You would probably have to get permissions for certain teams to be able to see that data.”
Munsey has a more optimistic outlook on how this ecosystem will continue to evolve. “Within the realm of SoCs, there are already systems in place to handle proprietary IPs,” he said. “To protect that IP, some data is, by design, withheld from the client that’s using it. It’s not going to be so much a nice-to-have as a necessity to have more characterization information that needs to go on with the chiplets. That’s something that’s going to evolve over time, and people are going to realize it’s just not enough for me to deliver the chiplet. It’s going to be delivering the chiplet with some additional information, whether that’s stored in the chiplet design kits, whether that’s stored as metadata that goes along with the design. That’s yet to be determined.”
Conclusion
Data management is a constant challenge, and one that has become even more complex in the emerging chiplet ecosystem. Data management systems must handle data and metadata regarding the testing and verification of numerous chiplet use cases, as well as the finalized heterogeneous packages. The increase in data is compounded by even more data from simulation, emulation, and various test runs.
Meanwhile, new tools are racing to keep pace, introducing algorithms that can analyze data usage and delete data that has not proven useful. That will become more difficult and complex as 3D-ICs begin rolling out over the next few years, and as chiplet-based designs proliferate. Complexity brings more data, and more data adds more complexity.
—Ed Sperling contributed to this report.
Related Reading
AI In Data Management Has Limits
Trusting which data to use and what can be deleted still requires human oversight; startups are at a disadvantage.