Anatomy Of A System Simulation

Balancing the benefits of a model with the costs associated with that model is tough, but it becomes even trickier when dissimilar models are combined.


The semiconductor industry has greatly simplified analysis by consolidating around a small number of models and abstractions, but that capability is breaking down both at the implementation level and at the system level.

Today, the biggest pressure is coming from the systems industry, where the electronic content is a small fraction of what must be integrated together. Systems companies tend to operate in a top-down manner, meaning that abstract models of the electronics need to be available before design and implementation starts.

At the same time, within a single 2.5D or 3D package, thermal and mechanical factors are cutting across traditional functional boundaries. It has long been known that physical placement factors into timing. But now, placement is affecting power and thermal, which in turn influences timing as a second-order effect. And these effects are becoming large enough that they cannot be ignored.

In the past, multi-simulator analysis was adopted where models spanned a physics boundary, or where abstraction created resolution problems. A good example of this is mixed-signal simulation, where the digital portion is analyzed in the time domain. The analog portion, in contrast, may require a much finer timescale, or continuous-time analysis, to converge accurately to a solution. Other examples include the integration of different solvers optimized for gate-level, RTL, or behavioral models. Most of these simulation boundaries were rationalized over time so that a single tool could handle them, which was deemed necessary to achieve the required execution performance.
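The coupling problem the mixed-signal example describes can be sketched in a few lines: an event-driven digital model jumps directly between synchronization points, while a continuous-time analog model must take many fine substeps to reach the same point. The classes and coefficients below are illustrative stand-ins, not taken from any real simulator.

```python
# Minimal sketch of two simulators with different native timescales
# advancing in lockstep to shared synchronization points.

class DigitalSim:
    """Event-driven model: jumps directly to the next sync point."""
    def __init__(self):
        self.time = 0.0
        self.value = 0

    def advance_to(self, t):
        # A real simulator would process its event queue up to t.
        self.time = t
        self.value ^= 1  # toggle once per sync interval

class AnalogSim:
    """Continuous-time model: needs many fine steps per sync interval."""
    def __init__(self, dt=1e-9):
        self.time = 0.0
        self.dt = dt
        self.v = 0.0

    def advance_to(self, t, drive):
        # Simple RC-style relaxation toward the digital drive level.
        while self.time < t:
            self.v += (drive - self.v) * 0.1
            self.time += self.dt

digital, analog = DigitalSim(), AnalogSim()
sync_step = 1e-6  # coupling interval: coarse relative to the analog dt
for k in range(1, 4):
    t = k * sync_step
    digital.advance_to(t)                        # one coarse event step
    analog.advance_to(t, float(digital.value))   # ~1,000 fine substeps
    print(f"t={t:.1e}s digital={digital.value} analog={analog.v:.3f}")
```

The asymmetry in step counts is exactly why these boundaries were eventually folded into single tools: every synchronization point forces the fast solver to wait for the slow one.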

In some cases, specialized models and analysis solutions remain, as seen at a high level of abstraction where different models exist for system functional analysis, for performance analysis, and for use within a high-level synthesis flow. It is rare that any of these are integrated together to solve a specified problem. But that is changing.

A team of Accellera members has started an exploratory effort to see if they can foster cross-industry collaboration. Their goal is to exchange knowledge and best practices about these complex simulation problems. They recently held their first meeting in Toulouse, France, the first step in the creation of a working group that could be tasked with defining a solution to the problem from within the semiconductor industry.

New pressures
An increasing number of industries are going through a digital transformation. This allows them to change the design before it gets into a prototyping stage, which can be very costly. “You have to start thinking about the engineering hierarchy,” says Chris Mueth, business development, marketing, and technical specialist at Keysight. “That starts from systems of systems, to systems, to components, and the different engineering disciplines that would fall into those. Taking that one step further, if you want a true collaborative environment, which you need, the automation that you put into workflows has to include the lifecycle, from design to test (see figure 1). There are a bunch of different phases within that span called engineering lifecycle management.”

Fig. 1: Model and analysis transformations. Source: Keysight

The time aspect of the development flow is becoming more important. “You want to start earlier because you need to start understanding what can be achieved,” says Marc Serughetti, senior director of product line management for Synopsys. “What are the challenges I may face? We know that building those systems of systems becomes more expensive, and we know that the further in the development process you find a problem, the more costly it is. You need to do things earlier and earlier, but it can only be done earlier with more abstraction because there are things you don’t know.”

This often requires bringing together models of very different accuracy and fidelity. “You might have hardware in the loop in one corner, and a MATLAB model in the other corner,” says Mark Burton, vice chair of the Accellera PWG. “That’s perfectly fine, and the issue is how do you connect those things together. In aeronautics, they do that quite often and have models that are transactional-level mixed with other things in order to build the simulation that they need. This could be for different purposes at different times. They currently do this based on standards set within the avionics industry, but these are not shared outside of that industry.”

Sometimes the model you need may not exist and has to be created. “Customers want to start with rough order of magnitude models, and then turn the screws up on the fidelity on those models, but that comes at a cost,” says Shawn Carpenter, program director at Ansys. “How long does it take to compute those models? In the context of the mission, you’re not going to want to embed a finite element simulation of an antenna into a flight simulation because you’re going to incur way too much time penalty. You have to tune the abstraction of the models. How do I get models — usually reduced-order models — that represent parts of the subsystem that are interesting to me, but which deliver a fidelity that makes my mission simulation accurate, or my system-of-systems simulation accurate?”
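Carpenter's point about reduced-order models can be made concrete with a toy example: sample the expensive physics model offline on a coarse grid, then evaluate only a cheap interpolating surrogate inside the mission-level simulation. The `antenna_gain` function below is a deliberately fake stand-in for a finite-element solve; everything here is illustrative.

```python
# Sketch of the reduced-order-model idea: sample an "expensive" model
# offline, then use a cheap interpolating surrogate at system level.

import math

def antenna_gain(angle_deg):
    """Pretend this is a costly finite-element computation."""
    return 10.0 * math.cos(math.radians(angle_deg)) ** 2

# Offline: sample the expensive model on a coarse grid (0..90 degrees).
grid = list(range(0, 91, 10))
table = [antenna_gain(a) for a in grid]

def rom_gain(angle_deg):
    """Cheap surrogate: linear interpolation over the sampled grid."""
    angle_deg = max(0.0, min(90.0, angle_deg))
    i = min(int(angle_deg // 10), len(grid) - 2)
    frac = (angle_deg - grid[i]) / 10.0
    return table[i] + frac * (table[i + 1] - table[i])

# In the mission simulation, only the surrogate is evaluated.
for a in (5.0, 42.0, 77.0):
    err = abs(rom_gain(a) - antenna_gain(a))
    print(f"angle={a:5.1f}  rom={rom_gain(a):6.3f}  error={err:.3f}")
```

"Turning the screws up on fidelity" then amounts to refining the sample grid, or replacing interpolation with a higher-order fit, at the cost of more offline solver runs.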

A top-down flow involves continuous refinement. “As we talk about the refinement process, it means that the models are being created and refined over time,” says Synopsys’ Serughetti. “Today, you may start with a SystemC model that may be loosely timed or cycle-approximate, and then you refine that to RTL. The SystemC model may let you see your key performance indicators with an accuracy of perhaps 75% or 80%. The RTL model may provide 99%. The objective of that refinement of the model, of the abstraction, is to get toward more accuracy that enables you to explore, validate, and verify and converge on your final product.”

There is no single solution to this problem. “You have to balance the modeling and the abstraction challenge,” says Neil Hand, director of strategy for design verification technology at Siemens Digital Industries Software. “Sometimes, you don’t need the full fidelity of the underlying simulation. You can use an abstraction of it, whether that’s going to be an abstraction that is automatically created, or whether it’s from a new model that is guaranteed compliant. As we look toward the systems of systems verification, it’s not a one-size-fits-all. This creates the challenge of getting everyone to agree on what is needed. In some cases, having multiple live simulations working together is necessary, just like we see in the SoC space with AMS simulation, where you want to have that full fidelity digital and full fidelity analog. But as you get up to systems of systems, depending on the scenario you are working with, you need to lower the fidelity of the underlying model and the underlying simulation. In some cases, that’s going to mean different engines and, in some cases, it’s going to be different models.”

The analog domain has a large range of abstractions today. “When you look at analog, there really are at least three different levels,” says Deepak Shankar, vice president of technology at Mirabilis Design. “There’s behavioral analog, which is an area that Keysight focuses on, with some Simulink and things like that. Then you have Verilog-A, and then you have SPICE. At the top level, one is looking at the precision of the mathematics. The bottom-level SPICE model is looking at the details of the bits.”

We have seen similar abstractions in the digital domain. “Abstraction means what detail you should give up,” says Martin Barnasconi, chair for the Accellera PWG. “You might model a specific sub-system slightly differently, as long as it communicates at an abstract manner to the rest of the bigger system. In the semiconductor domain, especially in the SystemC domain, we have gone through that learning curve, such as how to connect gate-level models to RTL. But many people find out that even this event-driven simulation is too slow, and they move up to the transaction level (TLM). We are seeing that TLM might in some cases still be too low, and we need to go to other more abstract protocols.”
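The speed-up Barnasconi describes when moving from event-driven to transaction-level simulation comes from collapsing many per-cycle events into one function call. The sketch below counts events in both styles; it is an illustration of the principle only, not real SystemC (TLM-2.0 actually uses `b_transport` with a generic payload object).

```python
# Sketch of why transaction-level modeling is faster: a pin-level model
# spends one event per clock cycle moving a byte, while a TLM model
# transfers the whole payload in a single call.

events = 0

def pin_level_write(data):
    """One event per byte per cycle, as an RTL simulator would see it."""
    global events
    bus = []
    for byte in data:
        events += 1          # one clock-edge event per byte
        bus.append(byte)
    return bytes(bus)

def tlm_write(data):
    """One blocking transaction carries the entire payload."""
    global events
    events += 1              # a single b_transport-style call
    return bytes(data)

payload = bytes(range(64))
pin_level_write(payload)
n_pin = events
events = 0
tlm_write(payload)
n_tlm = events
print(f"pin-level events: {n_pin}, TLM events: {n_tlm}")
```

Moving "above TLM," as the quote suggests, continues the same trade: give up timing and protocol detail per transfer in exchange for fewer simulation events.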

Beyond function
But the problems are becoming much bigger than this. “These problems are both multi-physics, and multi-scale,” says Bill Mullen, distinguished engineer at Ansys. “They’re multi-scale in terms of distance, where you have nanometers to centimeters, sometimes even meters if you’re bringing optical interconnections into play. They are multi-time scale, as well. Thermal effects are much slower than electrical effects, and you need to analyze for nanoseconds through to seconds.”

Advanced packaging is compounding this problem even within the semiconductor space. “With 3D integrated circuit packaging, everything is packaged so tightly together that signal integrity has become one of the biggest issues,” says Ansys’ Carpenter. “People are increasingly turning to package-aware circuit design and package-aware interconnect design. You can’t break it up into pieces and simulate the pieces, and design the parts, and then bring them together in the larger system and expect them to work. You pretty much have to pull everything together into one big, lumped simulation to do that.”

The industry does not currently have all the models to do this. “For example, thermal effects are becoming very critical these days, especially as you look at 2.5D and 3D multi-die systems,” says Ansys’ Mullen. “But there are different ways that thermal effects interact with other physics. Increased temperature affects reliability because electromigration becomes much worse at higher temperatures. With higher power density, it becomes really hard to dissipate the heat. You might have large temperature gradients within a single die, and that can lead to mechanical effects. You need to analyze the power, the thermal, and then you get non-uniform heating, which can cause a die to warp. That causes mechanical stress on the connections between chiplets, especially the ones in the middle of a stack. It’s important to couple different solvers together in a way that gives users confidence that their systems are going to work under a wide variety of conditions.”
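The power-thermal interaction Mullen describes is a two-way coupling: leakage power rises with temperature, and temperature rises with power. A standard way to couple two such solvers is a fixed-point loop that iterates until both agree. The solver functions and coefficients below are made up for demonstration; real electro-thermal co-analysis uses full field solvers on both sides.

```python
# Illustrative fixed-point coupling of a power solver and a thermal
# solver: power depends on temperature (leakage), temperature depends
# on power. Iterate until the two models are mutually consistent.

def power_solver(temp_c):
    """Total power: fixed dynamic part plus temperature-driven leakage."""
    dynamic = 50.0                           # watts
    leakage = 5.0 * 1.04 ** (temp_c - 25.0)  # grows ~4% per degree C
    return dynamic + leakage

def thermal_solver(power_w, ambient_c=25.0, r_th=0.5):
    """Die temperature from power and a lumped thermal resistance (C/W)."""
    return ambient_c + r_th * power_w

temp = 25.0
for it in range(50):
    power = power_solver(temp)
    new_temp = thermal_solver(power)
    if abs(new_temp - temp) < 1e-6:
        break                  # the two solvers agree
    temp = new_temp
print(f"converged after {it} iterations: {power:.1f} W at {temp:.1f} C")
```

Note that the converged power is well above the cold-start estimate: ignoring the feedback loop would understate both the heat and the resulting electromigration and stress risks the quote warns about.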

This is not the first time the problem has been tackled. “Federated simulation, in the system context, has been around for a while,” says Siemens’ Hand. “You might be using multi-physics simulation where it is needed, digital where it is needed, coupled with CFD and sensor analysis, and bringing all of those together. The challenge we have is, ‘How do you interface all of these.’ One solution is the functional mockup interface (FMI), which allows us to connect a lot of the multi-domain verification today. It allows us to put analog into multi-physics simulations, and more recently, that FMI initiative has been extended to add some digital aspects to it.”
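At the heart of FMI-style federated simulation sits a "master algorithm": each component advances over a shared communication step, and the master exchanges outputs and inputs at every communication point. The `Component` interface below is a simplification for illustration, not the actual FMI API (which uses calls such as `fmi2DoStep` with variable getters and setters).

```python
# Generic co-simulation master in the spirit of FMI: step every
# component over a communication interval, then exchange signals.

class Component:
    """Simplified stand-in for an FMU-like co-simulation unit."""
    def __init__(self, gain):
        self.gain = gain
        self.input = 0.0
        self.output = 0.0

    def do_step(self, t, dt):
        # Stand-in dynamics: output tracks a scaled input.
        self.output = self.gain * self.input

controller = Component(gain=2.0)
plant = Component(gain=0.25)

t, dt, t_end = 0.0, 0.1, 0.5
plant.input = 1.0  # external stimulus held constant
while t < t_end:
    # 1. Advance every component over the communication interval.
    controller.do_step(t, dt)
    plant.do_step(t, dt)
    t += dt
    # 2. Exchange signals at the communication point.
    controller.input = plant.output
print(f"t={t:.1f} plant={plant.output} controller={controller.output}")
```

Because signals are only exchanged at communication points, the coupling error depends on the step size, which is exactly the fidelity-versus-performance knob discussed throughout this article.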

It all comes down to the interfaces. “It’s about drawing the lines for abstraction in the right place,” says Carpenter. “What can I separate and potentially pre-compute someplace else, and then bring it into the system of systems simulation without losing too much fidelity? It is important to capture the physics-based coupling across that boundary. You want to draw those boundaries, the abstraction boundaries, in locations where you have minimal coupling.”

This concept is very different from most of the multi-abstraction simulations we are used to. “You have a full-blown simulator of a rocket engine, including how it behaves aerodynamically,” says Marc Swinnen, director of product marketing at Ansys. “You can derive a simplified model that can be used in a global simulation. Then, if they need to get more accurate data at any specific point, you can reach back down to the detailed model, set it up, run that for just the required operating point, and feed that back into the global simulation.”

This is not your father’s simulation framework. “The frameworks of the 1980s and 1990s used an approach where they tried to pull everything together into one model that operates collectively,” says Dave Rich, verification methodologist at Siemens EDA. “I don’t think that has to be the solution to this problem. You could evaluate these models with different abstraction levels independently. Run them independently, and there should be some way to pull all that data together without having to try to get everything to run together.”

Others agree. “When we first think of this idea of multi-domain federated simulation, we jump to the notion of a fully synchronized simulation with full fidelity models,” says Hand. “That is not always the solution. We need a more open view of what’s needed. The same is true when looking just at functionality for all of the different threads. The challenge, and the difficulty, is that there are so many diverse opinions that we are trying to bring together, and there isn’t a single solution to the problem.”

The problem may be larger than can reasonably be tackled in one chunk. “We first need to tackle the functionality part, because even that is a multi-domain challenge, especially when you look to combine different hardware and different software stacks in the systems of systems,” says Accellera’s Barnasconi. “The software functionality needs to be an integral part of such a framework, not necessarily just as software that is implemented and running on an embedded core, but also higher-level software descriptions. The functional thread might be the easy part, but that is already a challenge on its own. I think moving into physical phenomena, like thermal or analog or other domains, might be for later. I’m not even sure if that can be done, because abstracting those types of concepts calls for things like model order reduction.”

What becomes most important is to ascertain whether a set of models and/or simulators can solve the required problem. “We need to continue to push the state of the art and find out where we can do better abstractions and data simplification, so that the model can be accurate enough, but also be practical to use in a much larger system,” says Mullen. “There are going to be continued challenges on that because there is a wide variety of accuracy versus performance tradeoffs. The tools have to provide that knob and allow the user to generate the most accurate model possible, or the best performance model possible, or something in between based on what they’re looking for and their complexity needs.”

Model creation will be a big challenge in the future. “Bringing technology that automates the creation of models is absolutely critical,” says Serughetti. “We’re getting into the era of AI, but the industry has to be careful about the dilemma of how much information you bring in your model and how you can separate that information. We all know that the more information you put in a model, the more accurate it becomes. But now you have issues with simulation performance.”

The industry requires both top-down refinement of models and also a bottom-up abstraction of models. “The algorithm first approach or the modeling first approach is gaining traction,” says Hand. “We need solutions that work from high-level models down. That is different from the SoC guys. The systems houses have to model from a higher level of abstraction to begin with. But at the same time, there are several initiatives that are trying to use machine learning to build abstract models of system behavior automatically from the RTL. There are some interesting capabilities that are now available, and this idea of running simulations to automatically build an abstract model is interesting.”

Related Reading
Industry Pressure Grows For Simulating Systems Of Systems
Increasing complexity and the desire to shift left requires a federation of simulators and models, but they all need to work together to be of value.
AI, Rising Chip Complexity Complicate Prototyping
Constant updates, more variables, and new demands for performance per watt are driving changes at the front end of design.

