New Technology Accelerates Multi-Die System Simulation

Distributed simulation enables a large job to be run in smaller parts.


AI-powered chatbots. Robotic manufacturing equipment. Self-driving cars. Bandwidth-intensive applications like these are flourishing, and they are driving the move from monolithic systems-on-chip (SoCs) to multi-die systems. By integrating multiple dies, or chiplets, into a single package, designers can scale system functionality at reduced risk and with faster time to market.

Multi-die system development is not without its challenges, and verification stands out in particular. Verification already has to be exhaustive to catch detrimental bugs and produce high-performing designs for a single chip. A 2.5D or 3D design, which combines several dies in one package, makes that process considerably larger.

However, a native framework now available in the Synopsys VCS functional verification solution provides distributed simulation that allows you to run a large simulation job in smaller parts. This approach eliminates capacity limitations while minimizing errors for better multi-die system outcomes. Read on to learn more about distributed simulation, including how NVIDIA improved simulation performance 2x using this new technology.

Distributed simulation adds horsepower to the functional verification workhorse

Simulation is typically the functional verification workhorse, catching most of the bugs in a design. As multi-die systems have become more mainstream, chipmakers have created approaches to tackle the typical steps in the development process. Multi-die systems are too large to simulate in one run, so many chipmakers have opted for a divide-and-conquer approach. However, this approach has been manual: it entails writing scripts and stitching together the results of multiple runs, which is time-consuming and error-prone. In addition, the methodology typically doesn’t allow for reuse.

Distributed simulation technology supports multiple participating executables without any restrictions on the identity of the executables. In other words, a run can consist of the same simulation executable (simv) launched N times, of N different simvs, or of any mix of the two. It supports interconnection at both the register-transfer level (RTL) and the testbench, and it yields better performance than an approach with monolithic executables. While an in-house solution could take weeks to set up, distributed simulation can be up and running in just a few days and requires fewer memory resources, saving the costs tied to large-capacity hosts and farm machines.

Distributed simulation works in a fairly straightforward way. Users partition their simulation into multiple executables. At compile time, each simv is built with an additional switch. A runtime configuration file specifies which RTL and testbench portions to connect. During the run:

  • The simv instances, which are essentially monolithic SoCs, can be launched independently in any order, passing specific options as switches
  • A leader simv, which utilizes a separate switch, controls the simulation and does the heavy lifting
  • The leader simv invokes a separate server process to control the communication
  • All of the results are fed into a single executable at the end of the run, providing a single view of the simulation results
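
The flow above can be pictured with a small toy model. The Python sketch below is not VCS code and does not use any VCS switches, file formats, or APIs; the names Partition, run_distributed, and the connection tuples are hypothetical and exist only for illustration. It also assumes a simple lockstep scheme, in which every partition advances one cycle at a time and values cross between partitions with a one-cycle delay. The article does not say how the real tool synchronizes its executables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Toy model only. Nothing here corresponds to a real VCS switch or API.
StepFn = Callable[[int, Dict[str, int]], Dict[str, int]]
Connection = Tuple[str, str, str, str]  # (src partition, src port, dst partition, dst port)


@dataclass
class Partition:
    """Stands in for one independently built simulation executable."""
    name: str
    step: StepFn                                    # (cycle, inputs) -> outputs
    inputs: Dict[str, int] = field(default_factory=dict)


def run_distributed(partitions: List[Partition],
                    connections: List[Connection],
                    cycles: int) -> List[Tuple[int, str, int]]:
    """Play the leader's role: advance all partitions and merge their results."""
    by_name = {p.name: p for p in partitions}
    merged: List[Tuple[int, str, int]] = []
    for cycle in range(cycles):
        # 1. Each partition evaluates one cycle from the inputs it has latched.
        outputs = {p.name: p.step(cycle, dict(p.inputs)) for p in partitions}
        # 2. The communication step forwards values along configured connections
        #    (this plays the part of the runtime configuration).
        for src, src_port, dst, dst_port in connections:
            by_name[dst].inputs[dst_port] = outputs[src][src_port]
        # 3. All results land in a single merged view.
        for name, outs in outputs.items():
            for port, value in sorted(outs.items()):
                merged.append((cycle, f"{name}.{port}", value))
    return merged


def make_counter() -> StepFn:
    """A partition that drives cycle + 1 on its 'tx' port."""
    def step(cycle: int, inputs: Dict[str, int]) -> Dict[str, int]:
        return {"tx": cycle + 1}
    return step


def make_accumulator() -> StepFn:
    """A partition that sums whatever arrives on its 'rx' port."""
    total = 0

    def step(cycle: int, inputs: Dict[str, int]) -> Dict[str, int]:
        nonlocal total
        total += inputs.get("rx", 0)
        return {"sum": total}
    return step


if __name__ == "__main__":
    parts = [Partition("die0", make_counter()),
             Partition("die1", make_accumulator())]
    wiring = [("die0", "tx", "die1", "rx")]
    for cycle, signal, value in run_distributed(parts, wiring, cycles=4):
        print(cycle, signal, value)
```

Each partition only knows about its own ports, and the wiring lives in a separate list. That separation is what lets a single-die environment be reused unchanged inside a larger system, which is the property the article attributes to the distributed approach.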

Use model for distributed simulations.

For a more detailed look at how distributed simulation works, watch this video.

This spring, at SNUG Silicon Valley 2023, NVIDIA discussed its use of distributed simulation on a multi-chip GPU system. The legacy approach involved multiple steps and required substantial time and memory resources. Simulating each individual chip was already resource-intensive, and doing so for the entire multi-die system needed more than 2x the anticipated memory and runtime.

Kartik Mankad, a senior verification engineer at NVIDIA, noted during his SNUG Silicon Valley talk: “Using Synopsys’ distributed simulation technology, it was very easy to spin up different types of multi-die systems, without requiring extra effort by participating teams. We were able to avoid integration and reuse problems and retain full functionality of each single-chip environment. Compared to our legacy approach, we experienced a 2x speedup in simulation with Synopsys VCS functional verification solution’s distributed simulation capability.”


As applications like AI and high-performance computing proliferate, multi-die systems are providing chipmakers an avenue to address increasingly heavy compute workloads. In chip development, simulation has long been the functional verification workhorse, exhaustively detecting bugs to achieve designs that perform as intended. But multi-die systems present new challenges.

Given their complexity and size, multi-die systems are simply too much to simulate in one run. As a result, chipmakers have turned to solutions developed in-house to break up their simulation runs. However, these approaches can be manual and time-consuming. Now, a new technology available in the Synopsys VCS functional verification solution is demonstrating its ability to accelerate simulation of multi-die systems by up to 2x compared to legacy approaches. Distributed simulation enables design teams to run a large simulation job in smaller parts, saving time and engineering effort as well as costs associated with large-capacity hosts and farm machines. While multi-die systems present a way to push semiconductor innovation forward, verification technologies such as distributed simulation can help chipmakers optimize the results of their designs.
