Simulation: Balancing Speed And Debug

Four primary use cases for different debug needs.

There’s an old saying about simulation: “It’s all about the need for speed.” Simulation is the core technology for functional verification of semiconductors, and the demand for higher runtime performance never ebbs. Larger chips require more complex testbenches and much larger test suites, since verification effort grows exponentially with design size. With diminishing returns from new process nodes and their slower adoption, Moore’s Law no longer guarantees a steady increase in single-processor speed. A recent post described the use of fine-grained parallelism (FGP) in the Synopsys VCS Functional Verification Solution to make efficient use of multi-core and many-core processors across a wide range of simulation tasks. VCS with FGP can deliver a simulation speedup of 2 to 30 times, but this improvement can be compromised if users do not pay attention to their debug settings.

Of course, debug can’t simply be ignored. Every simulation test failure must be triaged, diagnosed, traced to its root cause and fixed. Surveys by Synopsys in recent years reveal that “system-level debug” is one of the top verification challenges, ranking slightly higher than simulation runtime performance. Debug of complex failures requires visibility into the design and testbench, and may benefit from interactive simulation in which users can move forwards and backwards in time while changing the value of signals. Capturing and saving relevant debug information takes time and resources, inevitably slowing down simulation. Higher visibility means lower performance, but this is not an “all or nothing” dilemma. VCS offers a range of debug options; choosing the right setting for the situation yields an optimal tradeoff between visibility and speed.

Figure 1: Multiple options for debug support.

Different users, including designers and verification engineers, have different needs when it comes to debug. The full range of users and their requirements can be abstracted down to four primary use cases. The first, applicable mainly to designers, is post-simulation analysis of the design with the Synopsys Verdi Automated Debug System. Designers spend a lot of time looking at waveforms while tracing signals and their drivers through the design. They analyze test failures in terms of how the design is being stimulated by the testbench but rarely look at the testbench itself and have no need to peek inside library cells. Therefore, dumping values only for the register transfer level (RTL) design is sufficient. With VCS, this basic level of debug has minimal impact on simulation speed.

The second use case, for designers who need more visibility into the design, is interactive debug with line tracing. Running simulation with this option enables Verdi to display the current values of design signals in the RTL source code viewer. The designer can debug the scenario for a test failure line by line in the RTL, with the waveform and source code views fully synchronized. It is also possible to specify breakpoints at specific lines in the RTL and re-run a simulation with these breakpoints enabled. Verification engineers require many of the same capabilities as designers, and much more. They tend to spend little time looking at the RTL design and instead focus on debugging the testbench code they wrote.

Verification engineers are the primary users of the third case, interactive testbench debug, although some designers may also benefit. Modern constrained-random verification environments compliant with the Universal Verification Methodology (UVM) require support for debugging object-oriented testbenches. Verdi provides numerous features to aid in the process, showing threads, classes, class members, registers, constraints, macros and other aspects of the verification environment. UVM-aware interactive debug is critical, especially given the dynamic nature of many objects in the testbench. Users can set breakpoints in testbench code, both SystemVerilog and SystemC/C++/C. All these capabilities are enabled by running VCS with the class access debug option.

The final and most sophisticated use case is “what-if” analysis with the ability to force values on signals interactively. The reverse interactive debug setting for VCS has the highest impact on simulation speed, but it enables powerful capabilities within Verdi. Users can debug backwards from a test failure, with full visibility into the design and testbench at earlier points in time. They can move to an earlier point in the test, change signal values, variables and constraints, and re-run simulation from that point. This mode of debug is clearly beneficial for verification engineers analyzing regression failures, but it is also valuable for bringing up and debugging gate-level simulations. The range of debug options available in VCS allows users to maximize performance by specifying the minimal level of debug appropriate to their needs.
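The idea of choosing the minimal debug level for a given need can be sketched in Python. This is an illustrative model only: the level names, the capability labels and the `minimal_debug_level` helper are hypothetical stand-ins for the four use cases described above, not VCS options, and the assumption that each level also provides everything the cheaper levels do is a simplification made for the sketch.

```python
from enum import IntEnum


class DebugLevel(IntEnum):
    """Hypothetical labels for the four use cases, ordered by simulation cost."""
    RTL_DUMP = 1             # post-simulation waveform analysis of the RTL
    LINE_TRACING = 2         # interactive RTL debug with line breakpoints
    TESTBENCH = 3            # interactive, UVM-aware testbench debug
    REVERSE_INTERACTIVE = 4  # reverse debug and what-if analysis


# Capabilities first available at each level (illustrative, not exhaustive).
# Assumption for this sketch: a level also provides everything below it.
NEW_CAPABILITIES = {
    DebugLevel.RTL_DUMP: {"waveforms", "driver_tracing"},
    DebugLevel.LINE_TRACING: {"source_annotation", "rtl_breakpoints"},
    DebugLevel.TESTBENCH: {"class_visibility", "testbench_breakpoints"},
    DebugLevel.REVERSE_INTERACTIVE: {"step_backwards", "force_values"},
}


def minimal_debug_level(needed):
    """Return the cheapest level whose cumulative capabilities cover `needed`."""
    needed = set(needed)
    available = set()
    for level in DebugLevel:  # iterates in definition order, cheapest first
        available |= NEW_CAPABILITIES[level]
        if needed <= available:
            return level
    raise ValueError(f"no level provides: {sorted(needed - available)}")


designer = minimal_debug_level({"waveforms", "rtl_breakpoints"})
verifier = minimal_debug_level({"class_visibility"})
print(designer.name, verifier.name)  # prints "LINE_TRACING TESTBENCH"
```

A designer who only needs waveforms would land on the cheapest level, while any request that includes forcing values selects the most expensive one, which mirrors the visibility-versus-speed tradeoff described above.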

Figure 2: VCS and Verdi support full reverse interactive debug.

Beyond debug settings, the other main factor that can reduce simulation performance is injudicious use of the programming language interface (PLI). PLI tasks connect SystemVerilog simulation to other languages, specifically C (and C++/SystemC). Like debug features, use of PLI inevitably slows simulation, but the impact varies considerably depending upon usage. PLI, as the term is used here, covers two interfaces: the Direct Programming Interface (DPI) and the Verilog Procedural Interface (VPI). DPI provides less access to the simulation from C but has lower overhead. The type of access also affects speed. Simply reading values has less performance impact than writing values or registering callbacks. The general guidance to verification teams is to use the minimal set of PLI tasks that will get the job done.

VCS provides the capability to qualify PLI access per task rather than incurring maximum access cost for every PLI call. However, testbenches may reference PLI tasks that come from external vendors, in which case the verification team may not know what access levels should be specified. For such cases, VCS also provides a powerful adaptive mechanism that monitors PLI calls and learns what level of access is required for each task. The results from all tests can be combined into a single master control file that is used by VCS for all subsequent regression runs. This capability, along with the other features covered above, ensures that simulation users get all the debug features they need with as little impact as possible on the industry-leading performance of VCS.
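The step of combining per-test results into one master control file can be pictured with a small Python sketch. Everything in it is assumed for illustration: the task names, the three access kinds, the relative order of "write" and "callback", and the one-line-per-task output format are hypothetical and do not reflect the actual file format or learning mechanism used by VCS. The sketch only shows the underlying idea, which is that each task ends up with the union of the access kinds observed for it across all tests.

```python
from collections import defaultdict

# Hypothetical access kinds in display order. Reads are listed first because
# the article notes they are cheapest; the order of "write" versus "callback"
# is arbitrary in this sketch.
ACCESS_ORDER = ["read", "write", "callback"]


def merge_learned_access(per_test_logs):
    """Union the access kinds observed for each task across all tests."""
    merged = defaultdict(set)
    for log in per_test_logs:
        for task, kinds in log.items():
            merged[task] |= set(kinds)
    return dict(merged)


def format_control_file(merged):
    """Render one line per task, with access kinds in ACCESS_ORDER."""
    lines = []
    for task in sorted(merged):
        kinds = sorted(merged[task], key=ACCESS_ORDER.index)
        lines.append(f"{task}: {','.join(kinds)}")
    return "\n".join(lines)


# Access observed in two hypothetical tests; the task names are made up.
test_a = {"$vendor_monitor": {"read"}, "$inject_error": {"write"}}
test_b = {"$vendor_monitor": {"read", "callback"}}

master = merge_learned_access([test_a, test_b])
print(format_control_file(master))
# prints:
# $inject_error: write
# $vendor_monitor: read,callback
```

Taking the union is the conservative choice: a task is granted every kind of access that any test needed, and nothing beyond that, so no test loses access it relies on while tasks that only ever read values avoid the cost of broader access.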

For more information on simulation performance, download the “VCS Fine-Grained Parallelism Simulation Performance Technology” white paper.



2 comments

Srini says:

Well written article Rohit, glad to see a post dedicated to debugging.

On the last part on PLI access learn capabilities, I sincerely believe that users need to be educated more on this powerful feature. Imagine doing Machine Learning (ML) some 10-15 years ago for HW simulations- that’s precisely what this is and I suggest Synopsys revamps this feature under ML and promote more – will benefit a class of users tremendously

Dan Ganousis says:

The Cloud changes everything because simulators can exploit the massive parallelism available. Traditional EDA vendors will never provide a SaaS (pay by the minute for only actual use) business model in the Cloud so they’ll continue to espouse the “need for speed”. However, users are moving to the Cloud. Here’s a recent quote from a Qualcomm manager: “Multiple simulation jobs running concurrently on multiple servers will ultimately finish faster than running the jobs serially on one slightly faster on-premise simulator. That’s the future of EDA.”
