Simulation Replay Tackles Key Verification Challenges

The enormous size and complexity of today’s SoC designs put constant pressure on simulator runtime and memory consumption.

Simulation lies at the heart of both verification and pre-silicon validation for every semiconductor development project. Finding functional or power problems in the bring-up lab is much too late, leading to very expensive chip turns. Thorough simulation before tapeout, coupled with comprehensive coverage metrics, is the only way to avoid surprises in silicon. However, the enormous size and complexity of today’s system-on-chip (SoC) designs put constant pressure on simulator runtime and memory consumption. A single simulation test may run for hours on the most powerful compute servers, and thousands or even millions of such tests are required before signoff.

In response, electronic design automation (EDA) vendors have innovated to speed up simulation algorithms, optimize memory usage, and exploit parallelism whenever possible. Another way to simulate more efficiently is to run only a subset of the design. This may seem counterintuitive, but some important verification tasks do not require simulating the complete chip, and running them on a sub-design reduces turnaround time (TAT) and increases throughput. Manual partitioning of designs is time-consuming and error-prone, so an automated approach is required.

Simulation replay has emerged as a key method for simulating partial designs. This process involves:

  • Localizing a block in the SoC design (or an element in the SoC testbench)
  • Capturing the corresponding simulation activity around that sub-design in a fast signal database (FSDB) trace file
  • Automatically creating a new sub-design testbench from the captured simulation activity
  • Applying the simulation activity to the sub-design in standalone simulation
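The capture-and-replay idea behind these steps can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not how any simulator or the FSDB format actually works: the `Accumulator` block, the `Trace` container, and the stimulus expression are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Accumulator:
    """Toy sub-design (hypothetical): a running sum that wraps at 8 bits."""
    total: int = 0

    def step(self, data_in: int) -> int:
        self.total = (self.total + data_in) & 0xFF
        return self.total


@dataclass
class Trace:
    """Hypothetical stand-in for a trace file: per-cycle boundary values."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def full_chip_run(cycles: int) -> Trace:
    """Run the 'whole chip' and capture activity at the block boundary."""
    block = Accumulator()
    trace = Trace()
    for cycle in range(cycles):
        # Everything upstream of the block is summarized by this expression.
        data_in = (cycle * 7 + 3) % 32
        data_out = block.step(data_in)
        trace.inputs.append(data_in)
        trace.outputs.append(data_out)
    return trace


def replay(trace: Trace) -> list:
    """Generated 'testbench': drive captured inputs into a standalone block
    and compare its outputs against the captured outputs."""
    block = Accumulator()
    mismatches = []
    for cycle, (data_in, expected) in enumerate(zip(trace.inputs, trace.outputs)):
        actual = block.step(data_in)
        if actual != expected:
            mismatches.append((cycle, expected, actual))
    return mismatches


trace = full_chip_run(cycles=100)
mismatches = replay(trace)
print(f"replayed {len(trace.inputs)} cycles, {len(mismatches)} mismatches")
```

The standalone run needs only the block and the captured trace; none of the surrounding chip or its original testbench is simulated again.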

As detailed in a previous post, gate-level simulation (GLS) to calculate power before design layout is an established application for simulation replay. GLS is much slower than register transfer level (RTL) simulation, and full-chip GLS can only occur after the entire RTL chip design has been completed and synthesized. Capturing the activity from normal functional RTL simulation and replaying it in GLS eliminates the manual effort of porting the chip-level testbench or creating a block-level testbench. With replay, the power simulation can run on any sub-design, so it can be performed long before the entire RTL design is complete and synthesized.

There are several other simulation challenges that can be made much faster and easier with simulation replay. One example is debugging a Universal Verification Methodology (UVM) testbench and its checkers. When a test fails, it might be due to a design bug or to a coding error in the testbench. With simulation replay, the entire design under test (DUT) can be replaced by an automatically generated testbench stub that replays simulation activity. Taking the design out of the picture results in faster simulation performance and much shorter TAT when debugging the testbench. Since UVM code is complex, multiple tries may be needed to fix issues, making quick TAT even more important.
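As a rough illustration of debugging a checker against a replay stub instead of the real design, consider the following Python sketch. It is not UVM and not a tool-generated stub; `ReplayStub`, the two checker functions, and the recorded transactions are hypothetical and assume a DUT that returns its request plus one, wrapped to 8 bits.

```python
class ReplayStub:
    """Hypothetical DUT replacement: replays recorded (request, response) pairs."""

    def __init__(self, recorded):
        self.recorded = list(recorded)

    def transactions(self):
        yield from self.recorded


def buggy_checker(request, response):
    # Testbench coding error: forgets that the response wraps at 8 bits.
    return response == request + 1


def fixed_checker(request, response):
    # Corrected expectation: increment with 8-bit wraparound.
    return response == ((request + 1) & 0xFF)


def run_checker(stub, checker):
    """Return every recorded transaction that the checker flags as a failure."""
    return [(req, rsp) for req, rsp in stub.transactions() if not checker(req, rsp)]


# Activity assumed to have been captured once from a full simulation.
recorded = [(0, 1), (41, 42), (255, 0)]
stub = ReplayStub(recorded)

failures_before = run_checker(stub, buggy_checker)
failures_after = run_checker(stub, fixed_checker)
print("before fix:", failures_before)
print("after fix:", failures_after)
```

Each attempt at fixing the checker reruns only against the recorded activity, which is why the iteration loop stays short even when several tries are needed.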

A similar approach is valuable for assertion IP (AIP) validation. AIP is usually developed primarily for formal verification, where SystemVerilog Assertions (SVA) can be used to specify both targets and constraints. It is important to run any AIP in simulation as well, to validate that formal analysis won’t consider scenarios inconsistent with the testbench. Fast simulations with only a stub DUT attached reduce TAT and allow much faster iterations. If AIP errors are found, the debug time is reduced just as it is for simulation testbenches.

Another task that is historically very slow is simulation of the test vectors produced by automatic test pattern generation (ATPG). The test programs used in production testing require both stimulus and results. ATPG tools generate the stimulus, and the results are gathered from full-chip GLS with back-annotated full timing. One way to speed up time to results (TTR) is to skip repeated simulations of the long initialization sequences required for SoC designs. Initialization activity can be captured for one test pattern, and then simulation replay can be used to run only the differentiated portions of the remaining patterns. The generated testbench loads in the state values from the end of the initialization runs to ensure accurate results.
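The saving from simulating the initialization sequence only once can be sketched in Python. This is a simplified model under assumptions: `ToyChip`, the 1,000-cycle `INIT_SEQUENCE`, and the three short patterns are hypothetical, and copying a Python object stands in for loading saved state values into the generated testbench.

```python
import copy


class ToyChip:
    """Hypothetical design state: four 8-bit registers and a cycle counter."""

    def __init__(self):
        self.regs = [0] * 4
        self.cycles = 0

    def clock(self, value):
        self.regs = [(r + value + i) & 0xFF for i, r in enumerate(self.regs)]
        self.cycles += 1


INIT_SEQUENCE = [5] * 1000           # long initialization shared by all patterns
PATTERNS = [[1, 2, 3], [4, 5], [6]]  # short pattern-specific portions


def run_full(pattern):
    """Baseline: simulate initialization plus the pattern from reset."""
    chip = ToyChip()
    for value in INIT_SEQUENCE + pattern:
        chip.clock(value)
    return chip.regs, chip.cycles


def capture_init_state():
    """Simulate initialization once and keep the resulting state."""
    chip = ToyChip()
    for value in INIT_SEQUENCE:
        chip.clock(value)
    return chip


def run_replayed(snapshot, pattern):
    """Start from the saved state and simulate only the pattern itself."""
    chip = copy.deepcopy(snapshot)
    start = chip.cycles
    for value in pattern:
        chip.clock(value)
    return chip.regs, chip.cycles - start


snapshot = capture_init_state()
full = [run_full(p) for p in PATTERNS]
fast = [run_replayed(snapshot, p) for p in PATTERNS]

full_cycles = sum(cycles for _, cycles in full)
fast_cycles = snapshot.cycles + sum(cycles for _, cycles in fast)
same_results = [regs for regs, _ in full] == [regs for regs, _ in fast]
print(f"full: {full_cycles} cycles, replayed: {fast_cycles} cycles, "
      f"results match: {same_results}")
```

In this toy model the final register values are identical either way, while the number of simulated cycles grows with the pattern-specific portions rather than with the number of patterns times the initialization length.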

The final challenge occurs when the design incorporates IP from outside sources, which is the case for virtually all SoCs. When simulation tests are failing, the SoC verification team may not be using the IP correctly, or the IP itself may have a bug. Either way, it is common for SoC customers to ask their IP vendor for help in debugging. The problem is that the complete chip design and testbench cannot be handed over, since they typically contain both highly proprietary chip content and IP from other sources. With simulation replay, the SoC team hands its provider an IP-level FSDB. The IP verification team can run fast standalone simulations to see whether their design contains a bug or whether it is being misused.

All five of these “use cases” could theoretically be handled by manual extraction of activity information and creation of task-specific testbenches. Clearly, this is a very inefficient process that consumes precious verification resources. Even worse, many of the steps must be repeated every time that the design or testbench evolves over the course of the SoC project. A completely automated simulation replay flow, with optional setup files that provide fine-grained control for advanced users, is essential for any modern chip design.

Synopsys VC Replay incorporates all the capabilities of Synopsys PowerReplay, a well-proven solution for early power calculations using GLS. Recent additional features and innovative technologies have extended the solution to handle all the tasks and use cases described above. It automates activity capture, sub-design testbench creation, and execution of the faster, smaller simulation. Synopsys VC Replay runs 10X-100X faster than full-chip, full-testbench simulation, dramatically reducing TAT and debug time. The result is a “shift left” of the verification process for shorter SoC project schedules and faster time to market (TTM).


