How to formally verify embedded memory designs and their redundancy repair schemes.
Innovations in very deep sub-micron technologies, such as the advent of three-dimensional FinFET transistor structures, have enabled very large embedded SRAM memories in System-on-Chip (SoC) designs, to the point where they occupy the majority of the chip die area. To fit the maximum memory capacity into the smallest die area, SRAM bitcells are designed with the minimum possible transistor sizes and packed together in the densest layout the manufacturing design rules permit.
However, this combination of the smallest transistor sizes, the densest layout, and large chip-area occupancy increases the likelihood of manufacturing defects, reducing memory yield and reliability. The SRAM bitcell topology and layout are particularly sensitive to statistical process variations during manufacturing and to environmental factors such as voltage and temperature during chip operation. Because even a single faulty SRAM cell implies an unusable part, a significant portion of post-silicon validation involves fully testing the embedded SoC memories under environmental and operational stress conditions to accelerate memory failure mechanisms, and then applying march tests or similar algorithms to locate the faulty bitcells.
Since manufacturing defects in memories are common, it behooves chip design teams not to absorb the costly yield loss of discarding every part that contains even a single faulty bit, but rather to design in a mitigation mechanism called redundancy repair. Redundancy repair schemes add spare columns or rows to the SRAM. Once the post-silicon validation team identifies the faulty bit location via a march pattern, a corresponding repair address is generated and fed into the memory, where the redundancy repair address decoder generates the redundancy multiplexor select signals that switch out the column or row containing the faulty bitcell.
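To make the shifting concrete, here is a minimal Python sketch of a single-spare-column scheme. The column count, the spare's position at the highest physical index, and the shift direction are illustrative assumptions, not details of any particular memory compiler:

```python
# Minimal, illustrative model of a single-spare-column redundancy mux.
# Assumptions (not from any specific design): N mission-mode columns plus
# one spare at the highest physical index; with repair enabled, the faulty
# column and every column at or above the repair address shift by one
# position toward the spare, while lower-index columns stay connected.

N_COLS = 8  # mission-mode (logical) width; the physical array is N_COLS + 1 wide

def repair_mux_select(logical_col: int, repair_addr: int, repair_en: bool) -> int:
    """Physical column driven onto this logical column's output."""
    if repair_en and logical_col >= repair_addr:
        return logical_col + 1  # shifted path: skips the faulty physical column
    return logical_col          # direct path: no shift below the repair address

def read_word(phys_cells, repair_addr: int, repair_en: bool):
    """Assemble the mission-mode word from the (N_COLS + 1)-column array."""
    return [phys_cells[repair_mux_select(i, repair_addr, repair_en)]
            for i in range(N_COLS)]

# Example: physical column 3 is faulty ('X'); repairing at address 3
# removes it from the mission-mode word.
phys = [0, 1, 0, 'X', 1, 0, 1, 1, 0]  # index 8 is the spare column
print(read_word(phys, repair_addr=3, repair_en=False))  # faulty bit visible
print(read_word(phys, repair_addr=3, repair_en=True))   # faulty bit shifted out
```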
Because of the added spare cells, the actual size of the memory is larger than the address space accessible to mission-mode read/write operations. As the memory test operation identifies the faulty bits and enables the redundancy repair scheme to replace them, the mission-mode accessible memory space correspondingly shifts over to include the replacement spare columns/rows and exclude the faulty ones. Hence, once the chip is deployed in the field, a repaired memory behaves exactly like an unrepaired one. This transparency of the redundancy repair to mission-mode operations adds a pre-silicon functional verification challenge for the repair scheme itself, since a basic read/write operation cannot distinguish data read from an original memory column/row from data read from a shifted repair column/row.
Consider the example below of a single repair column redundancy scheme. For each column position that could hold a fault and must be switched out, verification diagnostics need to be written that prove the columns to the right of the repaired column were not shifted, while the repaired column and the columns to its left were shifted.
From a redundancy verification perspective, adjacent bits must not be written to the same value during the repair test, since identical neighboring values make it impossible to prove that a memory column was actually shifted and replaced. A vector sequence that verifies repair correctness might therefore write and read an alternating 1/0 pattern to the memory entry containing the faulty bit, and then enable the redundancy mode to decode the repair address and write/read a vector that sustains the previous data in the bit positions on one side of the faulty bitline while driving the opposite vector into the bit positions on the other side of the fault location, including the replacement column.
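Continuing the toy model above (and reusing its N_COLS, repair_mux_select, and read_word), the sketch below encodes that diagnostic with bit 0 taken as the rightmost column. The pass/fail criteria are an illustrative reading of the procedure, not a vendor-prescribed test:

```python
# Checkerboard diagnostic: bits below the repair address must hold their
# data (unshifted), while the repaired column and every bit above it must
# read the opposite value, which the alternating pattern exposes only if
# the shift actually occurred.

def checkerboard(n):
    return [i % 2 for i in range(n)]

def diagnose(fault_col: int) -> bool:
    """True if the shift behaves as intended for a fault at this column."""
    phys = checkerboard(N_COLS + 1)  # alternating 1/0 over the physical array
    plain = read_word(phys, fault_col, repair_en=False)
    shifted = read_word(phys, fault_col, repair_en=True)
    for i in range(N_COLS):
        if i < fault_col and shifted[i] != plain[i]:
            return False  # low side must be untouched
        if i >= fault_col and shifted[i] == plain[i]:
            return False  # high side must differ, proving the shift happened
    return True

assert all(diagnose(col) for col in range(N_COLS))  # every column position
```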
As this multi-cycle SPICE simulation sequence must be performed for each bitline, the sheer number of binary vectors needed to ensure sufficient functional coverage makes formal verification techniques a more complete and elegant solution to this verification problem. While most formal techniques are already standard practice for gate-level designs, they assume the cell library elements instantiated in the gate-level netlists are functionally correct. Hence, these techniques are not compatible with purely transistor-level custom designs and cell libraries, where functionality may depend on both correct device connectivity and correct transistor sizing.
Symbolic simulation, however, is a formal technique that has been proven to work well with custom designs. It natively supports transistor-level netlists and any combination of behavioral, gate-level, RTL, or switch-level Verilog. As such, it has been widely used to formally verify custom designs such as SRAMs and standard cell and I/O library IP. The signature tool featuring symbolic simulation for formal verification of these kinds of designs is ESP from Synopsys.
ESP uses a combination of symbolic simulation and formal equivalence checking to prove that a Verilog representation of a custom design is functionally equivalent to its corresponding schematic SPICE netlist. In place of binary values, ESP applies a symbol to each input pin that simultaneously represents both binary values. During symbolic simulation, these symbols are propagated through both the implemented circuit design and the reference Verilog model, accruing Boolean logic equations at each gate stage along the way and culminating in an aggregate logic equation at each output pin. This equation is a complete representation of the outcomes of every possible binary vector that could have been applied during that simulation cycle.
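The following sketch illustrates the idea using Python with sympy's Boolean algebra as a stand-in; it shows the mechanics of symbol propagation and equivalence checking, not ESP's actual engine or command flow. The 2:1 mux "implementation" and "reference model" are invented for illustration:

```python
from sympy import symbols
from sympy.logic.boolalg import And, Or, Not, Xor
from sympy.logic.inference import satisfiable

a, b, sel = symbols('a b sel')  # one symbol per input pin, covering 0 and 1

# "Implementation": a 2:1 mux built from primitive gates.
impl_out = Or(And(a, Not(sel)), And(b, sel))

# "Reference model": the same function written with a redundant consensus
# term, so the two equations differ structurally but not logically.
ref_out = Or(And(a, Not(sel)), And(b, sel), And(a, b))

# Equivalence check: the XOR of the two output equations must be
# unsatisfiable, i.e. no binary input vector can make the outputs differ.
mismatch = satisfiable(Xor(impl_out, ref_out))
print('equivalent' if mismatch is False else f'counterexample: {mismatch}')
```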
Across multiple symbolic cycles, each subsequent cycle's Boolean equation also carries the symbolic vector dependencies of all prior cycles, an important verification advantage for designs with pipeline depth. As the equations are generated, a formal equivalence check is performed between the corresponding Verilog and SPICE outputs to determine whether they are equivalent.
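Extending the same sympy sketch across cycles shows this accumulation: after two symbolic cycles, the register's output equation still carries the cycle-1 symbols. The enabled-flop update rule here is a hypothetical single-stage pipeline:

```python
from sympy import symbols, false

d1, en1, d2, en2 = symbols('d1 en1 d2 en2')  # fresh symbols for each cycle

q = false                        # register state as a Boolean equation
for d, en in [(d1, en1), (d2, en2)]:
    q = (q & ~en) | (d & en)     # load d when enabled, otherwise hold

print(q)  # the cycle-2 equation still references d1 and en1 from cycle 1
```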
Even with symbolic simulation, there are verification challenges associated with redundancy repair. First, many designs either do not model the repair functionality in the memory Verilog model at all or use only a simple behavioral model, forcing silicon test teams to resort to cumbersome and potentially incomplete binary vector verification. Even when the repair scheme is modeled in Verilog, that model often does not reflect the structural detail of the redundancy address decoding logic and the repair shifting of the data muxes. Without sufficient implementation detail in the redundancy repair Verilog model, symbolic verification of the repair scheme may not even be provable.
To address this challenge, ESP offers a separate Redundancy Repair Verification Mode that obviates the need for the repair scheme to be fully specified in the Verilog model. In this mode, the tool injects a symbolic fault into the intended bitline, causing the verification to fail. Then, it derives a binary vector repair solution from the symbolic repair address to facilitate shifting away the write/read data from the faulted net, allowing the functional verification to pass. This process of injecting a fault into a bitline and identifying the repair solution is performed on all bitlines within the same symbolic cycle, requiring only a few cycles to fully verify the redundancy repair scheme. Moreover, the final simulation result will not only prove formal equivalence between Verilog and SPICE netlists both with and without redundancy repair enabled, but it will also generate a bitline/repair address solution report that can be used to qualify the correctness of the redundancy scheme intent.
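How ESP implements this mode internally is not reproduced here; the sketch below only mimics the flow on the earlier toy binary model (reusing checkerboard, repair_mux_select, and read_word): inject a fault on each physical bitline, then search for a repair address under which the diagnostic read matches the expected shifted pattern, accumulating a bitline-to-repair-address report:

```python
def find_repair(faulty_phys_col: int):
    """Return a repair address that hides the faulted bitline, or None."""
    pattern = checkerboard(N_COLS + 1)
    phys = list(pattern)
    phys[faulty_phys_col] = 'X'  # injected fault on one bitline
    for addr in range(N_COLS):
        expect = [pattern[repair_mux_select(i, addr, True)]
                  for i in range(N_COLS)]
        if read_word(phys, addr, repair_en=True) == expect:
            return addr  # fault no longer observable in the mission-mode word
    return None

# One entry per mission-mode bitline; such a bitline-to-repair-address
# report is what qualifies the intent of the redundancy scheme.
report = {col: find_repair(col) for col in range(N_COLS)}
print(report)  # in this toy scheme, each bitline repairs at its own address
```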
For further information on how to formally verify embedded memory designs along with their redundancy repair schemes using ESP, please visit the Synopsys website.