Ditch The Glitch

Techniques and tools for improving test robustness and silicon health.


To support the ever-growing performance demands of cutting-edge automotive and hyperscaler applications, SoC complexity continues to increase, and the emergence of multi-die technology has compounded it. To keep up with these demands, design-for-test (DFT) logic must also evolve to ensure greater levels of test robustness and silicon health. The “shift left” concept, which focuses on doing everything earlier in the design, verification, and analysis phases, aims to improve manufacturing test quality and reduce cost. This is achievable, however, only if the tools can provide the level of reliability analysis required to make the process viable. Glitch detection and X-capture checks are two DFT techniques that can effectively be moved earlier in the design process with the right DFT strategy.

Glitch detection is the process of identifying and preventing glitches at the register-transfer level (RTL). Glitches can be caused by a number of factors, including asynchronous signal transitions, clock domain crossings (CDC), sequential logic, and timing errors.

What are RTL glitches?

RTL glitches can be defined as spikes or dips in a signal, usually of very short duration. They can be caused by several factors, including defects in the device being tested or noise interference. Undetected glitches can lead to missed defects and false failures.

There are two main types of glitch: static and dynamic. In a static glitch, the signal temporarily changes value when it is supposed to remain static at logic 1 or 0. In a dynamic glitch, the signal oscillates for a very short time while changing from 0 to 1 or from 1 to 0 before settling at its final value.
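The static/dynamic distinction can be illustrated with a minimal sketch. It assumes a signal has been sampled into a short window of 0/1 values whose first and last samples are the intended start and end values. The function names are hypothetical and are not part of any tool mentioned here.

```python
def count_transitions(samples):
    """Count value changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


def classify_glitch(samples):
    """Classify a short 0/1 waveform window as 'none', 'static', or 'dynamic'.

    Assumption: samples[0] is the intended starting value and samples[-1]
    is the intended final value of the signal in this window.
    """
    if len(samples) < 2:
        return "none"
    transitions = count_transitions(samples)
    if samples[0] == samples[-1]:
        # The signal should have stayed constant; any toggle is a static glitch.
        return "static" if transitions > 0 else "none"
    # The signal should change exactly once; extra toggles are a dynamic glitch.
    return "dynamic" if transitions > 1 else "none"
```

For example, `classify_glitch([1, 1, 0, 1, 1])` reports a static glitch on a signal that should stay at 1, while `classify_glitch([0, 1, 0, 1, 1])` reports a dynamic glitch on a 0-to-1 transition.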

A glitch may also occur through a latch: unless the latch EN pin is explicitly disabled, the latch is treated as potentially transparent and the path is traced through its D pin. Any combinational logic in the clock path is another potential source of glitches. Glitch checks are also sensitive to the unateness of the paths. Catching glitches early is essential: if problems are first detected at the gate-level simulation (GLS) stage, they are far more difficult to fix.

Compared to glitches, X values can be even harder to catch. X-capture checks can be used to verify that the device under test is properly capturing data from the test environment and behaving as expected in the presence of unknown values.

Challenges

There are multiple challenges associated with RTL glitches.

  • Glitches are hard to detect: their short duration makes them easy to miss with traditional simulation techniques.
  • They can have a significant impact on the final design. Even a small glitch can cause the design to malfunction, resulting in data corruption and other related errors.
  • They can be difficult to reproduce in a simulation or a test. Glitches can be caused by a wide variety of factors, which we will cover in the next part of the blog.
  • They can be difficult to fix. Once a glitch is identified, it is hard to remove without making significant changes to the design.
  • It is hard to differentiate real glitch scenarios from false ones.
  • Glitches in a design contribute between 30% and 70% of the total power consumption.

There are other scenarios where glitches can occur. Reconvergence on the clock line feeding a flop is one possible source. Another is multiple clocks feeding the same combinational logic.

There are several different classifications of RTL glitches:

Clock merge

Merging of test clocks may reduce the clock pulse width such that the pulse does not have enough energy to trigger a launch or capture flip-flop. Clock merging can also reduce test coverage, as clocks are pulsed only when required for scan operation. In functional mode, the clock gating cells ensure that only a single clock propagates. In test mode, both clock gates are enabled, which leads to ATPG pattern failures on the tester.

During ATPG, dynamic clock grouping is enabled by default, which helps reduce the pattern count. However, when two clocks are grouped together, a flop whose data input is affected by another clock domain may capture an incorrect value. These disturbed flops therefore have to be masked, but masking in turn propagates an X for some patterns. No-disturb clock grouping can be applied to avoid this scenario.
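One way to picture no-disturb grouping is as a conflict-avoiding partition: two clocks may share a group only if no data path crosses between their domains. The sketch below is an illustrative greedy approach under that assumption, not the algorithm any ATPG tool actually uses. The function name, the clock names, and the `crossings` input format are all hypothetical.

```python
def group_clocks(clocks, crossings):
    """Greedily group clocks so no group holds two interacting domains.

    clocks: ordered list of clock names.
    crossings: iterable of (launch_clock, capture_clock) pairs, one per
        cross-domain data path; every name must also appear in `clocks`.
    """
    conflicts = {clk: set() for clk in clocks}
    for launch, capture in crossings:
        if launch != capture:
            conflicts[launch].add(capture)
            conflicts[capture].add(launch)

    groups = []
    for clk in clocks:
        for group in groups:
            # Join the first group that contains no conflicting clock.
            if conflicts[clk].isdisjoint(group):
                group.append(clk)
                break
        else:
            groups.append([clk])
    return groups
```

With clocks `clk_a` to `clk_d` and crossings between `clk_a`/`clk_b` and `clk_c`/`clk_a`, the sketch pulses `clk_a` together with `clk_d`, and `clk_b` together with `clk_c`. A greedy pass like this does not guarantee the minimum number of groups; it only guarantees that no group mixes interacting domains.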

Advanced clock grouping using a clock stagger approach can improve per-pattern coverage within an interval, which can solve most test-time-related issues.

Fig. 1: Clock merging glitch.

Reset glitch

A glitch on a combinational gate can occur whenever two signals toggle simultaneously or when the same signal reconverges through non-unate paths. It is important for the design endpoints to be safe from glitches; for example, an asynchronous reset should never have a glitch that momentarily resets a flop. Such glitches on the reset pin can lead to a metastable state and eventual chip failure.

Fig. 2: Example of a glitch propagation.
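To see how non-unate reconvergence produces a pulse, consider a toy unit-delay model in which a signal `a` reaches an AND gate both directly and through an inverter. Logically the output is always 0, but the inverter delay opens a short window in which both inputs are 1. This is a simplified sketch with hypothetical names, not a timing-accurate simulation.

```python
def simulate_reconvergence(a_wave, inv_delay=1):
    """Model y(t) = a(t) AND NOT a(t - inv_delay) on a sampled waveform.

    a_wave: list of 0/1 samples of the source signal.
    inv_delay: inverter delay in samples (assumed, for illustration).
    Samples before t = 0 are assumed equal to a_wave[0].
    """
    y = []
    for t, a in enumerate(a_wave):
        a_delayed = a_wave[max(t - inv_delay, 0)]
        y.append(a & (1 - a_delayed))
    return y


# A rising edge on `a` yields a one-sample pulse on an output that
# should stay at 0.
print(simulate_reconvergence([0, 0, 1, 1, 1]))  # prints [0, 0, 1, 0, 0]
```

If `y` drove an asynchronous reset, that single-sample pulse is the kind of momentary assertion the paragraph above warns about.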

DFT logic glitch

The illegal reconvergence check looks for combinational reconvergence to or from an arbitrary node. The user can define arbitrary nodes such as async clamps, analog-to-digital paths, or other paths prone to glitches. Unless the latch enable pin is explicitly disabled, the latch is treated as potentially transparent and the path is traced through its D pin, which can cause a glitch at the output.

Fig. 3: Glitch propagation to/from arbitrary node example.
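A structural reconvergence check of this kind can be sketched as a graph search. The example assumes the netlist is available as a fan-out map, that the combinational logic is acyclic, and that flops and explicitly disabled latches are passed in as blocking nodes. All names are hypothetical, and a production checker would also account for unateness.

```python
def find_reconvergence(fanout, source, blocking=frozenset()):
    """Return nodes where two or more combinational paths from `source` meet.

    fanout: dict mapping a node name to the list of nodes it drives.
    source: user-defined start node (e.g. an async clamp).
    blocking: nodes that stop tracing (flops, latches with EN disabled).
    """
    reachable = set()
    stack = [source]
    while stack:
        node = stack.pop()
        for nxt in fanout.get(node, []):
            if nxt in blocking or nxt in reachable:
                continue
            reachable.add(nxt)
            stack.append(nxt)

    on_cone = reachable | {source}
    hits = []
    for node in sorted(reachable):
        # Drivers of `node` that lie on the fan-out cone of the source.
        drivers = {d for d, outs in fanout.items() if node in outs and d in on_cone}
        if len(drivers) >= 2:
            hits.append(node)
    return hits
```

For a netlist where `clamp` drives both a latch `lat` and a gate `and1`, and `lat` also drives `and1`, the check flags `and1` while the latch is treated as transparent. Passing `blocking={"lat"}` models an explicitly disabled latch, and the reconvergence disappears.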

User-defined Point2point glitch

In this case, the user can define arbitrary nodes, for example an async clamp, an analog-to-digital path, or other paths prone to glitches. The illegal reconvergence constraint checks for the presence of combinational reconvergence from or to an arbitrary node.

Edge inconsistency glitches

Edge inconsistency glitches can occur where there are cascaded CDCs. In these situations, some of the edges could be corrupted. A dedicated rule identifies those occurrences.

Fig. 4: Edge inconsistency / bus contention glitches.

Mode transition

Glitch issues may also arise from mode transitions. For example, a signal whose value is set to 1 in scan shift mode may move to 0 in capture mode, and that change can itself cause glitch activity.

Fig. 5: Clock and reset paths can be glitch-prone due to mode transitions.
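A first-pass screen for mode-transition risk is to compare the constant values applied in each mode and list the signals that change. The sketch below assumes the per-mode constraints are available as simple name-to-value dictionaries. The signal names and the notion of a "sensitive" set (signals feeding clock or reset paths) are illustrative assumptions.

```python
def mode_transition_candidates(shift_values, capture_values, sensitive_signals):
    """List sensitive signals whose constrained value differs between modes.

    shift_values / capture_values: dicts of signal name -> 0 or 1.
    sensitive_signals: names of signals that feed clock or reset paths.
    Signals missing from either mode are skipped, since no comparison
    can be made for them.
    """
    return sorted(
        name
        for name in sensitive_signals
        if name in shift_values
        and name in capture_values
        and shift_values[name] != capture_values[name]
    )
```

With `scan_en` and `clk_sel` at 1 in shift and 0 in capture, and `test_mode` held at 1 in both, the function returns `["clk_sel", "scan_en"]` as the signals worth inspecting for glitch-prone transitions.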

Solution

The ideal glitch detection solution would provide a wide range of checks and X-capture capabilities to ensure the reliability and quality of the design, for example clock merge and DFT logic glitch checks. Various glitch scenarios should also be detectable with different sets of rules at both RTL and gate level. The solution should also suppress glitches when the inertial delay of the gate exceeds the differential input delays. Other useful features would include rules that specifically cover CDC and RDC scenarios, as well as options for finding and addressing missing synchronizers.
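The inertial-delay suppression rule can be expressed as a simple comparison. This is a minimal sketch that assumes the width of a potential glitch pulse equals the difference between the two input arrival times, with all values in picoseconds. The function names and the candidate tuple format are hypothetical.

```python
def glitch_propagates(arrival_a_ps, arrival_b_ps, inertial_delay_ps):
    """Return True if a pulse caused by input skew survives the gate.

    The pulse is suppressed when the gate's inertial delay exceeds the
    differential input delay (the skew between the two arrivals).
    """
    skew = abs(arrival_a_ps - arrival_b_ps)
    return skew > 0 and skew >= inertial_delay_ps


def filter_reported(candidates, inertial_delay_ps):
    """Keep only the candidate gates whose glitch is not filtered out.

    candidates: list of (gate_name, arrival_a_ps, arrival_b_ps) tuples.
    """
    return [
        name
        for name, a, b in candidates
        if glitch_propagates(a, b, inertial_delay_ps)
    ]
```

With a 50 ps inertial delay, a gate whose inputs arrive 30 ps apart is dropped from the report, while one with 80 ps of skew is kept. Filtering of this kind is one way a checker can reduce the number of false glitch reports.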

Glitch detection and X-capture are now available as new features of Synopsys’ testability analysis solution, Synopsys TestMAX Advisor. The new test robustness capability improves the reliability of the test itself.

As well as the latest glitch detection and X-capture capabilities, Synopsys TestMAX Advisor also excels in other areas of testability analysis:

Fig. 6: The feature set for a complete testability analysis solution.

DFT Violation Checks – Detects and reports the parts of the RTL code that can cause testability issues, such as clock violations, reset violations, and many more. Designers can then implement changes to ensure the design is test-ready.

ATPG Coverage Estimation – Provides a coverage estimate for each block. If the estimate is lower than the targeted coverage, the designer can investigate the testability issues in that particular block and fix them early on.

Test Point Selection – Identifies areas in the design that are ATPG-testable but hard to test, and provides a report indicating coverage improvement versus test point count.

Connectivity Validation – Analyses and validates path connections and flags any path that is illegal. This feature goes beyond DFT connectivity checks and extends to general connectivity checks in the design.

Built on SpyGlass technology, Synopsys TestMAX Advisor provides an easy-to-use and comprehensive process for resolving RTL design issues, ensuring high-quality RTL with fewer design bugs. In addition, the method leads to fewer but more meaningful violations, saving time for the designer.

TestMAX Advisor forms part of the wider Synopsys Test and Silicon Lifecycle Management (SLM) solution, which encompasses integrated tools, IP, and methodologies to monitor, test, and analyze SoCs, providing actionable insights at every phase of the device lifecycle. These innovative test and analytics tools enable a unified flow that is securely connected to Synopsys’ Fusion Compiler for deep insights, from in-design to in-field, meeting design, test, and operational goals for the entire lifespan of a silicon device.

Conclusion

RTL glitch monitoring is extremely important because it can help to quickly identify and diagnose glitches in the design. Glitch monitoring and detection can be used to detect glitches in real time which can help mitigate the risks of failure.

Glitch monitoring can also be a valuable tool for improving the reliability and robustness of the test itself. By detecting and diagnosing glitches, DFT engineers and RTL designers can prevent them from causing issues in the first place, which in turn can help to improve the quality and performance of the design and minimise data corruption and development costs.

To learn more about Synopsys TestMAX solutions, visit our website.


