Early Cycle Analysis And Verification Of Logical SEU Mitigation

Traditional approaches of redundancy within the chip are quickly becoming cost prohibitive.

The global appetite for data continues to soar, driving innovation across all industry sectors, including how space-based technology can facilitate a more connected world.

Miniaturized satellites configured into constellations offer lower-latency communication and higher bandwidth than lone satellites flying higher in geosynchronous or high-earth orbits. However, industry analysis suggests that making this business model viable requires holistic cost reduction, including cost optimizations in the development and fabrication of the semiconductors used within these platforms.

Evolving mission parameters have spawned new interest in leveraging commercial, high-reliability processes that meet survivability requirements while simultaneously providing higher performance, lower power consumption, and more functionality per die, all at lower cost and on a shorter development timeline.

Unfortunately, the advantages of smaller geometries come with drawbacks, one of which is that the underlying hardware is more susceptible to soft errors, commonly referred to as single event upsets (SEUs). Traditional approaches that apply redundancy or triplication to salient (if not all) functions within the chip are quickly becoming cost prohibitive.

Fortunately, new flows and automation provide project teams early insights into the effectiveness of logical SEU mitigation and offer the ability to optimize the SEU mitigation architecture, also referred to as selective hardening.

Fig. 1: Driving trends to selective radiation mitigation.

First, let’s review the challenges.

Selective hardening challenges

Feedback from the aerospace industry suggests that the traditional approach to logical SEU mitigation has many pitfalls and leaves two important questions unanswered.

  1. For the design elements known to be mission critical, how effective is the implemented mitigation?
  2. How can the potential for failure due to faults in unprotected design elements be identified?

The traditional approach to logical SEU mitigation is best summarized in a three-step workflow.

  1. Identify failure points through expert-driven analysis
  2. Design engineers insert the mitigation (HW and/or SW)
  3. Verify the effectiveness of the mitigation
    • Simulation leveraging functional regressions and force commands to inject SEUs
    • Post-silicon functional testing under heavy ion exposure

Fig. 2: The traditional approach to logical SEU mitigation.

Unfortunately, the traditional approach has multiple drawbacks, including:

  • No common metric for measuring the effectiveness of SEU mitigation.
  • Expert-driven analysis is neither repeatable nor scalable as complexity rises.
  • Manually forcing faults in functional simulation requires substantial engineering effort.
  • An inability to analyze the complete fault state space using functional simulation and force statements.
  • Late-cycle identification of failures during testing in a beam environment, with limited debug visibility when they occur.

Automation and workflows supporting selective hardening

The overarching objective of selective hardening is to protect design functions which are critical to mission function and save on cost (power and area) by leaving non-critical functions unprotected. Boiling that down a level, the methodology has three aims:

  1. Provide confidence early in the design cycle that the mitigation is optimal.
  2. Provide empirical evidence that what is left unprotected cannot result in abnormal behavior.
  3. Deliver a quantitative assessment detailing the effectiveness of the implemented mitigation.

Siemens has developed a methodology and integrated workflow to deliver a systematic approach in measuring the effectiveness of existing mitigation as well as determining the criticality of unprotected logic. The workflow is broken up into four phases.

Fig. 3: The Siemens logical SEU mitigation workflow.

Structural Partitioning: The first step in the flow leverages structural analysis engines to evaluate design functions in combination with the implemented hardware mitigation protecting the function. The output of structural partitioning is a report indicating the effectiveness of the existing hardware mitigation as well as insights into the gaps which exist.

Fault Injection Analysis: Mitigations that could not be verified structurally are candidates for fault injection. In this phase, SEUs are injected and propagated, and their impact is evaluated. The output of fault injection analysis is a fault classification report listing which faults were detected by hardware or software mitigation and which were not.
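To make the idea of a fault classification report concrete, here is a minimal sketch in Python. It is not the Siemens tooling: it assumes a hypothetical toy model of an 8-bit register, optionally protected by a single even-parity bit, and flips each stored bit once to see whether the parity checker flags the upset. The function names and the register model are illustrative assumptions.

```python
from collections import Counter


def parity(value: int) -> int:
    """Return 1 if the number of set bits in value is odd, else 0."""
    return bin(value).count("1") & 1


def inject_and_classify(data: int, width: int, protected: bool) -> Counter:
    """Flip each stored bit once and classify the outcome.

    Hypothetical toy model: a register of `width` data bits, optionally
    extended with one parity bit that a checker compares on read.
    """
    stored_parity = parity(data)
    results = Counter()
    sites = width + (1 if protected else 0)
    for site in range(sites):
        faulty_data, faulty_parity = data, stored_parity
        if site < width:
            faulty_data ^= 1 << site   # SEU in a data bit
        else:
            faulty_parity ^= 1         # SEU in the parity bit itself
        if protected and parity(faulty_data) != faulty_parity:
            results["detected"] += 1
        else:
            results["undetected"] += 1
    return results


protected_report = inject_and_classify(0b10110010, width=8, protected=True)
unprotected_report = inject_and_classify(0b10110010, width=8, protected=False)
print("protected:  ", dict(protected_report))
print("unprotected:", dict(unprotected_report))
```

In this toy model every single-bit upset in the parity-protected register is detected, while every upset in the unprotected register goes unnoticed. A real flow would inject faults into the actual design and would also have to account for faults that are masked and never affect behavior.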

Propagation Analysis: The SEU sites left unprotected are evaluated structurally under expected workload stimulus to determine the criticality of each site and the probability that an upset there results in functional failure. The output of propagation analysis is a list of currently unprotected faults that were found to impact functional behavior.
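As a rough illustration of per-site criticality, the sketch below flips each unprotected state bit under a small workload and records how often the observed output changes. This is a brute-force stand-in under stated assumptions, not the structural analysis the flow itself performs: the `output` function, the three sites `a`, `b`, and `c`, and the four workload states are all hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]


def output(state: State) -> int:
    """Hypothetical downstream function observed by the mission logic."""
    return (state["a"] & state["b"]) | state["c"]


def site_criticality(func: Callable[[State], int],
                     workload: List[State]) -> Dict[str, float]:
    """Fraction of workload states in which flipping a site changes the output."""
    scores = {}
    for site in workload[0]:
        failures = 0
        for state in workload:
            upset = dict(state)
            upset[site] ^= 1           # single-bit upset at this site
            if func(upset) != func(state):
                failures += 1
        scores[site] = failures / len(workload)
    return scores


# Assumed workload: the register states the design is expected to visit.
workload = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 1, "c": 1},
]

scores = site_criticality(output, workload)
for site, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"site {site}: criticality {score:.2f}")
```

With this workload, site `c` is the most critical (0.75), followed by `b` (0.50) and `a` (0.25), which suggests where hardening effort would pay off first in this toy case. The ranking depends entirely on the workload, which is why the analysis is tied to expected stimulus.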

Metrics Computation: Data from structural, injection, and propagation analysis feed the metrics computation engine and visualization cockpit. The cockpit provides visual insights into failure rate, the effectiveness of the mitigation, and any gaps that exist.
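The kind of figures such a cockpit might present can be sketched with two simple calculations. Both formulas are illustrative assumptions rather than the metrics the Siemens flow computes: coverage is taken as the share of injected faults that were detected, and the residual failure rate is taken as a raw per-site upset rate weighted by per-site criticality. All input values are placeholders.

```python
from typing import Dict


def mitigation_coverage(detected: int, undetected: int) -> float:
    """Assumed metric: fraction of injected faults caught by the mitigation."""
    total = detected + undetected
    return detected / total if total else 0.0


def residual_failure_rate(raw_upset_rate: float,
                          criticality: Dict[str, float]) -> float:
    """Assumed estimate: raw per-site upset rate weighted by site criticality."""
    return raw_upset_rate * sum(criticality.values())


# Placeholder inputs, not measured data.
coverage = mitigation_coverage(detected=90, undetected=10)
rate = residual_failure_rate(1e-9, {"a": 0.25, "b": 0.5, "c": 0.75})

print(f"mitigation coverage: {coverage:.0%}")
print(f"residual failure rate: {rate:.2e} (placeholder units)")
```

A single coverage number of this kind is one way to address the lack of a common metric noted earlier, since it can be tracked from one design revision to the next.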

Every semiconductor development program has unique characteristics. The methodology described above is flexible and highly configurable, allowing project teams to adjust as needed.

Conclusion

Mitigation of single event upsets continues to challenge even the most veteran project teams, and this challenge is exacerbated as design complexity rises and technology nodes shrink. New methodologies exist to provide quantitative results detailing the effectiveness of SEU mitigation.

For a more detailed view of the Siemens SEU methodology and the challenges it will help you overcome, please refer to the white paper, Selective radiation mitigation for integrated circuits, which can also be accessed at Verification Academy: Selective Radiation Mitigation.


