Aligning IC test and functional safety metrics for ISO 26262.
With the rapid growth in semiconductor content in today’s vehicles, IC designers need to improve their process of meeting functional safety requirements defined by the ISO 26262 standard.
The ISO 26262 standard defines levels of functional safety, known as Automotive Safety Integrity Levels (ASILs), and is a mandatory part of the automotive system design process. The ASIL categories range from A, for the least safety-critical applications, to D, for the most critical applications. For example, an in-car infotainment system will have a different ASIL requirement for ISO 26262 certification than an automated braking system (Fig 1).
Figure 1: Definition of ASIL classification based on application.
Achieving the required ASIL requires extensive simulation to analyze the design for potential random faults that could occur within it and impact its safe function. This process is similar to the fault simulation already commonly used in the design-for-test (DFT) domain. However, in the context of functional safety, not all faults are equal, and this leads to differences between DFT metrics and functional safety metrics. Until now, there has been no easy way to align the two sets of metrics. Designers can now leverage logic BIST (built-in self-test) to get accurate functional safety metrics that meet the ISO 26262 requirements.
The ISO 26262 automotive standard requires automotive IC designers to add specific circuitry, called safety mechanisms, to their designs to detect both static and transient faults during vehicle operation and then either respond or shut down safely. Safety mechanisms can take several forms, depending on the application (Fig 2). Selecting a particular safety mechanism requires good design knowledge but can be facilitated by design automation tools, which are becoming more widely adopted as part of the functional safety design flow.
Figure 2: Chip-level implementation of different functional safety mechanisms.
The ISO 26262 standard provides guidance on the effectiveness of many common safety mechanisms. The selection of a safety mechanism requires a trade-off between several factors:
Safety mechanisms like dual-core lockstep (DCLS) duplication and triple modular redundancy (TMR) have a substantial impact on silicon area, power, and cost. As a result, designers frequently look to circuitry that can be used for both functional safety and manufacturing test, including structures such as logic BIST and memory BIST.
Logic BIST, which is often already implemented on-chip as part of the manufacturing and in-system test requirements, allows automatic test equipment (ATE) access through the IEEE 1149.1 TAP controller, from which the tests can be run and monitored (Fig 3).
Figure 3: BIST infrastructure in an automotive design.
Logic BIST can be used as part of a hybrid approach with scan ATPG compression, where the two test systems share much of the same logic (Fig 4). They can also use the same scan chain structures implemented in the design for manufacturing test. Logic BIST has the advantage of generating test patterns internally, which makes it a good in-system test solution that can be used as a functional safety mechanism.
Figure 4: Hybrid ATPG / logic BIST controller architecture.
While logic BIST usually can’t reach the same test quality and coverage as ATPG without adding test points to the design to help detect random-pattern-resistant faults, it is an extremely effective safety mechanism. Logic BIST can provide very high diagnostic coverage (DC) for the logic under test and is an automated solution that is comparable to, if not better than, a custom safety mechanism.
Among the random fault types logic BIST will detect are multi-point faults, including both latent faults and detected/perceived faults. A latent fault occurs within the safety mechanism and an additional safety mechanism isn’t present to catch the fault. A detected/perceived fault is a fault in the safety mechanism that is protected by an additional safety mechanism, commonly referred to as a secondary safety mechanism.
Logic BIST commonly checks for latent faults during key-on or key-off events. But for detected/perceived faults, logic BIST runs during the functional operation of the device. Logic BIST is a destructive technology when enabled, and the operation of the circuit has to be changed to enable the scan chains to become active, destroying data stored in the design’s memory elements. Therefore, the circuit has to be reset following a logic BIST run. This disruption to the functional operation is not ideal.
Logic BIST can also check for single-point faults. Although logic BIST is destructive, a growing number of applications have windows of time in which BIST can be executed during run-time operation. It is the responsibility of the safety architect or safety manager to determine whether logic BIST can be reused to detect single-point faults.
There has been no good way to correlate the DFT metrics (test coverage and fault coverage) with functional safety metrics such as diagnostic coverage, the single-point fault metric (SPFM), and the latent fault metric (LFM). However, designers can use existing DFT and functional safety analysis software to generate the metrics required for inclusion in the Failure Modes, Effects, and Diagnostic Analysis (FMEDA) to determine accurate failure rates, failure modes, and diagnostic capabilities.
Logic BIST design-for-test metrics
Calculating the test and fault coverage of logic BIST starts with a complete list of every possible fault within the IC design, or the design block being tested, for the particular fault model being targeted. During fault simulation of the logic BIST patterns, the complete fault list is analyzed and simulated so that each potential fault is given a specific fault classification. These classifications drive the overall test coverage and fault coverage calculations.
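As a rough illustration of how fault classifications feed the two DFT metrics, the sketch below assumes a simplified convention: fault coverage is the share of detected faults in the complete fault list, and test coverage is the share of detected faults among testable faults only. The three class names and the dft_coverage helper are hypothetical; commercial fault simulators use finer-grained classes and may weight some of them differently.

```python
from collections import Counter

# Hypothetical fault classes; real tools use finer-grained categories.
DETECTED = "detected"
UNDETECTED = "undetected"
UNTESTABLE = "untestable"


def dft_coverage(classifications):
    """Return (test_coverage, fault_coverage) as fractions in [0, 1].

    classifications maps a fault name to one of the classes above.
    """
    counts = Counter(classifications.values())
    total = sum(counts.values())
    detected = counts[DETECTED]
    testable = total - counts[UNTESTABLE]
    # Fault coverage: detected faults over the complete fault list.
    fault_coverage = detected / total if total else 0.0
    # Test coverage: detected faults over testable faults only.
    test_coverage = detected / testable if testable else 0.0
    return test_coverage, fault_coverage


# Illustrative stuck-at faults (names are made up for this example).
example_faults = {
    "u1/a/sa0": DETECTED,
    "u1/a/sa1": DETECTED,
    "u1/b/sa0": UNDETECTED,
    "u1/b/sa1": UNTESTABLE,
}
test_cov, fault_cov = dft_coverage(example_faults)
print(f"test coverage  = {test_cov:.1%}")
print(f"fault coverage = {fault_cov:.1%}")
```

Under this convention, test coverage is never lower than fault coverage, because untestable faults are removed from the denominator.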
Functional safety metrics
The functional safety diagnostic coverage (DC) metrics consider only faults that could directly affect the safety goal of the application. So, the first step is to categorize the faults in the design into safe faults (λs), dangerous faults (λspf), and multi-point faults (λmpf). Fault simulation is then performed to determine whether the implemented safety mechanism successfully detects the safety-critical faults.
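To show how these failure-rate categories relate to SPFM and LFM, here is a minimal sketch based on the metric definitions in ISO 26262-5. It assumes the dangerous failure rate covered by a safety mechanism leaves a residual rate of λ × (1 − DC), and that the latent multi-point rate is supplied separately. The function names and the FIT values are illustrative, not taken from any tool or design.

```python
def residual_rate(lambda_dangerous, dc):
    """Failure rate that escapes a safety mechanism with coverage dc (0..1)."""
    return lambda_dangerous * (1.0 - dc)


def spfm(lambda_total, lambda_spf, lambda_rf):
    """Single-point fault metric: 1 - (single-point + residual) / total."""
    return 1.0 - (lambda_spf + lambda_rf) / lambda_total


def lfm(lambda_total, lambda_spf, lambda_rf, lambda_mpf_latent):
    """Latent fault metric: 1 - latent / (total - single-point - residual)."""
    return 1.0 - lambda_mpf_latent / (lambda_total - lambda_spf - lambda_rf)


# Illustrative numbers in FIT (assumed, not measured).
lambda_total = 100.0       # all safety-related failure rate
lambda_covered = 80.0      # dangerous rate covered by logic BIST
lambda_spf = 0.0           # dangerous rate with no safety mechanism
lambda_mpf_latent = 2.0    # latent multi-point rate

lambda_rf = residual_rate(lambda_covered, dc=0.99)
print(f"SPFM = {spfm(lambda_total, lambda_spf, lambda_rf):.2%}")
print(f"LFM  = {lfm(lambda_total, lambda_spf, lambda_rf, lambda_mpf_latent):.2%}")
```

The point of the sketch is that the DC value obtained from fault simulation enters the SPFM directly through the residual rate, which is why an accurate DC matters for the FMEDA.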
Calculating diagnostic coverage for logic BIST
Designers can accurately calculate the diagnostic coverage used in the single-point fault metric for a logic BIST implementation by applying the safety-critical fault list from functional safety analysis to the logic BIST fault simulation (Fig 5).
Figure 5: Functional safety logic BIST fault simulation flow.
The resulting diagnostic coverage is calculated according to this formula:
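A commonly used form of the diagnostic coverage calculation is the ratio of safety-critical faults detected by the safety mechanism to all safety-critical faults. The notation below is an assumption made for illustration, with N_det as the number of safety-critical faults detected by logic BIST and N_sc as the total number of safety-critical faults; a failure-rate-weighted version would replace the counts with the corresponding λ values.

```latex
\mathrm{DC} = \frac{N_{\mathrm{det}}}{N_{\mathrm{sc}}} \times 100\%
```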
The logic BIST functional safety flow is used to add the functional safety fault classifications to the different structures in the design, which enables the correct level of reporting for the ISO 26262 functional safety metrics.
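The flow described above can be sketched in a few lines: take the safety-critical fault list from functional safety analysis, look up each fault in the logic BIST fault simulation results, and report the detected fraction. The data layout, the "detected" label, and the diagnostic_coverage helper are hypothetical and stand in for whatever the analysis and DFT tools actually export. The sketch counts faults without weighting them by failure rate.

```python
def diagnostic_coverage(safety_critical_faults, bist_results):
    """Fraction of safety-critical faults that logic BIST detects.

    safety_critical_faults: iterable of fault names from safety analysis.
    bist_results: dict mapping fault name -> classification string.
    Faults missing from bist_results are treated as undetected.
    """
    critical = set(safety_critical_faults)
    if not critical:
        raise ValueError("safety-critical fault list is empty")
    detected = {f for f in critical if bist_results.get(f) == "detected"}
    return len(detected) / len(critical)


# Illustrative logic BIST fault simulation results (made-up names).
bist_results = {
    "u1/a/sa0": "detected",
    "u1/a/sa1": "detected",
    "u1/b/sa0": "undetected",
    "u2/c/sa0": "detected",     # not safety-critical, so it is ignored
    "u2/c/sa1": "untestable",   # not safety-critical, so it is ignored
}
safety_critical = ["u1/a/sa0", "u1/a/sa1", "u1/b/sa0"]

dc = diagnostic_coverage(safety_critical, bist_results)
print(f"diagnostic coverage = {dc:.1%}")
```

Note how the DFT fault coverage for the same results (3 of 5 faults) differs from the diagnostic coverage (2 of 3 safety-critical faults), which is the gap between the two sets of metrics that the flow is meant to close.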
Functional safety mechanisms are a critical part of the safety design phase for any automotive IC. Logic BIST is a very efficient safety mechanism, providing high coverage for digital logic-based IP. Because it can double as part of the manufacturing test solution, using logic BIST as a safety mechanism also reduces area overhead. Integrating functional safety analysis tools and DFT technology improves the ISO 26262 certification process by allowing designers to accurately extract the diagnostic metrics.