Understand the different DFT technologies to know when to insert them into a design.
In the real world, we are slaves to our environment. The decisions we make are dependent on the resources available at any given time. In school, I remember coming up with a binary decision diagram (BDD) variable-ordering algorithm that relied on partial BDDs. Was that the best algorithm to determine the variable ordering of a BDD for a design? Probably not. However, it was easy to do as a colleague was doing research on partial BDDs, and I had the software available that he had written. When I wrote the paper, it was written as though it were the best way to determine the variable ordering of a BDD. If you pay attention to the technology with respect to the environment where it was developed, you will see the relationship between the two in nearly every situation.
Should IC test be completely implemented at the RTL level? This question keeps coming up, and one needs to pause and ask why. Is it because one vendor does everything at the RTL level? Has the solution simply been tied to the environment it came from, or is RTL really the best location for all the test tasks? In this post, we look at some of the technologies implemented in IC test and attempt to determine where each of them should be implemented, independent of the environment.
Choices available
Any test logic inserted at the RTL level has the advantage that it can be functionally verified at a very early stage in the design. Furthermore, there are benefits to having the test logic treated the same way as functional logic during the remainder of the flow.
IC testing is about manufacturing defects. Given the high-quality requirements for IC testing, there is a need for test functionality to be designed for the exact implementation. The further the test is done from the exact layout of the design, the less effective the solution.
The gate level (logical gates such as AND/OR/INV/FF) is where test generation and fault simulation algorithms have found efficiencies that enable automation for very large designs.
Scan insertion
The most basic type of DFT done in IC testing is scan insertion, where FFs are connected in single or multiple chains to load test stimulus and unload the response. Scan insertion introduces a MUX at every FF, and the timing of that MUX must be incorporated in the functional optimizations made on the design. Furthermore, the scan chains created today reuse the functional shift registers and multi-bit FFs of the design, whose optimization is determined during synthesis. If scan chains were defined at the RTL level, it would be very difficult to merge the separate functional description of the scan chain with the optimizations done in logic synthesis. Hence, scan insertion is best done during the synthesis process. Since the scan design requirements are easily specified by a number of chains or a maximum chain length constraint, scan becomes one of the few IP creation constructs performed in a synthesis environment, which is otherwise a translation function.
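To make the chain-length constraint concrete, here is a minimal sketch of how scan segments might be packed into chains. It is not how any particular synthesis tool does it: the function name `build_scan_chains`, the segment names, and the greedy first-fit strategy are all assumptions for illustration. A segment longer than one FF stands in for a functional shift register or multi-bit FF that must stay intact.

```python
def build_scan_chains(segments, max_chain_length):
    """Greedily pack scan segments into chains (illustrative only).

    segments: list of (name, length) tuples. A length greater than 1
    models a functional shift register or multi-bit FF that cannot be
    split across chains.
    Returns (chains, lengths): segment names per chain and FF count per chain.
    """
    chains = []   # each chain is a list of segment names
    lengths = []  # running FF count of each chain
    # Place the longest segments first so they are not stranded later.
    for name, length in sorted(segments, key=lambda seg: -seg[1]):
        if length > max_chain_length:
            raise ValueError(f"segment {name} exceeds the maximum chain length")
        for i, used in enumerate(lengths):
            if used + length <= max_chain_length:
                chains[i].append(name)
                lengths[i] += length
                break
        else:
            # No existing chain has room, so open a new one.
            chains.append([name])
            lengths.append(length)
    return chains, lengths


# Hypothetical design: one 4-bit shift register, one 2-bit FF, three single FFs.
example_segments = [("shreg_a", 4), ("mbff_b", 2),
                    ("ff_c", 1), ("ff_d", 1), ("ff_e", 1)]
example_chains, example_lengths = build_scan_chains(example_segments, 5)
```

Because the segments only exist after synthesis has decided which registers become shift registers or multi-bit cells, a packing step like this cannot be run meaningfully on the RTL description.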
Scan also requires the chains to be layout-aware. Flows today allow FFs to be swapped across scan chains through a scandef flow between synthesis and layout, so scan chain functionality can remain in the synthesis step without disrupting the full flow.
Wrappers
Wrappers are special scan chains that isolate blocks from one another during the test application process. A test scheduling strategy of blocks determines where and how wrappers are to be inserted in a design. As a result, the locations of the wrappers are known at the early stages of a design. Due to the overhead of the wrapper FFs, it is important to merge the wrapper FFs with functional FFs when the functional FFs are on the boundary of the wrapped hierarchy. This merging of sequential elements is not a normal function of a synthesis step, and, hence, would require special synthesis implementation. Synthesis of large designs that use wrappers is done on a block-by-block basis where wrapper insertion best fits the flow. Therefore, wrapper insertion is best performed in the intermediate step of synthesis when synthesis is working on blocks.
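The merging decision can be sketched as a simple per-port plan. This is only an illustration under stated assumptions: the function names, the `wrp_` naming prefix, and the input format (each boundary port mapped to the functional FF that directly registers it, or `None`) are hypothetical and not taken from any tool.

```python
def plan_wrapper_cells(boundary_ports):
    """Decide, per boundary port, whether to share a functional FF.

    boundary_ports: dict mapping port name -> name of the functional FF
    that directly registers the port, or None if there is no such FF.
    Returns a dict mapping port name -> (kind, cell_name).
    """
    plan = {}
    for port, boundary_ff in boundary_ports.items():
        if boundary_ff is not None:
            # Reuse the existing FF as the wrapper cell: no extra FF.
            plan[port] = ("shared", boundary_ff)
        else:
            # No FF on the boundary: a dedicated wrapper FF is needed.
            plan[port] = ("dedicated", f"wrp_{port}")
    return plan


def added_ff_count(plan):
    """Number of extra FFs the wrapper adds to the block."""
    return sum(1 for kind, _ in plan.values() if kind == "dedicated")


# Hypothetical block with three ports, two of them directly registered.
example_plan = plan_wrapper_cells(
    {"data_in": "u_core/in_reg", "valid_in": None, "data_out": "u_core/out_reg"})
```

In this example only `valid_in` costs an additional FF, which is the overhead that merging is meant to reduce.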
Test points
Test points represent the addition of logic gates for controllability and additional fanouts on nets for observability in the functional logic of a design. Since control points and observe points differ in their timing impact, it is important to trade off the type of test point against the timing criticality of the paths it impacts. Thus, one cannot identify a test point independently of the timing information of the design. The insertion of test points is best done during synthesis.
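As a rough sketch of that trade-off, the snippet below filters the test point types a net can tolerate based on its timing slack. The delay figures and names are invented for illustration. The only thing carried over from the text is that the two types have different timing costs. Here the control point is assumed to be the more expensive one, because it adds a gate in the functional path, while an observe point only adds fanout load.

```python
# Assumed timing penalties in picoseconds; real values depend on the
# library and the layout, and are not taken from the article.
CONTROL_POINT_DELAY_PS = 40  # extra gate inserted in the functional path
OBSERVE_POINT_DELAY_PS = 10  # extra fanout load on the net


def allowed_test_points(slack_ps):
    """Return the test point types a net with this slack can absorb."""
    allowed = []
    if slack_ps >= OBSERVE_POINT_DELAY_PS:
        allowed.append("observe")
    if slack_ps >= CONTROL_POINT_DELAY_PS:
        allowed.append("control")
    return allowed


def filter_candidates(candidates):
    """candidates: list of (net, wanted_type, slack_ps) tuples.

    Keeps only the candidates whose wanted type fits the available slack.
    """
    return [(net, kind) for net, kind, slack in candidates
            if kind in allowed_test_points(slack)]
```

Without slack numbers from synthesis, neither function has anything to work with, which is the dependency on timing information described above.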
Another aspect of test points is the analysis of what test points to add in a design. The goal of a test point is to improve the fault coverage of a design or improve the test pattern count of automatic test pattern generation (ATPG) for the same fault coverage. Test point identification should only be done at the same level as ATPG, which means that test point analysis should be done at the gate level. Given the sensitivity of the test point to the algorithms performed during test generation, it is also important that the netlist used to determine the test points is the same as that when ATPG is performed. Thus, one should not use a different synthesis engine to obtain netlists for test point identification than the one used to create the final netlist for ATPG runs.
Compression
Codecs for scan compression sit in the scan path between the scan inputs/outputs and the internal scan chains. The codec itself is defined by the number of scan chains it is connected to. Theoretically, the number of scan chains is determined by #FFs/max-chain-length. If this math always held, the codec could be defined before the scan chains are constructed. However, it is subject to many design constraints such as clocking, power, use of pre-existing scan segments and use of multi-bit FFs. As a result, when one expects n scan chains, the design in reality ends up with slightly more or slightly fewer. The codec and the scan construction therefore need to be kept in sync. To prevent ugly multi-pass flows, the codec should be constructed after the number of scan chains is determined during the synthesis flow.
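A small worked example shows how a single constraint can move the chain count away from the theoretical figure. The sketch assumes one constraint only, that FFs from different clock domains are not mixed in a chain. The function names and the FF counts are made up for illustration.

```python
import math


def ideal_chain_count(num_ffs, max_chain_length):
    """Theoretical chain count: #FFs / max-chain-length, rounded up."""
    return math.ceil(num_ffs / max_chain_length)


def constrained_chain_count(ffs_per_domain, max_chain_length):
    """Chain count when each chain may hold FFs from one clock domain only."""
    return sum(math.ceil(n / max_chain_length)
               for n in ffs_per_domain.values())


# Hypothetical design: 2000 FFs spread over three clock domains.
domains = {"clk_a": 1050, "clk_b": 520, "clk_c": 430}
total_ffs = sum(domains.values())

expected = ideal_chain_count(total_ffs, 100)    # 20 chains
actual = constrained_chain_count(domains, 100)  # 11 + 6 + 5 = 22 chains
```

A codec sized for 20 chains ahead of time would not match the 22 chains that scan construction produces here. Power, pre-existing segments and multi-bit FFs can shift the number further in either direction.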
While workarounds to the mismatch in scan chains to the codec can be determined, they only work for a few limited scenarios. If the codec is completely combinational, a codec can be created before the scan chains without complicating the flow by building a codec that can connect to more scan chains than the flow can create. In such a case, the synthesis optimization would remove the dangling logic of the codec after the scan chains are connected. This solution would not work for sequential codecs in the general case.
The codec-to-scan chain connection represents a high fanout connection and a situation where widely distributed scan chains are brought together. This creates a problem for layout, especially in the high scan compression implementations. Thus, it is very important that the codec implementation is in sync with the physical flow of a digital design. Since synthesis deals with physical information, the codec implementation should work with it in the synthesis flow.
MBIST
Memory BIST (MBIST) is the predominant way embedded memories are tested in a design. Given that all the information is available very early in the design flow, the ideal location for MBIST implementations is in the RTL such that it can be validated early on in the design.
LBIST
Like compression, logic BIST (LBIST) functionality sits in the scan path, connecting some sequential logic to many scan chains. As a result, it is ideally implemented at the same location as scan compression. Given that several designs implement both LBIST and scan compression and there are benefits in merging the two IPs, LBIST is best implemented in exactly the same way as scan compression.
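To illustrate what "sequential logic connected to many scan chains" can look like, here is a toy pattern generator built from a 4-bit linear feedback shift register (LFSR). The article does not say which structure is used, so the LFSR, its width, its feedback taps and the direct bit-to-chain mapping are all assumptions. A real LBIST controller would be much wider and would typically place a phase shifter between the generator and the chains.

```python
def lfsr_step(state):
    """Advance a 4-bit Fibonacci LFSR by one shift cycle."""
    feedback = (state ^ (state >> 1)) & 1  # XOR of the two low-order bits
    return (state >> 1) | (feedback << 3)  # shift right, feedback into the MSB


def prpg_patterns(seed, num_chains, cycles):
    """Bits driven into each scan chain on every shift cycle.

    Chain c is simply fed LFSR bit (c mod 4); this direct mapping is a
    simplification for illustration.
    """
    if not 0 < seed < 16:
        raise ValueError("seed must be a non-zero 4-bit value")
    state = seed
    rows = []
    for _ in range(cycles):
        rows.append([(state >> (c % 4)) & 1 for c in range(num_chains)])
        state = lfsr_step(state)
    return rows


def lfsr_period(seed):
    """Count the shift cycles until the LFSR returns to its seed."""
    state = lfsr_step(seed)
    steps = 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps
```

With these taps the register cycles through all 15 non-zero states before repeating. The fan-out from a few state bits to `num_chains` chains is the same many-chain connection that a compression codec has to deal with.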
Summary
There are many aspects of DFT logic that are implemented in designs today to successfully test digital ICs. Each of these technologies has different requirements and is optimally implemented at a different location in the IC implementation flow. Constraining them to all be implemented at the RTL level, or all at once, is a sub-optimal decision that can complicate the implementation flow or lead to sub-optimal results for test. In this post, we found that for each technology, there are some dominating drivers that determine the best location for implementation. For the implementation of scan chains, the most important requirement is to work with multi-bit FFs and pre-existing scan paths.
Where a technology is implemented also determines the amount of control one has over the solution. For example, scan chains can be defined explicitly in the RTL, in which case the user has complete control over every aspect of the scan chain. The alternative is to control only the chain lengths and the mixing of clock edges through simple commands given to an automated implementation.
Thank you for the article. Maybe it could also mention specialized DFT features, such as IOBIST, involving embedded test pattern generators and test pattern analyzers for specific circuitry, such as SerDes or MSIP. Since such features are closely related to the functionality of the circuits themselves, I feel that specialized BIST must be specified as part of the microarchitecture and implemented in RTL.