The Single Best DFT Move You Can Make

Hierarchical DFT solves many of the biggest challenges in DFT, pattern generation, and diagnosis for today’s large SoCs.


A proven method to simplify a complex problem is to break it into smaller chunks. For today’s large, complex SoCs, this means using hierarchical methods to design the blocks, then combining the results at the top level. While this sounds obvious, it hasn’t always been practical or technologically feasible to perform some tasks, like DFT, at the block level and translate that work smoothly to the top level. Today, however, hierarchical DFT is a proven and widely adopted methodology. If you were to adopt just one thing to significantly improve DFT, pattern generation, and failure diagnosis for complex SoCs, it should be hierarchical test.

Hierarchical DFT extends the divide-and-conquer approach often used in the front-end and physical design steps for large SoC devices. It implements all DFT (BIST and scan for both logic and memories) at the core level, and also performs pattern generation and verification of test patterns for a core or group of cores (Fig. 1). Once the core-level work is done, the test patterns do not have to be regenerated at the top level; they are automatically retargeted. For top-level test, the interconnects between cores and the chip-level glue logic are tested separately, and the coverage for all test modes is combined into a single comprehensive coverage report.
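The combined coverage report is, at its heart, simple arithmetic over the per-mode fault statistics. The sketch below shows the idea; the mode names and fault counts are hypothetical, and this is a toy illustration rather than the reporting logic of any actual DFT tool.

```python
# Toy illustration of rolling per-test-mode fault statistics into one
# combined coverage number. Mode names and counts are hypothetical.

def combined_coverage(modes):
    """modes: list of (name, detected_faults, total_faults) tuples."""
    detected = sum(d for _, d, _ in modes)
    total = sum(t for _, _, t in modes)
    return 100.0 * detected / total

modes = [
    ("core_a_internal", 980_000, 1_000_000),  # made-up numbers
    ("core_b_internal", 490_000,   500_000),
    ("top_external",     95_000,   100_000),
]
print(f"combined coverage: {combined_coverage(modes):.2f}%")
```

Because the per-core patterns are retargeted rather than regenerated, each mode's statistics stay valid at the top level and can be merged this way without re-running ATPG.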

Figure 1. A design with two levels of DFT, a wrapped Arm Cortex-A75 core and top-level logic.

Hierarchical DFT lowers test costs and speeds up DFT implementation through both obvious and not-so-obvious pathways. One is that it helps manage large design sizes. As the netlist grows, runtime for DFT insertion, pattern generation, and diagnosis increases; working on core-level pieces keeps each of these jobs small. As a welcome side effect, hierarchical DFT also reduces the pattern count, which further reduces ATPG and diagnosis runtime.
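The runtime benefit follows from the fact that these jobs scale superlinearly with design size. A back-of-envelope sketch, using an assumed quadratic cost model (not a measured ATPG complexity), shows why splitting one big job into several small ones wins even before any parallelism:

```python
# Back-of-envelope sketch: if runtime grows superlinearly with gate
# count, partitioning wins even when the pieces run serially.
# The quadratic exponent and constant are assumptions for illustration.

def runtime(gates, k=1e-12):
    return k * gates ** 2  # hypothetical superlinear cost model

full_chip = 400e6                    # 400M gates, hypothetical
flat = runtime(full_chip)            # one full-chip run
hier = 4 * runtime(full_chip / 4)    # four equal cores, run one by one
print(f"flat: {flat:.0f}s  hierarchical: {hier:.0f}s  "
      f"speedup: {flat / hier:.1f}x")
```

With a quadratic model, four equal partitions give a 4x serial speedup; running the core-level jobs in parallel on separate machines stacks on top of that.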

Related to long runtimes is the pressure that large designs put on compute resources. Loading a top-level design for DFT, ATPG, pattern verification, or diagnosis can require hundreds of gigabytes of memory, leaving every machine with less memory than that sitting idle. Break up the design hierarchically and the test processes can be efficiently distributed across many more machines.
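Because the core-level jobs are independent of one another, dispatching them is embarrassingly parallel. The sketch below illustrates the pattern with Python's standard `concurrent.futures`; `run_core_job` is a hypothetical stand-in for launching an ATPG or diagnosis run on one core's much smaller netlist, and the core names are made up.

```python
# Illustrative only: hierarchical DFT turns one huge full-chip job into
# independent per-core jobs that can be spread across compute resources.
from concurrent.futures import ThreadPoolExecutor

def run_core_job(core_name):
    # Hypothetical stand-in; a real flow would dispatch a tool run
    # for this core's netlist to a grid node.
    return f"{core_name}: ATPG done"

cores = ["cpu_cluster", "gpu", "modem", "isp"]  # made-up core names
with ThreadPoolExecutor(max_workers=len(cores)) as pool:
    results = list(pool.map(run_core_job, cores))  # order is preserved

for line in results:
    print(line)
```

A thread pool is used here only to keep the sketch self-contained; in production the same fan-out happens across a compute farm, where each smaller job also fits on machines that could never load the flat design.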

Breaking the DFT work down to the cores can reduce both runtime and memory requirements for ATPG and diagnosis by 5x-10x.

Hierarchical DFT enables another benefit to the design flow by “shifting left,” which is to say it gets DFT out of the critical path to tapeout. A flat DFT methodology delays too much DFT work until after the full-chip netlist is verified, placing it squarely in that critical path; any late change to the netlist means restarting the ATPG process. Performing the DFT work at the RTL level on cores as they are completed breaks the dependency on having a frozen full-chip netlist. You can make a core “DFT complete” (all DFT logic, core-level clocking, wrapper chains, and pattern verification in place) as soon as the core is functionally complete. After a core is DFT complete, it is represented by a graybox model to reduce memory footprint. The graybox model can be used in place of the full core netlist in any situation where only the boundary logic is needed, such as scan pattern retargeting and chip-level testing (Fig. 2). With an IJTAG network, the cores become plug-and-play components that can be reused at any higher level of hierarchy. As a bonus, the IJTAG network means test-mode setup no longer has to be done manually.

Figure 2. Use parent and child OCCs in hierarchical DFT.

What else can hierarchical DFT do? It has still more benefits, each deserving its own blog post, including better use of the (usually limited) number of chip pins, since cores can be tested in phases. This capability is highly scalable: use as many test phases as needed. Testing in phases also reduces hot spots and test power.

What’s the catch? Hierarchical test does carry costs: tasks not needed in a flat methodology, such as flow setup, creating the IJTAG network, and adding wrapper chains and on-chip clock controllers (OCCs) to the cores. These start-up costs pay for themselves handily once the flow is in place.

While hierarchical DFT improves many aspects of testing large SoCs, it also has an important downstream effect that DFT engineers might not consider: it significantly improves the throughput of volume scan diagnosis, which leads to better root-cause identification of systematic defects in silicon. Volume scan diagnosis is simulation-based, so design size has a direct impact on runtime and memory demands. With hierarchical DFT, scan diagnosis can be performed at the core level rather than on the full chip, and the smaller diagnosis jobs can be distributed efficiently across more machines. The result: faster diagnosis, better physical failure analysis, and a faster path to yield improvement.

Hierarchical DFT is proven on many designs from top semiconductor companies. In addition, there are reference flows and test cases available that demonstrate hierarchical DFT on Arm cores and RISC-V cores.

By taking a hierarchical, divide-and-conquer approach to test, many of the top semiconductor companies have realized significant cost and time savings in DFT implementation, pattern generation and verification, and failure diagnosis.

For details on the components of a hierarchical DFT methodology, download the whitepaper “Hierarchical DFT: Proven Divide-and-Conquer Solution Accelerates DFT Implementation and Reduces Test Costs.”
