Packetized Scan Test Delivery

Effective and tunable bus-based scan data distribution.


The traditional approach to moving scan test data from chip-level pins to core-level scan channels is under pressure from the dramatic rise in design size and design complexity, and from the need to adapt test content over time.

To address these challenges, we now have the option of implementing a packetized data network for scan test that moves the scan data through the SoC much more efficiently than the traditional pin-multiplexing (mux) approach. This chip-level test delivery bus decouples the DFT requirements of individual cores from the chip-level test delivery resources. The results include shorter test time, lower test cost, reduced DFT implementation effort, and the ability to tune the bandwidth post-silicon to meet changing test needs over time.

How packetized scan test data beats the pin-muxed approach

In the traditional approach to delivering scan test data to cores, each core requires a dedicated connection to chip-level pins, as seen in figure 1. This doesn’t allow for much flexibility, as the dependencies between the cores and the chip-level pins are set once during design. In a bottom-up flow, DFT engineers typically allocate a fixed number of scan channels for each core, usually the same number for each core.


Fig. 1: Traditional scan data delivery.

This is the easiest approach, but it can end up wasting bandwidth because the different cores that are grouped together for testing might have different scan chain lengths and pattern counts. Other problems with this approach include the limited number of chip I/Os available for scan test, limited core-level channels, and the potential for routing congestion.

A pin-muxed scan approach does not allow for adjustments that might be needed over time. Test bandwidth might be perfectly balanced based on the initial set of scan test data. When yield ramps up, it might be determined that some cores need additional patterns, and others can get away with fewer. After such an adjustment, the hard-wired configuration is no longer optimal. A similar situation might also occur if different bandwidth allocation is needed in wafer, package, and in-system test.

In contrast, a packetized scan test delivery network offers a more effective and tunable approach. The underlying idea is to deliver scan test data across a uniform network that is connected to all cores or blocks in a design. The data that is shifted in and out of the chip does not look like conventional scan test data, but is organized in packets that the network understands how to translate into more conventional-looking scan data at each core.

Here’s a simple example using one such packetized network called Streaming Scan Network (SSN), in which two cores are to be tested concurrently (figure 2). Block A has five scan channels and Block B has four, so the packet size in this example is 9 bits. Assume there are 16 pins available for scan test (8 inputs, 8 outputs), so the SSN bus that delivers the scan data is 8 bits wide.


Fig. 2: Testing two blocks at the same time. In a pin-mux scan access method, this would require nine chip-level scan input pins and nine scan output pins. With SSN the packet size is 9 bits, which is delivered on an 8-bit bus.

On the left side of figure 2, you can see how the data is streamed to the cores across the SSN bus. Performing one shift cycle in both cores takes two bus cycles. The bit location of the data corresponding to each core rotates with each packet, but the host nodes know which data should go where and when to pulse the core shift clock.
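The rotation is a direct consequence of streaming 9-bit packets over an 8-bit bus: each new packet starts one bit position later than the previous one. This minimal Python sketch, using the figure-2 numbers, computes where each packet begins in the bus stream:

```python
BUS_WIDTH = 8     # bits delivered per bus cycle (8 scan input pins)
PACKET_SIZE = 9   # Block A (5 channels) + Block B (4 channels)

def packet_layout(packet_index):
    """Return (bus_cycle, bit_offset) where this packet begins in the stream."""
    start_bit = packet_index * PACKET_SIZE
    return start_bit // BUS_WIDTH, start_bit % BUS_WIDTH

for k in range(4):
    cycle, bit = packet_layout(k)
    print(f"packet {k}: begins in bus cycle {cycle}, at bit offset {bit}")
# packet 0 starts at bit 0, packet 1 at bit 1, packet 2 at bit 2, and so on.
```

The one-bit drift per packet is exactly what the host nodes are configured to track so they can pick their core's bits out of each packet.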

In this approach, core-level host nodes generate the DFT signals locally. The host nodes ensure that the right data is picked up from the bus and sent to scan inputs of the core and that the output data is placed back onto the bus. Each node knows what to do and when to do it based on a simple configuration step leveraging IJTAG (IEEE 1687) infrastructure.

A smooth ride on the Streaming Scan Network

One bus goes everywhere, with no dependency on core-level DFT resources and no up-front decision about which cores can be tested at the same time. Think of it as a bus traveling in a dedicated lane on a superhighway; in this metaphor, the dedicated lane is the SSN.

The SSN establishes a uniform way to send data to a core no matter how deeply it is nested in the design hierarchy. With SSN, the number of scan channels per core is independent of all the other test resources: the width of the SSN bus, the number of chip-level scan channels, and the number of cores in the design. SSN reduces DFT effort and test time, and its post-silicon bandwidth tuning makes effective data optimization possible.

Putting your scan test data on an express bus simplifies planning and implementation. Core grouping can be defined later in the flow, during pattern retargeting rather than during the initial design, and even changed post-silicon to meet changing test needs. The SSN architecture is flexible: the bus width is determined by the number of scan pins available. Because it eliminates top-level test-mode muxing, SSN also eases routing congestion and timing closure, which makes it ideal for abutted tile-based designs.

Deciding which cores to test concurrently and which will be tested sequentially is configurable, not hardwired. The configuration is done as a setup step once per pattern set, and once it’s done, all the data on the SSN bus is payload.

SSN reduces test time and test data volume

In many test retargeting schemes, the capture cycles of all the affected cores must be aligned. If multiple cores shift concurrently but have different scan chain lengths, the cores with shorter chains must be padded so that capture happens at the same time in every core. With SSN, each core shifts independently, and capture occurs concurrently once all the cores have completed scan load/unload. This reduces overall test time and test data volume.
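A back-of-the-envelope comparison makes the padding cost concrete. The chain lengths and pattern counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Hypothetical cores: name -> (scan chain length in bits, pattern count)
cores = {"A": (120, 500), "B": (200, 500)}

# Aligned capture (pin-mux style): every core pads its load to the longest chain
longest = max(length for length, _ in cores.values())
padded_bits = sum(longest * patterns for _, patterns in cores.values())

# Independent shift (SSN style): each core loads only its own chain length
independent_bits = sum(length * patterns for length, patterns in cores.values())

print(padded_bits, independent_bits)  # 200000 160000
```

With these numbers, aligning capture forces core A to pad 80 bits per pattern, inflating the total scan data by 25% compared to independent shift.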

Independent shift and capture becomes even more valuable when combined with another SSN capability: bandwidth tuning. The problem we’re trying to address is similar to the situation shown in figure 3. Imagine you want to fill multiple buckets of different sizes with water from the same pipe. To do this most effectively, you would allocate less water to the smaller buckets and more to the larger ones by adjusting the valves independently.


Fig. 3: Bandwidth tuning with SSN optimizes scan test data delivery.

We have a similar challenge with scan test data, and a similar solution. Rather than providing as many bits per packet as there are core-level scan channels, SSN can allocate fewer bits to a core that requires less data overall. For a core that has fewer patterns or shorter scan chains, less data is allocated per packet, which better distributes the data across the cores and ultimately reduces test time. Looking back at figure 2, assume Block A requires significantly fewer patterns than Block B. In this case, rather than allocating 5 bits per packet, we could allocate 4, 3, or even 1. This would make each packet smaller, and the overall test time shorter. The best part is that this bandwidth tuning can be done programmatically: if the test data content changes, whether during yield ramp or because different test insertions have different needs, you can always reach an optimal configuration.
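The effect can be sketched with hypothetical data volumes. The core needing the most packets sets the streaming time, so shrinking the per-packet allocation of a lightly loaded core shrinks every packet without making the bottleneck core any slower:

```python
BUS_WIDTH = 8  # bits delivered per bus cycle

def bus_cycles(core_bits, alloc):
    """Estimate bus cycles to stream all cores' data.

    core_bits: name -> total scan bits that core must receive
    alloc:     name -> bits allocated to that core in each packet
    """
    packet_size = sum(alloc.values())
    # The core needing the most packets determines the total packet count
    packets = max(-(-bits // alloc[name]) for name, bits in core_bits.items())
    return -(-packets * packet_size // BUS_WIDTH)  # ceiling division

# Hypothetical volumes: Block A needs far less data than Block B
core_bits = {"A": 10_000, "B": 50_000}

equal = bus_cycles(core_bits, {"A": 5, "B": 4})  # figure-2 allocation
tuned = bus_cycles(core_bits, {"A": 1, "B": 4})  # shrink A's share

print(equal, tuned)  # 14063 7813
```

Here Block B needs 12,500 packets either way, but cutting Block A's allocation from 5 bits to 1 shrinks the packet from 9 bits to 5, nearly halving the bus cycles.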

Summary

The SSN approach is based on the principle of decoupling core-level test requirements from chip-level test resources by using a high-speed synchronous bus to deliver packetized scan test data to the cores. It was developed in collaboration with several leading semiconductor companies to bring scan test data delivery into the fast lane.

SSN has been real-world tested. Intel reported results of using SSN at the International Test Conference 2020. Comparing the SSN method to their traditional pin-mux approach, they reported a 43% reduction in test data volume and a 43% reduction in test cycles. Implementation and test retargeting tasks were 10x to 20x faster.
