Streaming fabric helps deliver test data at higher speeds using only a few HSIOs.
Semiconductor chips have been evolving to meet the demands of rapidly transforming applications, and test technology has evolved with them to meet the test goals of those chips. Two decades or so ago, applications were limited and designs were simpler, so concerns about power, performance and area (PPA), turn-around time, reuse, and time-to-market were important but not as critical as they are in today’s fiercely competitive environment. Structural test with scan chains was sufficient to meet test quality and cost targets and had minimal impact on the design. The expansion of the application space drove requirements for increased chip performance and function, making designs larger and more complex. As the semiconductor market grew and competition intensified, new techniques like power-gating, multi-core design, and system-on-chip (SoC) design were introduced to squeeze out every bit of performance, optimize power, and meet aggressive time-to-market goals. Similarly, to contain test costs, test technology advanced with test compression codecs that dramatically reduced test time and data volume. The trend of integrating more and larger cores onto SoCs continued, resulting in more test logic and greater test architecture complexity. Physically aware DFT became standard practice to alleviate the PPA impact of test structures, and a hierarchical methodology with phased testing using static test pin-muxing became the go-to test strategy.
Now, semiconductor designs are going through another inflection point. Applications such as AI and autonomous driving are pushing performance demands further and require design methodologies like 3D-IC, chiplet-based designs, massively parallel designs with thousands of replicated cores, and large tile-based architectures. These next-generation designs require test technology innovation yet again, and Synopsys is introducing streaming fabric and sequential compression technologies to address four key test requirements: short DFT turnaround time, minimized test cost, scalability of the test solution, and high-bandwidth test with test reuse through the silicon lifecycle.
Although existing test compression, static test pin-muxing, and current streaming methods have provided satisfactory test results for many designs to date, they face major challenges with the test needs of upcoming designs. For short turn-around time, a test solution should simplify DFT planning and implementation. Static pin-muxing often requires chip designers to go through a time-consuming, iterative process to estimate codec input and output pins, distribute top-level pins to cores, and define core test groups, all during design development and without complete knowledge of pattern count, power, and test time. Even with significant effort, this method often leads to a fixed, inefficient DFT architecture, which makes test power hard to manage and results in sub-optimal test time because it cannot keep the test pins fully used throughout test. The fixed codec assignment also requires redesign when a core is reused, further slowing turn-around time. Current streaming solutions address several of these issues but still require the tedious process of determining an effective codec configuration to reduce test volume and maintain streaming efficiency, which adds development time, yields sub-optimal test data, or both. What is needed is advanced compression technology that can be implemented quickly, provides fast pattern generation, and minimizes test data volume and test cycles while maintaining test quality.
In terms of scalability, the test solution’s physical design should scale easily as advanced design scaling and integration techniques are adopted, without compromising test cost or development schedule. The pin-muxing technique results in long data paths to and from codecs that converge at the chip level, negatively impacting routing and congestion. The impact is worse in tile-based designs with abutment, because with this architecture such designs typically need custom logic and additional routing in each core. This poses a big challenge to chip designers when extending a design from hundreds to thousands of cores.
Finally, as test expands into silicon lifecycle management (SLM) to meet device reliability goals, high-bandwidth test over high-speed functional I/Os (HSIOs), especially PCIe and USB, has emerged as an answer to two needs: recovering the test bandwidth lost as the number of scan GPIOs decreases, and streamlining test from manufacturing through system-level test (SLT) to in-field test. This is achieved through high-speed test and reuse of test patterns over the same HSIOs in all test phases. The test solution must be designed to leverage this capability and enhance test through the silicon lifecycle. While a pin-muxing architecture can be driven by HSIOs, its operating speed is limited by its complex data path and timing constraints, so it cannot take full advantage of the available test bandwidth to reduce test time. Existing streaming solutions either have limited support for using functional HSIOs for test or can deploy this method only for manufacturing test using non-functional HSIOs.
The streaming fabric feature of Synopsys TestMAX DFT, combined with sequential compression, is a programmable, scalable, high-speed test fabric with an advanced compression engine that addresses the test time and DFT challenges of static pin-muxing architectures and of current test codecs and streaming techniques. It also dramatically reduces the test cost and effort of silicon lifecycle test with complete support for high-bandwidth test over HSIO.
Fig. 1: Synopsys streaming fabric with sequential compression.
The sequential compression uses seed-based input and multiple-input signature register (MISR) based compaction with a one-bit output and on-chip comparison. This provides straightforward codec design, fast pattern generation, and high test volume compaction, which reduces both test time and development time. As shown in figure 1, the streaming fabric has a uniform, bidirectional test bus that goes through each core and interfaces with the sequential compression codecs through programmable logic blocks called sockets, which are set up through IEEE 1687. The sockets standardize the core-to-test-bus interface for all cores in the design, which lets the designer architect DFT swiftly and avoid iterations and difficult design decisions during development. The sockets can be programmed after DFT and design completion based on the codecs, the core groupings, and their test time and power requirements, making core-level DFT implementation independent of chip-level resources. This also allows a core to be reused easily in a new design containing the streaming fabric by plugging it in and programming the core’s socket, without any top-level or core-level changes. This configurability of the streaming fabric logic greatly simplifies DFT implementation and accelerates turnaround time.
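The general idea of seed-based input with MISR compaction and on-chip comparison can be illustrated with a toy model. This is only a sketch of the textbook technique, not the Synopsys codec: the register width, the feedback taps, the `toy_core` stand-in for scan capture, and all function names are assumptions made for illustration.

```python
WIDTH = 8                 # register width in bits (illustrative)
TAPS = (0, 2, 3, 4)       # hypothetical feedback taps; tap 0 keeps the update invertible
MASK = (1 << WIDTH) - 1


def step(state):
    """Advance a WIDTH-bit linear feedback register by one shift."""
    feedback = 0
    for t in TAPS:
        feedback ^= (state >> t) & 1
    return ((state >> 1) | (feedback << (WIDTH - 1))) & MASK


def expand_seed(seed, n_bits):
    """Decompress: turn one seed into n_bits of scan stimulus."""
    state, bits = seed & MASK, []
    for _ in range(n_bits):
        bits.append(state & 1)
        state = step(state)
    return bits


def toy_core(stimulus, fault=None):
    """Stand-in for scan capture: pack bits into words and invert them.

    fault=(word_index, bit) flips one response bit to mimic a defect.
    """
    words = []
    for i in range(0, len(stimulus), WIDTH):
        word = 0
        for b in stimulus[i:i + WIDTH]:
            word = (word << 1) | b
        words.append(word ^ MASK)
    if fault is not None:
        index, bit = fault
        words[index] ^= 1 << bit
    return words


def misr_signature(words):
    """Compact all response words into one WIDTH-bit signature."""
    state = 0
    for word in words:
        state = step(state) ^ word
    return state


def run_pattern(seed, expected, n_bits=64, fault=None):
    """Return the single pass/fail bit produced by on-chip comparison."""
    return misr_signature(toy_core(expand_seed(seed, n_bits), fault)) == expected


SEED = 0b10110001                                       # arbitrary example seed
GOLDEN = misr_signature(toy_core(expand_seed(SEED, 64)))  # fault-free signature
```

In this toy setup the tester would supply an 8-bit seed instead of 64 stimulus bits and observe a single pass/fail bit instead of 64 response bits. `run_pattern(SEED, GOLDEN)` passes, while `run_pattern(SEED, GOLDEN, fault=(3, 5))` fails because the flipped response bit changes the signature. Multiple errors can in principle cancel out in a signature register, which is why practical compactors use much wider registers than this example.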
Fig. 2: Test bandwidth distribution comparison between static pin-muxing and the streaming fabric.
The streaming fabric further reduces test time by efficiently delivering the highly compacted test data to the cores. It automatically determines the test data bandwidth requirement of each core based on its test data and configures the sockets to distribute test-bus bandwidth to the codecs as optimally as possible, so that test pin utilization is maximized and the overall test time of the SoC is minimized, as depicted in figure 2 above.
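The effect of distributing bandwidth by test data volume can be seen in a simplified back-of-the-envelope model. The sketch below assumes every core is tested concurrently, ignores socket overhead and power limits, and uses a proportional split as a stand-in for whatever allocation the tool actually computes; the data volumes, pin counts, and function names are hypothetical.

```python
from math import ceil


def static_pinmux_cycles(data_bits, pins_per_core):
    """Fixed pin split: the slowest core sets the test time of the group."""
    return max(ceil(d / p) for d, p in zip(data_bits, pins_per_core))


def proportional_shares(data_bits, bus_width):
    """Give each core a share of the bus proportional to its test data."""
    total = sum(data_bits)
    return [bus_width * d / total for d in data_bits]


def streaming_cycles(data_bits, bus_width):
    """Ideal case: the bus stays full until all test data is delivered."""
    return ceil(sum(data_bits) / bus_width)


core_data = [1200, 400, 800]   # test data per core, in bits (made-up values)
bus_width = 12                 # top-level test pins or bus bits per cycle

static = static_pinmux_cycles(core_data, [4, 4, 4])   # equal fixed split
shares = proportional_shares(core_data, bus_width)
streamed = streaming_cycles(core_data, bus_width)
utilization = sum(core_data) / (static * bus_width)   # pin usage with the fixed split

print(static, streamed, shares, round(utilization, 2))
```

With these numbers the fixed split takes 300 cycles because the largest core finishes last while the pins of the other two cores sit idle, whereas a fully used bus delivers the same data in 200 cycles.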
Another level of test time reduction comes from frequency scaling of the streaming fabric. GPIOs can typically operate at higher speeds than the chip’s scan network, and the architecture of the streaming fabric also allows test data to flow at a much higher speed than the codecs and scan network in the cores. With the sockets’ bandwidth-matching capability, a faster, narrower streaming fabric driven by a few top-level pins can drive multiple slower, wider codecs in parallel, further reducing test time. For many designs, however, the test bus of a streaming solution could operate even faster but is limited by GPIO speed, which leaves test-bus bandwidth underutilized. Current streaming techniques propose using many GPIOs to fill the remaining bandwidth by translating many slower GPIOs into a narrow, faster test bus through custom logic. This approach is not feasible for advanced designs, which have fewer GPIOs and more HSIOs on the chip because of their need for large off-chip data access.
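The bandwidth-matching argument reduces to simple arithmetic: the product of width and frequency on the bus side has to cover the sum of the same product over the codecs it feeds. The sketch below works through that arithmetic with hypothetical widths and clock rates; the function names are illustrative and do not correspond to any tool command.

```python
def codecs_fed_in_parallel(bus_bits, bus_mhz, codec_bits, codec_mhz):
    """How many identical codecs a bus can keep busy at the same time."""
    bus_bandwidth = bus_bits * bus_mhz          # Mbit/s on the test bus
    codec_bandwidth = codec_bits * codec_mhz    # Mbit/s consumed by one codec
    return bus_bandwidth // codec_bandwidth


def bus_utilization(io_mhz, bus_max_mhz):
    """Fraction of the bus's potential speed that the driving I/Os can supply."""
    return min(io_mhz, bus_max_mhz) / bus_max_mhz


# Hypothetical numbers: an 8-bit bus at 400 MHz feeding 16-bit codecs at 50 MHz.
parallel_codecs = codecs_fed_in_parallel(8, 400, 16, 50)

# Hypothetical numbers: a bus that could run at 800 MHz, driven by 200 MHz GPIOs.
gpio_limited = bus_utilization(200, 800)

print(parallel_codecs, gpio_limited)
```

Here the narrow bus carries 3200 Mbit/s and each codec consumes 800 Mbit/s, so four codecs can run in parallel. In the second case the bus runs at only a quarter of its potential speed, which is the underutilization that faster I/Os are meant to remove.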
Synopsys’ streaming fabric integrates seamlessly with the high-bandwidth HSIO-to-Scan/TAP test solution from Synopsys (shown in figure 3), which can deliver test data at significantly higher speeds using only a few HSIOs to a much wider streaming fabric test bus, reducing test time dramatically compared with testing through a reduced number of GPIOs. Another advantage of test over HSIO is that it removes the need to develop and maintain separate pattern sets for SLT and in-field test by reusing the manufacturing test patterns, providing a complete testing solution throughout the silicon lifecycle and accelerating time-to-market.
Fig. 3: High bandwidth test with functional high speed IO (HSIO). Few HSIO driving wide streaming fabric test-bus.
The regular and uniform architecture of the streaming fabric allows a physical-design-friendly and scalable implementation for all designs, including 3D-IC, chiplet-based designs, massively parallel designs with thousands of replicated cores, and large abutted-tile designs. The standard interface at the core boundary and the pipelined test bus allow the fabric to go from one core to the next and eventually to the top-level pins, which eases physical integration and timing closure for both abutted and non-abutted designs. The streaming fabric has a unique capability to deliver test data simultaneously on multiple hierarchical sub-branches originating from the main test bus, and these sub-branches can operate at different speeds. Further, designers can implement sub-branches of varying widths depending on each core’s location in the layout to balance physical design against test time reduction. While the streaming fabric can broadcast the same test data to any number of identical cores on the chip to reduce test time substantially, the multi-branching architecture also gives designers the flexibility to broadcast data to identical cores on a single branch with a smaller partition, or on multiple branches serving multiple design partitions simultaneously, to optimize the design’s PPA. As 3D-IC and chiplet-based designs are extensions of monolithic designs, the article “A Practical Approach To DFT For Large SoCs And AI Architectures, Part II” goes into the details of how the streaming fabric extends to provide a test data delivery mechanism for those designs as well.
Modern applications are driving a paradigm shift in design scaling and integration methodologies, and advanced test technology is required to meet the key requirements of these designs: short DFT turnaround time, minimized test cost, high test solution scalability, high-bandwidth test, and test reuse through the silicon lifecycle. The streaming fabric with sequential compression and the high-bandwidth HSIO-to-Scan/TAP test technology from Synopsys reduce test cost and turn-around time for next-generation devices. They also provide a flexible, scalable fabric architecture for optimizing a design’s PPA with DFT, and a complete solution for the entire silicon lifecycle.