Spreading Out The Cost At 3nm

Why advanced nodes make sense for more companies.


The current model for semiconductor scaling doesn’t add up. While it’s possible that markets will consolidate around a few basic designs, the likelihood is that no single SoC will sell in enough volume to compensate for the increased cost of design, equipment, mask sets, and significantly more testing and inspection. In fact, even with a slew of derivative chips, it may not be enough to tip the economic scale.
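
To make the volume argument concrete, here is a minimal back-of-the-envelope sketch of how fixed costs get amortized. All of the dollar figures and volumes below are assumed, purely illustrative numbers, not actual 3nm costs; the point is only that non-recurring engineering (NRE) dominates at low volume, and that a reusable die spread across many packages effectively raises its volume.

```python
# Back-of-the-envelope NRE amortization (all numbers are illustrative assumptions).

def cost_per_unit(nre_dollars: float, unit_cost: float, volume: int) -> float:
    """Total cost per part: amortized NRE plus recurring unit cost."""
    return nre_dollars / volume + unit_cost

# Hypothetical figures for a monolithic leading-edge SoC vs. a reusable logic die.
soc_nre = 500e6      # assumed design + mask-set + qualification NRE for a 3nm SoC
chiplet_nre = 150e6  # assumed NRE for a smaller, reusable 3nm logic die

for volume in (1_000_000, 10_000_000, 100_000_000):
    soc = cost_per_unit(soc_nre, unit_cost=40.0, volume=volume)
    chiplet = cost_per_unit(chiplet_nre, unit_cost=15.0, volume=volume)
    print(f"{volume:>11,} units: SoC ${soc:,.2f}/unit, reusable die ${chiplet:,.2f}/unit")
```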

The cost per transistor has been increasing since finFETs were introduced, with little to show for it. With power/performance improvements running in the range of 15% to 20% per node (and often less), and the cost per transistor rising at each new node, there has to be a very good reason for putting everything onto a single die. This is why virtually all of the major foundries and EDA companies are backing advanced packaging as the next major step forward.

While there will still be a need for 3nm and 2nm logic, it likely will be integrated with other die or chiplets developed at different nodes. Presumably that will make 3nm and 2nm much more affordable for more vendors. Rather than buying an entire chip with integrated power and thermal management, the logic die can be developed at whatever density is required. And because these can be highly regular, repeatable structures, the volume can be spread out across multiple package designs rather than just the one or two chips that end up in smartphones or servers.

Intel, AMD, and Marvell already are taking this approach. They customize designs for their customers based on a number of pre-tested, pre-developed components, hooking them together with one or more interconnect strategies. All of the big foundries and packaging houses are following suit, although they plan to use chiplets developed by multiple vendors rather than a proprietary selection.

This is long overdue for a number of reasons. First, as everyone knows, analog doesn’t scale. Second, adding another dimension to a device’s floorplan can shorten the distance between two points. Third, it’s easier to cool most of these multi-chiplet/multi-die packaging approaches, which in turn can improve a chip’s reliability and operating lifetime.

The fourth, and less obvious, reason is that the dynamics of chip development have shifted. Rather than integrating multiple functions onto a single die, many new chips require redundant arrays of the same function. This is particularly true for AI/ML/DL applications, where the more MACs, the better. Moving other functions, such as I/O and analog, off the logic die is a welcome change because these designs have simply run out of room. Many are already larger than a single reticle, requiring them to be stitched together.

Not every device needs this level of compute power, and not every device needs a full reticle’s worth (or more) of multiply-accumulate units. But more and more devices do need some level of intelligence, as the economics of the edge open up vast opportunities for localized computing, and they need it at low power and in an easily customizable format. As that happens, being able to add pre-developed 3nm or 2nm logic die with standardized interfaces into packages will go a long way toward helping to pay for the very expensive equipment and testing that even the largest vendors can no longer afford by themselves.



5 comments

larry evans says:

All of these articles about using chiplets assume that you can test to the level of a packaged device?

Erik Jan Marinissen says:

You wrote: “… it’s easier to cool most of these multi-chiplet/multi-die packaging approaches…”.
I am not convinced that the thermal aspects of multi-die or multi-chiplet stacks are always better than with a conventional single (2D) die. Especially if the multi-die/-chiplet architecture has multiple dies on top of each other (in a “2.5D” or 3D set-up), the multiple active layers might heat each other up, and that can actually be a major challenge in the development of such multi-die stacks. For example, the ‘Wyoming’ test chip by ST-Ericsson, consisting of a Wide-IO DRAM on top of a logic processor, had major thermal problems, as presented by Stephane Lecomte in a keynote address at the ‘IEEE Intnl. Workshop on Testing Three-Dimensional Stacked ICs’ (3D-TEST) in Anaheim, CA in November 2012 (see http://www.pld.ttu.ee/3dtest/past_events/2012/) and also at CDNLive! EMEA in Munich, Germany.

Ed Sperling says:

Multi-physics floor-planning is essential with all advanced packaging, and this gets more difficult as different technology nodes and use cases are introduced into these packages.

Erik Jan Marinissen says:

Your article about 3nm and 2nm designs seems to take for granted that these advanced technologies will significantly increase the cost of testing. You wrote: “… the increased cost of design, equipment, mask sets and significantly more testing and inspection…” and later “…helping to pay for some of this very expensive equipment and testing that even the largest vendors no longer can afford by themselves.” However, is it really true that the 3nm and 2nm technology nodes will increase the cost of testing significantly?

There are probably not many DfT and test engineers who have actually worked on test generation or test application for chip designs in 3nm or 2nm. That is exactly what makes our paper entitled “Application of Cell-Aware Test on an Advanced 3nm CMOS Technology Library” (see http://doi.org/10.1109/ITC44170.2019.9000164), authored by imec, Cadence, and TU Eindhoven, rather unique. Imec has developed 3nm and 2nm technologies and corresponding standard-cell libraries, for which several test chips have been designed, some of which have also been manufactured in one of imec’s wafer fabs in Leuven, Belgium — and tested, of course!

Cell-aware test is probably the test method currently best able to find the most defects in a digital standard-cell-based design, especially since cell-aware ATPG makes an extra effort to also cover faults caused by cell-internal defects, which are identified based on the cell layout.

In the ITC 2019 paper, cell-aware test was applied to circuit designs based on imec’s 3nm CMOS FinFET technology node, called ‘iN5’. The paper also compared test results obtained on the iN5 library with those on an (older) 45nm library from Cadence named “GPDK045”. To enable an ‘apples-to-apples’ comparison, we identified 49 cells that exist with identical logic function and the same relative drive strength in both libraries. The paper concludes that, in order to run parasitic extraction (which is used to indicate where cell-internal open or short defects are likely to occur), significantly more understanding is required of the architecture of these advanced technology nodes and their cell libraries. However, once PEX runs and all intra-cell defects are classified by analog simulation, it turns out that the 45nm and 3nm library cells are very comparable with respect to the number of potential defect locations, and consequently fault coverage and test pattern count.

So, in conclusion, I do not think that single-digit (sub-10nm) advanced technology nodes by themselves will cause more test data/time. Of course, if semiconductor companies use these scaled-down technologies to cram even more circuitry into a single chip, that might increase the test costs (for digital tests, typically determined by the test time and test data volume). However, the DfT community has also developed methodologies that keep the test time/data under control, despite the fact that chip designs have been growing in sheer size and complexity.

1. Modular (also called ‘core-based’ or ‘hierarchical’) testing typically has quite a large effect on test time and test data volume. In our paper “Test Data Volume Comparison of Monolithic Testing vs. Modular SOC Testing”, published in IEEE Design & Test of Computers, May/June 2009 (see http://doi.org/10.1109/MDT.2009.65), we showed that this reduced the test time and data volume by 88.9% for the AMD Athlon processor with 33 embedded cores.

2. Test data compression, available in different versions from all three major EDA suppliers (Cadence, Mentor, and Synopsys), is also capable of saving another 10x to 100x in test time/data.
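
As a rough back-of-the-envelope sketch of how these two levers could combine (assuming, for illustration only, that the reductions stack multiplicatively, which is an approximation; the baseline data volume below is an arbitrary placeholder, while the 88.9% and 10x–100x figures are the ones cited above):

```python
# Rough combination of the two reductions cited above.
# The baseline volume is an arbitrary placeholder, not a measured number.
baseline_bits = 1.0e12          # placeholder: uncompressed, monolithic test data volume

modular_factor = 1 - 0.889      # 88.9% reduction from modular/core-based testing
for compression in (10, 100):   # 10x to 100x from EDA test data compression
    remaining = baseline_bits * modular_factor / compression
    total_reduction = baseline_bits / remaining
    print(f"{compression}x compression: ~{total_reduction:.0f}x total reduction "
          f"({remaining:.2e} bits remaining)")
```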

So, this should all still be manageable. I think the test community has worked hard to take care that test costs do not need to become the bottleneck for new chip designs in advanced technology nodes…

Ed Sperling says:

Erik, thanks for reading and responding. You raise some interesting points, and this is a complex issue. For one thing, it’s not clear what kinds of structures will be used at 3nm and 2nm. It may depend upon the application. Testing an SoC will be a lot different than testing a redundant processing element. In addition, the benefits of cramming everything onto an SoC have been dwindling for some time, which is why the whole industry is supporting advanced packaging today, with all sorts of new options coming in the future. But testing a chiplet is still going to be faster than testing a whole chip because there are fewer structures to test and the chiplets are smaller. Spot checking based on statistics will certainly help, but achieving full coverage at 2nm (which will be necessary in mission- and safety-critical applications) implies that we understand where the latent defects will be. That requires more inspection, measurement and testing than has been done in the past, and it adds to the total time and cost. In addition, test will need to be done throughout a chip’s lifetime, from the lab all the way through to post-manufacturing. That will be much more manageable in a chiplet form than in a massive SoC, and it will be easier to fix in a derivative chip using modular components.
