A New Dimension Of Complexity For IC Design

Static timing analysis becomes more difficult in 3D.

Full 3D designs involving logic-on-logic are still in the tire-kicking stage, but gaps in the tooling already are showing up.

This is especially evident with static timing analysis (STA), which is used to validate a design’s timing performance by checking all possible paths for timing violations. STA issues began popping up particularly with the introduction of hybrid bonding, a bumpless packaging approach in which chips or chiplets are connected face-to-face or face-to-back.

“When you build a chip or a chiplet, you close timing within it and then everything connects to an I/O pad through a bump,” said John Park, product management group director in the Custom IC & PCB Group at Cadence. “If that’s going to work off a common interface like AIB or HBI, or XSR, there will be a spec, and that’s a different type. There’s no real timing in that sort of use model. But when you start stacking logic on logic, where you’re not separated by a pad ring, or an I/O buffer ring, you still have to do flop-to-flop timing because you don’t have that intermediate layer of connecting to I/O buffers that work off a common interface bus.”

For true 3D die-to-die integration, STA is required for standard timing signoff just as it is for other designs.

“Two die connected through stacking is equivalent to two floorplan blocks connected within a single monolithic ASIC, and STA is a proven solution,” said Anthony Mastroianni, advanced IC packaging solutions architect and director at Siemens EDA. “For 2.5D or high-density fan-out, signal integrity analysis will be the primary mechanism for any high-speed die-to-die or chiplet-to-chiplet interfaces. For slower speed control or DFT die-to-die interfaces, STA is the preferred mechanism. It cannot accurately model very high-speed timing for signal speeds of more than 2 Gbps because STA does not model inductance. Instead, detailed RLC extraction and SPICE analysis is needed.”

“When dies are stacked, you need to perform static timing analysis of the system to make sure the critical signals that go in between die meet the timing,” said Kenneth Larsen, product marketing director for fusion compiler and 3D-IC compiler at Synopsys. “With some of the connectivity that we are building, there are some new resistance models — for example, long vias — that you want to take into account for timing purposes.”

This is new territory for EDA tools. With 2.5D and fan-out, the current state-of-the-art in die-to-die high-performance interfaces is primarily limited to HBM memories, where timing interfaces are very loose on control signals, and most customers today are not using STA, said Kevin Rinebold, advanced IC packaging technology specialist at Siemens EDA.

So far, there aren’t many logic-on-logic applications, and utilization of bumpless packaging has been limited. But as the benefits from feature shrinks dwindle with each new process node, chipmakers are looking at alternatives to boost performance and reduce power. For many applications, 2.5D and fan-out will be more than sufficient. Beyond that, 3D-IC architectures are needed, and testing is already underway.

“There are plenty of chiplet-based stacks, but that’s a different animal,” said Park. “That’s using a passive interconnect structure, and it works off of common interfaces. There, the specs do address passive timing, which means signal lengths must be matched. But now, in this packaging or board-level application, you’re just working with a passive interconnect. This means the trace that connects the two devices together could be lengthened to meet a timing rule, but you can’t insert a buffer like you would on a chip. So if I’m designing a chip, and I have to figure out timing problems, I can find the device layer, and then find a stronger buffer or a delay to meet timing, because you have the active front-end of line processing all the device layers.”

The rollout of a commercial chiplet ecosystem will only heighten the need for STA to handle very high-speed die-to-die interfaces.

“The most common 3D integration today is where you’re literally in the chip world and you’re integrating now in the Z direction,” said Rinebold. “The most common application today is taking the memory off a big chip that’s too far away and putting it directly above the processing logic on the chip in a third dimension.”

This is being done today with SRAM, and L1, L2 and L3 cache. But Park expects to see more logic-on-logic types of designs in the future. “That’s where being able to close timing across all these different corners, and having a timing engine that’s strong enough to do that, is going to be pretty important.”

How timing is impacted, what’s needed
When die are stacked, a number of issues can creep into the manufacturing process.

“Chips don’t work perfectly,” said Park. “There are always variances in process, temperature, etc., such that when you close timing, you have to look at the timing corners of process variation, thermal or temperature variation, and power variation. When you close timing on a single chip, it’s hard enough to cover all the corners. But as soon as you start adding multiple chips into the stack, the number of timing corners starts to grow pretty quickly to a point where you can’t use traditional static timing methods because the number of corners just grows too big and the problem becomes too hard to solve.”

This applies to digital chips. “When you’re designing a chip, you have to verify the flip flop-to-flip flop timing, and register-to-register timing. You have the ability to work with device layers to strengthen buffers, add slow-down signals, manipulate signals without shortening or lengthening the wire. Timing now goes not just horizontally. When you go to these 3D stacks, you’re timing flop-to-flop from the bottom logic macro to the top, just like you would do macro-to-macro timing on a 2D chip. You’re doing macro-to-macro timing now in the Z direction.”
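The flop-to-flop check across a die boundary can be illustrated with a simple setup-slack calculation. This is a minimal sketch, not any vendor's timing engine. The `CrossDiePath` fields, the `setup_slack_ps` helper, and all delay values are hypothetical and chosen only to show that the vertical crossing is one more delay term in an otherwise ordinary register-to-register path.

```python
from dataclasses import dataclass


@dataclass
class CrossDiePath:
    """Hypothetical delays (in picoseconds) along one bottom-to-top path."""
    clk_to_q_ps: float          # launch flop clock-to-Q delay on the bottom die
    bottom_logic_ps: float      # combinational delay on the bottom die
    bond_ps: float              # delay through the vertical die-to-die crossing
    top_logic_ps: float         # combinational delay on the top die
    setup_ps: float             # setup time of the capture flop on the top die
    clock_skew_ps: float = 0.0  # capture clock arrival minus launch clock arrival


def setup_slack_ps(path: CrossDiePath, clock_period_ps: float) -> float:
    """Return setup slack; a negative value is a setup violation."""
    arrival = (path.clk_to_q_ps + path.bottom_logic_ps
               + path.bond_ps + path.top_logic_ps)
    required = clock_period_ps + path.clock_skew_ps - path.setup_ps
    return required - arrival


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    path = CrossDiePath(clk_to_q_ps=80.0, bottom_logic_ps=300.0,
                        bond_ps=15.0, top_logic_ps=350.0, setup_ps=40.0)
    for period in (1000.0, 750.0):
        print(f"period {period} ps -> slack {setup_slack_ps(path, period)} ps")
```

With these assumed values, the path meets timing at a 1,000 ps clock period (215 ps of slack) and fails at 750 ps (-35 ps). A real signoff flow would repeat the check for every path and every corner.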

This explosion in timing corners is the result of adding dies, and the more dies that are added, the more complex the STA. This is more challenging with heterogeneous designs where various components are developed at different process nodes. Each of those dies has its own process, voltage, and temperature variations, so every added die multiplies the number of corner combinations that must be checked. With enough dies in the stack, the number of corners becomes unmanageable.
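The growth in corners can be sketched by enumerating combinations. The example below assumes each die has its own independent set of process, voltage, and temperature corners, so the stack-level combinations form a Cartesian product. The corner names and the count of 18 corners per die are illustrative assumptions, not figures from any foundry or tool.

```python
from itertools import product

# Hypothetical per-die corner axes; the labels are illustrative only.
PROCESS = ["ss", "tt", "ff"]
VOLTAGE = ["vmin", "vnom", "vmax"]
TEMPERATURE = ["cold", "hot"]


def die_corners():
    """All PVT corners for a single die (3 * 3 * 2 = 18 with these axes)."""
    return list(product(PROCESS, VOLTAGE, TEMPERATURE))


def stack_corners(num_dies):
    """Lazily yield every cross-die corner combination for a stack.

    Assumes each die varies independently of the others.
    """
    return product(die_corners(), repeat=num_dies)


def count_stack_corners(num_dies):
    """Number of combinations without enumerating them."""
    return len(die_corners()) ** num_dies


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} die(s): {count_stack_corners(n)} corner combinations")
```

Under these assumptions, one die has 18 corners, two dies have 324 combinations, and four dies have 104,976. This is why brute-force enumeration stops being practical and tools look for ways to prune the set.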

Fig. 1: Packaging evolution toward full 3D integration. Source: Cadence

The types of packaging that need timing (Fig. 1 above, far right) are referred to as macro stacking. “It’s hybrid bond, but two active dies are stacked on top of each other versus other types of packaging,” Park explained. “There are chips or chiplets that work off of a communication interface like PCIe, AIB, or others, and that’s a PCB style of timing where you’re wiggling the metal lines around to add length to them because there aren’t active devices that can be inserted into the path to slow down or speed up the signal. For bumpless integration, where there is no I/O buffer there, the connections are in the vertical direction instead of horizontally.”

In this 3D integration scenario, an important differentiator is that interposers and 3D chip stacks are still attached to BGA and LGA packages, which then are mounted to the PCB. These interposers are designed using an IC design tool, and need STA and traditional IC methodologies applied. But once a design leaves the world of 3D integration and moves into packaging, STA is no longer needed.

Another issue with the explosion of corners is that the problem is amplified when you mix different vendors, said Synopsys’ Larsen. “You need to analyze not only timing, but thermal, and EMI. All of the analysis is required, especially for 3D.”

Park agrees. “Thermal becomes a huge issue as soon as you go to this vertically stacking of multiple dies that are non-memory. Memory doesn’t generate a lot of heat, which is why memory and CMOS image sensors have led the way with 3D. But now, everyone wants to go to this world of 3D, starting with SRAM stacking on logic, but eventually logic-on-logic. And thermal is the number one concern of foundries, and of anyone doing this next-generation experimental 3D integration. People want to do things like route resource sharing, where I may have a lot of TSVs but not a lot of routing. Even though I’m connecting two devices, I may have to come up and use routing resources on one of the chips. In true 3D integration, you’re still designing a chip but now it’s like 3D chess. You’re designing a chip, and you used to just have a planar structure. But now you can start to build a skyscraper, and go up through it.”

That requires additional analysis. As Siemens EDA’s Mastroianni pointed out, it’s important to remember that STA is based on RC analysis and does not account for inductance. “At very high speeds and with traces long enough to have appreciable inductance, STA would be a poor choice as it’s no longer accurate. Additionally, STA uses .lib models to model timing based on loading and input data rate. For very high-speed interface such as SerDes, HBM, etc., STA’s .lib models are not accurate enough. Instead, you need SPICE or IBIS (I/O Buffer Information Specification) models.”
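To make the RC-only point concrete, the sketch below computes the Elmore delay of an RC ladder, a textbook first-order estimate of interconnect delay. It is offered as an illustration of an RC-based delay model, not as the method used by any particular STA tool, which typically relies on more refined models. The function name and segment values are assumptions for the example. Note that no inductance term appears anywhere in the calculation.

```python
def elmore_delay(resistances_ohm, capacitances_f):
    """Elmore delay (seconds) to the far end of an RC ladder.

    Segment i is a series resistance R[i] followed by a capacitance C[i]
    to ground. Each capacitor is charged through all of the resistance
    upstream of it, so:

        delay = sum over i of C[i] * (R[0] + R[1] + ... + R[i])
    """
    if len(resistances_ohm) != len(capacitances_f):
        raise ValueError("need one capacitance per resistance segment")
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances_ohm, capacitances_f):
        upstream_r += r          # total resistance between the driver and this node
        delay += upstream_r * c  # this node's contribution to the delay
    return delay


if __name__ == "__main__":
    # Hypothetical wire split into three equal segments (values are illustrative).
    r_segments = [50.0, 50.0, 50.0]      # ohms
    c_segments = [2e-15, 2e-15, 2e-15]   # farads
    print(f"Elmore delay: {elmore_delay(r_segments, c_segments):.3e} s")
```

When a trace is long enough and edges are fast enough for inductance to matter, a model of this form has no way to represent the resulting behavior, which is the gap that RLC extraction and SPICE analysis fill.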

Conclusion
Today’s STA tools account for 3D with updated algorithms that support multiple dies and reduce the number of corners while preserving accuracy. In some commercial tools, the crossings between dies rely on extracted models to capture potential effects, and then static timing is run on those models.

There are some new effects that need to be extracted out with the rollout of hybrid bonding and stacked die, and given the pace of change it’s not surprising that gaps have emerged. Tooling always is playing catch-up to new challenges, and the magnitude and speed of technology change has increased the urgency for new solutions. This, in turn, will drive the next round of innovation in EDA tools, and enable a whole new era of post-Moore’s Law scaling.



