How To Reduce Timing Closure Headaches

Why physically aware NoC IP is critical in complex designs.

As chips have become more complex, timing closure has provided some of the most vexing challenges facing design engineers today. This step requires an increasing amount of time to complete and adds significantly to design costs and back-end schedule risks.

Wire delay dominates transistor switching delay

Building high-performance modern CPUs involves pipelining to achieve high frequencies. Within a CPU, logic gates tend to sit close to one another, so timing is dominated by transistor switching delay, which is much smaller than the propagation delay of the long wires that span an SoC.

SoC on-chip interconnects also require pipelining to close timing. However, designers here face a different challenge: where to pipeline within the NoC network topology?

From a macro standpoint, the number and locations of pipeline stages depend on how the interconnect topology maps to the placement constraints created by the chip floorplan. This floorplan information is the key to implementing spatial analysis because it includes the locations of IP-to-interconnect connections and the obstructions that the interconnect must be routed around. At the micro level, pipeline placement is defined by temporal analysis, which is usually determined by how far data can propagate along long wires within a single clock cycle.
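To make the temporal side concrete, here is a minimal sketch, in Python, of the back-of-the-envelope arithmetic involved. The clock period, per-millimeter wire delay and timing margin are illustrative assumptions, not figures from any particular process or tool; the point is only how single-cycle reach translates into a pipeline stage count.

```python
# Minimal sketch of the temporal analysis: how many pipeline registers does a
# link need so that no wire segment exceeds one clock cycle of propagation?
# All numbers below are illustrative assumptions.

CLOCK_PERIOD_NS = 1.0       # 1 GHz clock (assumed)
WIRE_DELAY_NS_PER_MM = 0.2  # assumed delay on an upper metal layer
TIMING_MARGIN = 0.7         # reserve 30% of the cycle for setup, skew, logic

def stages_needed(link_length_mm: float) -> int:
    """Pipeline registers required so each segment closes timing in one cycle."""
    reach_mm = (CLOCK_PERIOD_NS * TIMING_MARGIN) / WIRE_DELAY_NS_PER_MM
    segments = max(1.0, -(-link_length_mm // reach_mm))  # ceiling division
    return int(segments) - 1  # N segments need N-1 intermediate registers

for length in (2.0, 5.0, 12.0):
    print(f"{length:5.1f} mm link -> {stages_needed(length)} pipeline stage(s)")
```

With these assumed numbers a signal reaches about 3.5 mm per cycle, so a 12 mm link needs three intermediate registers while a 2 mm link needs none.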

New pipelines inserted by FlexNoC Physical accelerate the SoC timing closure process.

Solving this problem at the SoC level is even more challenging because the optimal solution is different for every chip. As a consequence, interconnect providers must not only supply world-class IP; they must also deliver the means, through hardware IP and configuration software, to help users close timing for their own custom interconnect configuration.

Through many years of customer interaction, it became apparent to us as NoC IP providers that SoC design teams need more help achieving timing closure. In response, we developed physically aware NoC IP that automates many steps of the pipelining methodology our users already follow.

Focus on the longest wires
The rationale underpinning the technical solution is simple: Since the NoC IP has the longest wires in the SoC, reducing the time to achieve timing closure requires automation to optimally configure pipelines in multi-cycle links.

The interconnect links IP interfaces through intermediate elements such as switches, routers and buffers. The locations of these elements determine the lengths of the wire links within the interconnect. Because the location of each element and link is fixed during the back-end synthesis, place and route stage, traditional interconnect pipelining is usually implemented late in the design schedule, based on post-route timing. This requires a long iterative loop that consumes days or weeks of computer runtime and must be restarted after any significant floorplan change.
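For the spatial side, here is a rough sketch of how link lengths might be estimated early from floorplan data. The element names, coordinates and single-cycle reach are hypothetical, and Manhattan distance stands in for the rectilinear routes that place-and-route tools actually produce.

```python
# Sketch of the spatial analysis: estimate link lengths between interconnect
# elements from (assumed) floorplan coordinates, using Manhattan distance.
# Coordinates and the 3.5 mm single-cycle reach are illustrative assumptions.

placements = {            # element -> (x_mm, y_mm), hypothetical floorplan
    "cpu_ni":   (1.0, 1.0),
    "switch_0": (4.0, 2.0),
    "ddr_ni":   (9.5, 6.0),
}
links = [("cpu_ni", "switch_0"), ("switch_0", "ddr_ni")]
SINGLE_CYCLE_REACH_MM = 3.5

for src, dst in links:
    (x0, y0), (x1, y1) = placements[src], placements[dst]
    length = abs(x1 - x0) + abs(y1 - y0)   # Manhattan estimate of route length
    flag = "  (multi-cycle: insert pipelines)" if length > SINGLE_CYCLE_REACH_MM else ""
    print(f"{src} -> {dst}: {length:.1f} mm{flag}")
```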

To solve this problem, physically aware NoC IP provides fast, high-level estimation of interconnect element locations and delays during the earlier, front-end design phase of a project, using existing versions of floorplans as the starting point. This enables automatic pipeline insertion and “hinting” (or “forward annotation,” if you like) of NoC IP element locations to back-end place-and-route tools, giving the back-end team an optimized design that is more likely to achieve first-pass place-and-route success.
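As an illustration of what such hinting could look like, the sketch below emits suggested pipeline locations to a plain hints file. The record format and instance names are invented for this example; a real flow would emit whatever constraint or annotation format the back-end place-and-route tool accepts.

```python
# Sketch of "forward annotation": write suggested pipeline-register locations
# to a hints file the back-end team can translate into placement constraints.
# The record format and instance names are invented for illustration only.

def write_placement_hints(pipelines, path="noc_pipeline_hints.txt"):
    """pipelines: list of (instance_name, x_mm, y_mm) suggestions."""
    with open(path, "w") as f:
        for name, x, y in pipelines:
            f.write(f"PIPELINE {name} AT {x:.2f} {y:.2f}\n")

# Suggested stage locations, e.g. spaced evenly along a long multi-cycle link.
write_placement_hints([
    ("u_noc/link_ddr/pipe0", 5.8, 3.3),
    ("u_noc/link_ddr/pipe1", 7.7, 4.7),
])
```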

IP block locations and locations of IP-to-interconnect connections constrain where the interconnect IP can be placed. The white spaces between the IP blocks show the available die area for interconnect placement. Small squares denote IP-to-interconnect connection locations.

It’s important to note that this is not a generic EDA tool solution: this type of optimization can only be accomplished with knowledge of the design team’s intent and of the IP characteristics at the system level. That detailed data is available in the configuration data and IP structure of the NoC interconnect, where it provides the actionable information the NoC needs to configure itself automatically.

Breaking down the wall between front-end and back-end teams
In the recent past, front-end and back-end design teams rarely communicated with each other. The key to a physically aware NoC is to use front-end technology to accelerate back-end timing closure and layout. This requires front- and back-end team members to coordinate early in the project schedule.

The benefit is that physically aware NoC IP can adapt to early-stage and production floorplans and IP placement locations through automated timing closure, reducing overall schedule time.

Designers should evaluate physically aware interconnect IP if they wish to achieve the following:

• Reduction or elimination of multiple iterations between place and route runs;
• Elimination of trial-and-error timing closure through automated pipeline insertion;
• Reduction in over-design and over-engineering;
• Optimized system Quality-of-Results (QoR) – balancing per-link bandwidth and latency requirements.

The timing closure double death spiral
NoC technology has always helped reduce or eliminate wire routing congestion, which helps avoid timing closure issues in the back end. But as chips have become more complex, it became clear to us that more technology had to be introduced to keep pace with SoC scaling.

Here’s how wire routing congestion causes physical implementation problems that lead to failed timing closure: routing congestion makes timing closure, or “convergence,” more difficult because, as the less resistive upper metal layers are quickly consumed, the EDA routing tools attempt to route long, timing-critical paths on more resistive middle and lower metal layers. The decreased cross-sectional area of wires on the middle and lower layers increases resistance, which causes greater wire delays over longer distances.
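A first-order Elmore model makes the effect visible: distributed RC wire delay grows as roughly 0.5 × r × c × L², so it scales linearly with per-unit resistance and with the square of length. The per-millimeter resistance and capacitance values below are illustrative assumptions chosen only to show the trend from thick upper-layer metal to thin lower-layer metal.

```python
# First-order Elmore estimate of distributed RC wire delay: t = 0.5 * r * c * L^2.
# Per-mm resistance and capacitance values are illustrative assumptions.

def wire_delay_ns(r_ohm_per_mm, c_ff_per_mm, length_mm):
    return 0.5 * r_ohm_per_mm * (c_ff_per_mm * 1e-15) * length_mm**2 * 1e9

layers = {
    "upper metal (thick, low R)": (50.0, 200.0),   # ohm/mm, fF/mm (assumed)
    "lower metal (thin, high R)": (400.0, 200.0),
}
for name, (r, c) in layers.items():
    print(f"{name}: {wire_delay_ns(r, c, 5.0):.3f} ns over a 5 mm wire")
```

With these assumed values, the same 5 mm wire is roughly eight times slower on the thin layer, purely from the resistance increase.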

Furthermore, as wire resistance increases, timing closure and timing convergence issues become more likely because of wire delays, voltage droop, degraded clock skew and slew, and delays in global clock distribution.

As portions of the die become congested with many wires, the EDA tools will route wires around these congested areas, increasing wire length and adding signal delay proportional to the increase in length of the wire. These signal delays contribute to the time it takes to achieve timing closure.

Basically, we have an unpredictable, self-reinforcing feedback loop in which timing closure becomes more difficult to achieve as more connections are created between IP blocks: wire routing congestion begets more congestion.

Get physical to close timing predictably
To help alleviate this problem, we provide technology that finds potential timing closure bottlenecks created in the front end and rectifies them before they create issues in the back end. Yes, it takes a little more upfront coordination to get back-end and front-end teams sharing information earlier in the design process. But the huge benefit is the reduced number of lengthy iterations between these teams at the end of the design project.

And who doesn’t want predictable timing closure at the end of their SoC design project?


