Considering Semiconductor Implementation Aspects Early During Network-on-Chip Development

How implementation can fundamentally impact architecture.


As they say, while history may not repeat itself, it sure rhymes.

In 2015, I wrote the blog “Why Implementation Matters To System Design And Software.” At the time, I mused that while abstraction is essential in system design, it has limitations that users must consider. Critical decisions, such as those regarding power and performance, require more accuracy than can be feasibly abstracted. But it takes time to get to this increased accuracy. Power analysis driven by RTL-based emulation provides more accurate power predictions because it captures the implementation effects that only appear when the semiconductor technology is modeled more accurately.

Fast forward eight years, and I am facing a situation that rhymes. This time it is about the topology development of Networks-on-Chips (NoCs) and how semiconductor implementation effects can fundamentally impact architecture.

Consider the high-level flow graph below. Starting from a whiteboard, developers will separate the compute and peripheral building blocks from the interconnect. Which protocols the NoC supports, such as AMBA 5 ACE-Lite, AXI, AHB, APB, OCP, or PIF, often depends on the processors in the system on chip (SoC) to be developed.

Co-optimization of network-on-chip architecture and its layout

For the interconnect implementation, we at Arteris provide the tools and infrastructure for developing and optimizing the NoC topology, i.e., the number of switches, buffers, and safety features, all based on the interface, traffic, clock, and power domain specifications. At the level of “functional performance,” our tools and our partners’ tools for performance exploration allow assessments of memory and other architectural effects.

So we’re done from here, right? Well, that would be nice.

From RTL to implementation

Place & Route (P&R) and the effects of semiconductor-technology-specific implementation want their fair say. Here are three areas in which the floor planning, and sometimes even the actual layouts after P&R, can heavily impact NoC topology development.

Firstly, while the interconnect between the significant building blocks often becomes the long pole in the tent called timing closure, its silicon real estate, more often than not, depends on the primary building blocks in the system. From a NoC-centric view of the world, we call them “blockages.” Designers ask the NoC to use what’s left.

Secondly, these blockages, together with the ports of the compute and peripheral building blocks, determine the layout positions of the critical connections communicating through the NoC. Connecting all of them using the NoC is like solving a puzzle.

Thirdly, signal propagation becomes an issue now that we have the port positions and the area of silicon real estate that is made available for the NoC. Determining signal propagation has become very complicated, especially at smaller geometry nodes, as outlined in the following figure:

The transport delay is a function of many parameters, including the actual foundry, the routing stack used, the type of driving cell, the process used, the voltage, the temperature, and many more. Yes, developers still use rules of thumb that a signal will travel about 1 mm in about 500 ps and place pipeline registers accordingly. But still, it is complicated, and there is no one number for the transport delay. Gone are the days of my first chip design, in which we manually calculated the metal routing layer capacitance and decided to invert the clock tree right before tape-out. The chip worked on the first try. It used a 0.8-micron (ahem, 800 nm) process, and I think we had only one metal layer. In contrast, in today’s smaller geometries, architects will be hard-pressed to determine which of the many metal layers to use for which type of signal.
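As a back-of-the-envelope illustration of the rule of thumb above (roughly 1 mm of travel per 500 ps, not any particular foundry’s numbers), here is a sketch of how the required number of pipeline registers falls out of route length and clock frequency:

```python
import math

def pipeline_stages_needed(route_mm: float, clock_mhz: float,
                           ps_per_mm: float = 500.0) -> int:
    """Registers required so each route segment's transport delay
    fits within one clock cycle. The default of 500 ps/mm is the
    rule-of-thumb value; real numbers depend on foundry, routing
    stack, driving cell, process corner, voltage, and temperature."""
    cycle_ps = 1e6 / clock_mhz        # clock period in picoseconds
    reach_mm = cycle_ps / ps_per_mm   # distance a signal covers per cycle
    return max(0, math.ceil(route_mm / reach_mm) - 1)

# A 7 mm route at 1 GHz: each cycle covers ~2 mm, so the route
# needs ceil(7 / 2) - 1 = 3 pipeline registers.
print(pipeline_stages_needed(7.0, 1000.0))  # → 3
```

The point of the sketch is what it leaves out: in practice, `ps_per_mm` is not one number, which is exactly why this estimate breaks down at smaller geometry nodes.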

The schedule impact – tool results are far from instant

Bottom line – all of this comes down to an actual information dilemma. The IP development tools know the architectural topology issues. The P&R tools know all the implementation effects. But as the right side of the first illustration shows, when delving down the various layers of abstraction, the turn-around times become longer and longer.

The following illustration shows a project schedule of a layout-aware but manual flow.

It can easily take two to five weeks in the front end to optimize the NoC architecturally. The team used an abstract floor plan to co-optimize the NoC topology. They manually developed the constraints to steer the digital implementation flow and P&R. They automatically exported the NoC from our environment to RTL and went through synthesis and P&R. But they could not close timing, which they only found out after layout runs that took five to six days. They returned to manually re-adjust the constraints to update the pipelining, and three times even the topology itself (updates to other blockages can require that, too). In the end, they had spent about ten weeks on the physical closure phase.
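The schedule arithmetic is unforgiving. A small model, with loop counts and rework durations assumed purely for illustration (not measured project data), shows how multi-day P&R runs compound into the roughly ten weeks described above:

```python
# Hypothetical schedule model for the manual, layout-aware flow:
# each failed timing-closure loop costs one multi-day P&R run plus
# manual constraint rework before the next attempt.

def closure_weeks(pr_run_days: float, rework_days: float, loops: int) -> float:
    """Total physical-closure time in weeks for a given loop count."""
    return loops * (pr_run_days + rework_days) / 7.0

# Assuming 5.5-day layout runs, ~3 days of manual rework per loop,
# and eight closure loops, the phase lands near ten weeks.
print(round(closure_weeks(5.5, 3.0, 8), 1))  # → 9.7
```

The model makes the lever obvious: since run time per loop is fixed by the P&R tools, the only way to compress the schedule is to cut the number of loops, which is the argument of the next paragraph.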

By now, it should be clear that it is critical to minimize the number of loops that lead back to the architecture phase, which is only possible when co-optimizing the IP with its layout.

Estimation based on abstraction to the rescue

It is probably no surprise that we have been working on the issues outlined above.

Firstly, teams can consider early-stage and even late-stage floor plan information during IP development. My colleague Andy Nightingale recently illustrated this further in “Why network-on-chip IP in SoC must be physically aware.” As of today, we have updated our IP development tools to read in floor plan information from chip images, Visio files, or LEF/DEF definitions.

Secondly, importing blockages along with the positions of the ports to which the NoC needs to connect allows automating the placement of the main components of the NoC topology. Check. We implemented that. It beats the manual development of constraints in a project’s first phase.
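To make the idea concrete, here is a deliberately naive sketch of blockage-aware placement: put a switch at the centroid of the ports it connects, then nudge it out of any blockage it lands in. The actual automation in our tools is far more sophisticated; the rectangles and coordinates below are invented for illustration only:

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """A blockage rectangle (e.g., a hard macro the NoC must avoid)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 < x < self.x1 and self.y0 < y < self.y1

def place_switch(ports, blockages):
    """ports: [(x, y), ...] endpoints the switch must connect.
    Returns a candidate (x, y) position outside all blockages."""
    x = sum(p[0] for p in ports) / len(ports)
    y = sum(p[1] for p in ports) / len(ports)
    for b in blockages:
        if b.contains(x, y):
            # Nudge to the nearest vertical edge of the blockage.
            x = b.x0 if abs(x - b.x0) < abs(x - b.x1) else b.x1
    return (x, y)

cpu_block = Rect(0, 0, 4, 4)  # a compute block the NoC routes around
# The centroid of these two ports falls inside the block, so the
# switch is pushed to the block's nearest edge.
print(place_switch([(1, 5), (5, 1)], [cpu_block]))
```

Even this toy version shows why imported port positions matter: without them, there is no centroid to start from, and the placement has to be guessed by hand as constraints.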

Thirdly, abstracting technology information, such as gate delay and area as well as wire delays, allows estimating the positioning of NoC components and the insertion of pipelines significantly better and faster than any manual estimate would allow. Does it rival or even attempt aspects of P&R? Absolutely not! We are partnering with all P&R vendors in this domain. These capabilities provide a significantly better starting point for digital implementation flows as they pick up the RTL generated by our IP development tools.

And best of all, as the following figure illustrates, this much better starting point enables much shorter schedules, as we now allow the co-optimization of NoC IP with its layout.

In addition, this physically aware flow further reduces the area and power consumption of the NoC by optimizing wiring and pipeline stages. We are simply avoiding overprovisioning, which is often the result of more manual flows.

The introduction of these capabilities is a significant step forward – check out FlexNoC 5 from Arteris. But please know, we are far from done yet! Looking at the last illustration, one can quickly identify areas for further optimization, like even closer integration with the information we can derive from our partners’ P&R engines. And the early phase of NoC topology development is also ripe for further optimization. Watch this space!



1 comment

Eric Esteve says:

Frank, as far as I remember, 0.8um was a 2-metal-layer technology. The big jump (moving from 1 to 2 metal layers) was for 2um… in 1983 at MHS, and it was far from easy (it took one year to manage proper metal-line crossings).


