Shortening Network-on-Chip Development Schedules Using Physical Awareness

Better prediction of physical implementation effects leads to faster results for the final layout.

Taking physical design into account as early as possible has been a consideration of chip development teams for quite some time. Still, in interactions with customers and partners, 2022 marked a sharp uptick in concerns about whether a design that is functionally correct can also be realized by physical implementation flows. Given their intricacy and complexity, and their dependence on the size and placement of other IP blocks, network-on-chip (NoC) architectures are particularly susceptible to physical effects.

What does it take to design a NoC?

Well, this is January 2023, so let’s ask ChatGPT.

To the prompt “Tell me how to develop an optimized network-on-chip for semiconductor design,” ChatGPT came back with the following:

  1. Define the requirements of the NoC: Before beginning the design process, it is important to clearly define the requirements of the NoC, including the number and types of components that will be connected, the expected data transfer rates, and the power and space constraints of the design.
  2. Choose a NoC architecture: There are several different NoC architectures to choose from, including mesh, ring, and hybrid architectures. Each has its own advantages and disadvantages, so it is important to choose the one that best meets the needs of the design.
  3. Select the routing algorithm: The routing algorithm determines how data is transmitted between components in the NoC. There are many different algorithms to choose from, including deterministic, adaptive, and hybrid algorithms. It is important to select the algorithm that provides the best balance of performance and efficiency for the particular requirements of the NoC.
  4. Design the physical layout of the NoC: The physical layout of the NoC determines the placement of the components and the routing paths between them. It is important to carefully consider the layout to minimize congestion, reduce power consumption, and ensure that the NoC can meet the required data transfer rates.
  5. Perform simulation and verification: Once the NoC has been designed, it is important to simulate and verify its performance to ensure that it meets the requirements and functions as intended. This may involve testing the NoC under various scenarios and conditions to ensure its reliability and robustness.

That left me quite impressed. One could argue that it has issues with parallelism: simulation and verification happen throughout the first four steps, not only at the end. The focus changes from performance aspects early on at the architectural level, to more refined functional verification at the register-transfer level (RTL), to timing at the layout level. Still, ChatGPT got the basics quite right. Cool!
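To make steps 2 and 3 of the list above a bit more concrete, here is a minimal Python sketch of one well-known deterministic routing scheme, dimension-ordered (XY) routing on a 2D mesh. The function name `xy_route` and the coordinate convention are assumptions made for illustration; they do not come from ChatGPT's answer or from any particular NoC tool.

```python
def xy_route(src, dst):
    """Return the (x, y) router coordinates visited from src to dst.

    Dimension-ordered (XY) routing: move along X until the column
    matches, then move along Y. The path is fully determined by the
    source and destination, which is what makes it deterministic.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]

    step_x = 1 if dst_x > x else -1
    while x != dst_x:
        x += step_x
        path.append((x, y))

    step_y = 1 if dst_y > y else -1
    while y != dst_y:
        y += step_y
        path.append((x, y))

    return path


route = xy_route((0, 0), (2, 1))
print(route)           # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(len(route) - 1)  # hop count: 3
```

An adaptive algorithm would instead choose among several candidate next hops based on congestion, trading this predictability for better load balancing.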

So, where is physical awareness needed?

When thinking about semiconductor components and their floorplans, the primary factors determining the size of a chip layout are usually related to the manufacturing process, cost, power consumption, heat dissipation, and chip yield. The NoC can take up a significant portion of the chip layout, particularly for chips with many functional blocks or high communication bandwidth requirements. However, designers can reduce its size by using techniques such as intelligent routing algorithms, optimizing the available resources, and reducing the overall area needed for the interconnect. Many examples of public chips are available on WikiChip, and some identify the portion the NoC occupies. Blocks like CPUs, GPUs, NPUs, and safety and security subsystems determine large parts of the design. Looking at a reference example like the Tesla FSD chip, it becomes clear that even early on, the design teams and architects responsible for the floorplan need to consider carefully where to place ports to optimize the interconnect.

Consider the example sketched out in the figure below.

The illustration shows the requirements from step 1 above as the left input, per specification, and the layout characteristics on the right, per implementation requirements. The result of steps 2 and 3 above is sketched out in the left graph, which shows the functional topology of the NoC overlaid on the floorplan that the design team came up with as an initial try. Note that the NoC topology needs to be simulated first, based on the designer’s initial assumptions about how many switches, routers, etc., are required given purely functional characteristics like the number of initiators and targets to be connected. Later in the flow, the timing based on placement plays a significant role and may lead to a reconsideration of the NoC topology, requiring a return to the drawing board and repeated architectural simulation.
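As a rough illustration of why placement can force a topology rethink, the following Python sketch estimates the wire length between two ports on a floorplan and the number of extra pipeline stages a link would need to meet timing. Everything here is a labeled assumption: the `Port` type, the port coordinates, and especially the `reach_mm_per_cycle` value (how far a signal can travel in one clock cycle) are hypothetical and not taken from any real process or tool.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A block port at a floorplan location (hypothetical coordinates in mm)."""
    name: str
    x_mm: float
    y_mm: float


def manhattan_mm(a: Port, b: Port) -> float:
    """Wire length estimate, assuming horizontal and vertical routing only."""
    return abs(a.x_mm - b.x_mm) + abs(a.y_mm - b.y_mm)


def pipeline_stages(distance_mm: float, reach_mm_per_cycle: float) -> int:
    """Extra register stages so that no wire segment exceeds one cycle's reach."""
    if reach_mm_per_cycle <= 0:
        raise ValueError("reach_mm_per_cycle must be positive")
    segments = max(1, math.ceil(distance_mm / reach_mm_per_cycle))
    return segments - 1


# Assumed value for illustration only, not a real process number.
REACH_MM_PER_CYCLE = 1.5

cpu = Port("cpu", 0.5, 0.5)
ddr = Port("ddr", 2.5, 2.5)
npu = Port("npu", 0.5, 2.0)

for a, b in [(cpu, ddr), (cpu, npu)]:
    d = manhattan_mm(a, b)
    stages = pipeline_stages(d, REACH_MM_PER_CYCLE)
    print(f"{a.name}->{b.name}: {d:.1f} mm, {stages} pipeline stage(s)")
# cpu->ddr: 4.0 mm, 2 pipeline stage(s)
# cpu->npu: 1.5 mm, 0 pipeline stage(s)
```

Each added stage costs a cycle of latency, so a link that looked fine in a purely functional topology may miss its latency budget once the floorplan is known. That is the kind of finding that sends the team back to architectural simulation.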

You get the gist. This loop can happen multiple times, and the run time and availability of physical implementation tools determine each loop’s time and compute effort. The better one can predict the physical implementation effects, the faster one gets results ready for the final layout. Automation is critical here and needs to consider both top-down requirements from the specification and bottom-up needs determined by the shapes and sizes of the blocks that the NoC connects. Then multiply this by the number of NoCs that modern complex chips may need (we have seen tens of NoCs on complex chips), and the potential savings from automation become intuitively clear.

Where is all of this going?

As we enter 2023, physical awareness in NoC design will become even more critical; customer demand already grew significantly in 2022. And because developing even the functional NoC topology is often still a very iterative process, despite significant progress in automation over the last decade, further automation in the form of physically aware NoC topology synthesis is on the horizon. Think of layout tools providing LEF/DEF (.lef/.def) files to NoC automation tools, and the NoC automation tools forwarding optimized NoC layout suggestions back to them. One can imagine a time when AI/ML techniques that today already allow design teams to optimize the physical implementation of large building blocks like CPU and GPU clusters will extend to physically aware co-optimization of NoCs and compute blocks.

Until then, because we can, let’s ask ChatGPT one more time about NoC design, but this time Dr. Seuss style (I tried Shakespearean, too, but this was more fun).

Here you go:

To design a NoC that’s really hot
You must first consider what you’ve got
Processing elements, bandwidth, latency too
Power constraints, performance goals, all that jazz, oh what to do!

Choose a NoC architecture, that’s the first step
Bus-based, mesh, or maybe a tree rep
Routing algorithms, so many to pick
Deterministic, adaptive, congestion-aware, quick, quick, quick!

Interconnect structure, don’t be a sloth
Connections and channels, there must be both
Minimize latency, power, and meet the right bandwidth
Or else your NoC will be a complete ban.

Optimization techniques, they can’t be ignored
Buffer sizing, virtual channels, power management, oh Lord!
Validate the design, give it a test
Simulation, emulation, hardware, do your best!

Designing a NoC, it’s not just a breeze
But with careful planning, you’ll achieve success with ease

Oh my! Here’s to a great 2023!


