OSDN – On-chip Software Defined Network

Not just another NoC knock-off.


You must be mumbling to yourself, “Oh no, not another NoC article! The term NoC is used so loosely in the industry, and everybody seems to be claiming they have one, so what more is there to say?”

Fair enough, but please indulge me. There are some wannabe NoCs out there, but very few actually provide a full-fledged network. I submit that a real NoC should implement all the same key design elements that a wide-area network does. Ideally, in fact, it should be a software defined network. Why?

First, let us take a step back to understand why the network-on-chip concept made its way into the on-chip interconnect. One of the crucial driving factors was the need for scalability to accommodate the rapid growth of SoCs, both in the number and the heterogeneity of the integrated IPs. To be specific, the increasing complexity of SoCs led to new challenges: wiring congestion, physical timing closure, QoS/performance, traffic congestion, operating frequency, rigid and inflexible architectures, complex system dependencies, reliability, protocol heterogeneity, and power management. Looking at the problem purely from an interconnect perspective and the need to manage data communication among multiple IPs, it is easy to see that this is a familiar problem with a familiar solution. We need a true network solution, one that supports true packetization, deadlock avoidance, pipelining, virtual channels, multiple routes, physical awareness, and configurability.

True packetization (Reduced wiring congestion)
One of the immediate side effects of an increased number of IPs in a design is the large number of wires that must be implemented, and this directly impacts the physical design. One recent study concluded that about 50% of the performance and power in today’s chips is consumed by wires. It is well known that wire delay does not scale as well as logic delay, which adds to the challenge. With the AXI protocol, which requires five channels and a huge number of wires, it becomes almost impossible to route everything without pushback from the physical design team. Wire placement is not a trivial task, and convergence is elusive: as soon as you think you have solved the wiring congestion in one part of the design, it pops up somewhere else.

There are network-on-chip companies that try to alleviate this problem by providing a packetized way to share the same set of physical wires between different channels. However, this is only a partial solution. Such solutions still require the user to build completely separate and disjoint request and response networks. The two networks share no physical resources and have their own sets of switches (logic) and links (wires). One reason this is done is to break deadlock dependencies between channels, but it is a stopgap in lieu of a systematic approach to deadlock avoidance, which I will cover in a later section. It also translates into dedicated wires for WDATA and RDATA, and more overhead. A wire is a wire is a wire. Period. If the same wire can carry AR and AW, why can’t it also carry R and W data? A real NoC should be designed to maximize the sharing of wires, and should not force the user to build two disjoint networks that increase not only the wire count but also the logic count within the interconnect.
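As a toy sketch (in Python, not any vendor's RTL), the wire-sharing idea amounts to tagging each flit with a small channel identifier so that one physical link can carry every AXI-style channel. All names and the flit format here are illustrative assumptions:

```python
from dataclasses import dataclass

# Hypothetical flit format: one shared physical link carries all
# AXI-style channels, distinguished by a small channel-type field.
CHANNELS = ("AW", "W", "AR", "R", "B")

@dataclass
class Flit:
    channel: str   # which logical channel this flit belongs to
    payload: int   # address or data bits (simplified to one int)

def serialize(requests):
    """Interleave flits from all channels onto one shared link."""
    link = []
    for ch, payload in requests:
        assert ch in CHANNELS
        link.append(Flit(ch, payload))
    return link

def demux(link):
    """At the receiver, steer each flit back to its own channel queue."""
    queues = {ch: [] for ch in CHANNELS}
    for flit in link:
        queues[flit.channel].append(flit.payload)
    return queues

# Reads and writes, addresses and data, all share the same wires:
link = serialize([("AW", 0x1000), ("W", 0xCAFE), ("AR", 0x2000), ("R", 0xBEEF)])
queues = demux(link)
```

The point of the sketch is only that nothing in the link itself is dedicated to one channel; the channel-type field does the steering, so no second disjoint network is needed.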

Deadlock avoidance
Deadlocks can occur at different layers:

  1. Protocol Layer – e.g., dependencies between AXI channels
  2. Routing Layer – e.g., dependencies between routing paths
  3. Transport Layer – e.g., dependencies due to shared resources

One hopes there are no deadlocks in a design, but when one occurs it cannot be denied, and by then it is too late. To make matters worse, it is almost impossible to find the root cause.

I have a 6-year-old daughter and I hope that, a few years from now, she does not come back home and say, “Dad, this is my boyfriend, and I plan to spend the rest of my life with him.” I can’t live without her, she can’t live without him, and I can’t live with him. So, what do we have here? A deadlock. But I know better than to just sit back and hope. So, I fill up her head with all kinds of information about how horrible boys can be and to stay away from them. My hope is that by preparing her and preventing the situation, I can avoid the problem and not have to deal with it later. But since this is real life, I also have a baseball bat behind the front door, just in case.

Einstein once said, “Intellectuals solve problems; geniuses prevent them.” Similarly, NoC technology should not have to solve deadlock problems after the fact; rather, it needs to avoid them systematically by using the methods below:

  • Formal techniques and graph theory algorithms;
  • Correct-by-construction using implicit and explicit user-defined traffic dependencies;
  • Systematically scalable and robust by handling complex topologies and routing.
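The graph-theory approach above can be made concrete with a minimal sketch (illustrative only, not any product's algorithm): model every "waits-for" relationship between channels or resources as a directed edge, and reject at design time any configuration whose dependency graph contains a cycle.

```python
# Sketch: model channel/resource dependencies as a directed graph.
# An edge u -> v means "u cannot make progress until v frees a resource".
# A cycle in this graph is a potential deadlock, caught before tape-out.

def has_cycle(graph):
    """DFS-based cycle detection over an adjacency-list dict."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color.get(nxt, WHITE) == GREY:   # back edge => cycle
                return True
            if color.get(nxt, WHITE) == WHITE and dfs(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)

# A request channel waiting on a response channel that in turn waits
# on the same request channel: classic protocol-layer deadlock.
deadlocked = {"req_ch": ["resp_ch"], "resp_ch": ["req_ch"]}
clean      = {"req_ch": ["resp_ch"], "resp_ch": []}
```

Running `has_cycle` over the full dependency graph (protocol, routing, and transport edges combined) is what makes the check correct-by-construction rather than a simulation lottery.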

Credit-based protocol (Ease of pipelining for higher frequency)
A common method for closing timing at higher frequencies is pipelining, whereby register stages are inserted along long paths. However, when pipelining is implemented at a later stage of the physical design cycle, it can degrade bandwidth: the req/rdy handshake of standard protocols is not conducive to pipelining, because every added stage lengthens the request/acknowledge round trip and introduces stalls. Additionally, expensive register slices must be added to implement these pipeline stages. These problems can largely be avoided if the NoC supports credit-based flow control, which eases the loop dependency between request and acknowledge by decoupling the two. What this means is that if the design needs to be pipelined, credit-based flow control makes it trivial to add stages, even if the need arises late in the physical design stage.
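A minimal sketch of the credit-based idea, assuming a single sender/receiver pair (the class and its behavior are illustrative, not a vendor implementation):

```python
# Sketch of credit-based flow control: the sender holds a credit counter
# sized to the receiver's buffer. It may transmit only while credits
# remain, and each buffer slot the receiver drains returns one credit.
# Because the sender can never overrun the receiver, extra pipeline
# registers can be inserted on the link without changing correctness.

class CreditLink:
    def __init__(self, buffer_depth):
        self.credits = buffer_depth   # one credit per receive-buffer slot
        self.rx_buffer = []

    def send(self, flit):
        if self.credits == 0:
            return False              # back-pressure: wait for a credit
        self.credits -= 1
        self.rx_buffer.append(flit)
        return True

    def receive(self):
        flit = self.rx_buffer.pop(0)  # receiver drains one slot...
        self.credits += 1             # ...and returns a credit upstream
        return flit

link = CreditLink(buffer_depth=2)
assert link.send("a") and link.send("b")
assert not link.send("c")   # buffer full: sender stalls, nothing is lost
link.receive()              # one slot freed, one credit returned
assert link.send("c")
```

Note that the sender decides locally, from its counter, whether it may transmit; no combinational request/acknowledge loop spans the link, which is exactly what makes late-stage pipelining painless.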

Virtual channels (Traffic isolation by reduced head-of-line blocking)
Though packetization addresses the wiring situation, it does not solve the head-of-line blocking that can occur with multiple traffic flows of varying priorities and loads. The buffering available within the interconnect ends up being shared by all the traffic that flows through it. Real networks use the concept of virtual channels to isolate logical channels from each other. Note that this does not isolate them physically, because they can still use the same wires (links), but they use logically independent buffers. Traditional NoCs do not support virtual channels and hence lag in this respect. However, at least one company now offers a full range of virtual-channel features that can be applied at per-traffic-flow granularity, giving the user control over the trade-off between performance and area. Virtual channels allow users to isolate traffic as needed and to ensure that a single traffic flow does not hog the network bandwidth.
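A sketch of how virtual channels sidestep head-of-line blocking, assuming two flows on one physical link (class names, depths, and the arbitration rule are all illustrative assumptions):

```python
from collections import deque

# Sketch: two virtual channels share one physical link but keep
# independent buffers, so a stalled low-priority flow (VC0) cannot
# block a high-priority flow (VC1) queued behind it.

class VirtualChannelLink:
    def __init__(self, num_vcs, depth):
        self.vcs = [deque(maxlen=depth) for _ in range(num_vcs)]

    def enqueue(self, vc, flit):
        if len(self.vcs[vc]) == self.vcs[vc].maxlen:
            return False              # only this VC back-pressures
        self.vcs[vc].append(flit)
        return True

    def arbitrate(self, blocked_vcs=()):
        """Pick the next flit for the shared wires, skipping stalled VCs."""
        for vc, q in enumerate(self.vcs):
            if q and vc not in blocked_vcs:
                return vc, q.popleft()
        return None

link = VirtualChannelLink(num_vcs=2, depth=4)
link.enqueue(0, "bulk-0")    # VC0: bulk traffic, currently stalled downstream
link.enqueue(1, "urgent-0")  # VC1: latency-sensitive traffic
# Even though VC0's flit arrived first, VC1 still makes progress:
picked = link.arbitrate(blocked_vcs={0})
```

With a single shared FIFO, `urgent-0` would have been stuck behind `bulk-0`; the per-VC buffers are what buy the isolation, at the cost of some area.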

Multiple routes (Physical isolation)
While virtual channels isolate traffic logically, in some situations it is desirable to physically isolate traffic to the point that no gate or wire is shared. This applies not only to traffic between different master/slave pairs, but also to traffic within a single master/slave pair. Traditional NoC solutions provide only one physical path per master/slave pair: the same path is used for all transactions, because the route is determined solely by the slave address. But apart from performance, there are other reasons to physically isolate paths, one of them being reliability. In today’s world of finer geometries, systems must be able to deal with failures, and hence to avoid certain paths when there is a problem. Having multiple routes enables a system to keep operating reliably under extreme physical conditions. Furthermore, giving users control over if and when to switch routes allows systems to handle different workloads efficiently and achieve better performance.
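Multiple-route support can be pictured as a per-pair route table holding physically disjoint paths, plus a failover rule. This is only a sketch under assumed names (the routers `r0`–`r3` and the table layout are hypothetical):

```python
# Sketch: a per-(master, slave) route table with physically disjoint
# paths. Route 0 is used normally; if a link on it is marked faulty
# (or the workload calls for it), traffic switches to the alternate.

ROUTE_TABLE = {
    ("cpu", "ddr"): [
        ["cpu", "r0", "r1", "ddr"],   # primary path
        ["cpu", "r2", "r3", "ddr"],   # physically disjoint alternate
    ],
}

def select_route(src, dst, faulty_links=frozenset()):
    """Return the first configured path that avoids every faulty link."""
    for path in ROUTE_TABLE[(src, dst)]:
        hops = set(zip(path, path[1:]))   # path as a set of (u, v) links
        if not hops & faulty_links:
            return path
    raise RuntimeError("no healthy route available")

assert select_route("cpu", "ddr") == ["cpu", "r0", "r1", "ddr"]
# A failure on the r0->r1 link triggers failover to the disjoint path:
assert select_route("cpu", "ddr", {("r0", "r1")}) == ["cpu", "r2", "r3", "ddr"]
```

Because the two paths share no hops, a fault (or a congested hotspot) on one cannot touch traffic moved to the other, which is the whole point of physical isolation.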

Physical awareness (Faster timing closure)
One of the greatest challenges of wiring congestion is timing closure. Problems with timing closure typically require multiple rounds of rework and iterations through the physical timing convergence loop. Unfortunately, most of these iterations occur in the later stages of the chip design and can result in unpredictable schedules and missed market opportunities. Many of the problems with timing closure can be avoided, though, if the NoC is physically aware in terms of the placement of the various bridges, IPs, routers, etc. A physically aware NoC would offer users the flexibility to place the IPs in the floorplan. It would also allow the physical locations to be moved at a later stage, without the need to completely redo the interconnect. Augmenting all this with LEF/DEF-ready collateral and details of the process technology makes the on-chip network truly optimized for the SoC world. This approach enables designs that are timing clean not just at the STA level but, more importantly, at the layout level.


An on-chip software defined network

Today’s SoC applications need interconnect solutions that are tailored and customized, but still configurable and optimized for dynamic workloads. Ideally, a NoC will provide three key attributes: design automation, a high degree of configurability, and runtime programmability.

With today’s vast number of IPs, use cases, and traffic requirements, it is improbable, if not impossible, to build a NoC that meets both the spec and the schedule using manual methods. Not only is a manual effort time-consuming, it is simply not feasible to optimize for all use cases by hand. At least one NoC platform offers automated generation of the NoC from the system architect’s requirements, using machine learning techniques to optimize for all use cases and graph algorithms to ensure there are no deadlocks.

Every application has its own unique set of requirements, so the NoC platform should be highly configurable to accommodate this. The NoC platform should provide tools that allow the user to configure all features of the NoC during design to support the various SoC-level requirements while still maintaining the specific needs of IPs and the integrity of the interconnect.

Systems are usually built for typical use cases, but they need to be elastic enough to support specific heavy workloads when required. This drives the need for a robust set of runtime programmable features that system software can depend on to mold the character of the NoC based on the state of the system.

For more information on a physically aware NoC platform that supports advanced network capabilities, visit www.netspeedsystems.com.
