Using a NoC can greatly simplify the development of an SoC, but this largely depends on who is developing the NoC.
By Michael Frank and Frank Schirrmeister
Excluding the simplest offerings, almost every modern system-on-chip (SoC) device implements its on-chip communications using a network-on-chip (NoC). Some people question whether a NoC is really necessary or whether a more basic approach would suffice.
An SoC is an integrated circuit (IC) that incorporates most or all components of a computer or other electronic system. The SoC is formed from multiple functional units called intellectual property (IP) blocks. Many of these blocks will be sourced from third-party vendors; the rest—the ones that provide the “secret sauce” that differentiates this SoC from competitive offerings—will be created in-house.
These IP blocks can include processor cores like microprocessor units (MPUs), graphics processing units (GPUs), and neural processing units (NPUs). In addition to various types of memory IP, other IP blocks may perform communication, utility, peripheral, and acceleration functions.
Each IP block is represented somewhere in the system’s memory space. The term “transaction” refers to the operation of writing or reading data bytes to or from addresses in the system’s memory space. For the SoC to perform its magic, the IP blocks must use transactions to “talk” to each other over some form of interconnect. The terms “initiator” and “target” refer to IP blocks that generate or respond to transactions.
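To make the transaction model concrete, here is a minimal Python sketch of memory-mapped reads and writes. All class and variable names (and the address values) are illustrative assumptions, not taken from any real SoC or protocol.

```python
# Minimal sketch of memory-mapped transactions (illustrative names only).

class Target:
    """A target IP block owning a range of the system memory map."""
    def __init__(self, base, size):
        self.base, self.size = base, size
        self.mem = bytearray(size)

    def contains(self, addr):
        return self.base <= addr < self.base + self.size

class Initiator:
    """An initiator IP block that issues read/write transactions."""
    def __init__(self, targets):
        self.targets = targets

    def _decode(self, addr):
        # Address decode: find the target that "says to itself, 'that's me.'"
        for t in self.targets:
            if t.contains(addr):
                return t
        raise ValueError(f"no target at address {addr:#x}")

    def write(self, addr, data):
        t = self._decode(addr)
        off = addr - t.base
        t.mem[off:off + len(data)] = data

    def read(self, addr, length):
        t = self._decode(addr)
        off = addr - t.base
        return bytes(t.mem[off:off + length])

# Usage: a CPU-like initiator writing one byte to a peripheral register.
uart = Target(base=0x4000_0000, size=0x100)
cpu = Initiator([Target(0x0000_0000, 0x1000), uart])
cpu.write(0x4000_0010, b"\x2a")
assert cpu.read(0x4000_0010, 1) == b"\x2a"
```

The key idea is the address decode step: every transaction carries an address, and exactly one target claims it.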
The predominant interconnect mechanism used on SoCs in the 1990s was the bus. A highly simplified representation is illustrated in figure 1. Note that the line marked “bus” would comprise multiple wires implementing a data bus, an address bus, and associated control signals.
Fig. 1: Simple bus interconnect structure.
In many early SoC designs, there would be only a single initiator IP block in the form of a central processing unit (CPU). When the initiator placed an address on the address bus, all of the target IP blocks would see it, and one of them would say to itself, “that’s me.” When the initiator subsequently issued a read command or when it placed data on the data bus and issued a write command, the appropriate target would respond.
Some of the early SoCs might employ multiple initiators. For example, the CPU might be accompanied by a direct memory access (DMA) function, which could be used to quickly transfer large blocks of data between different areas of memory and peripherals. As the number of initiators in a design increases, it becomes necessary to implement some form of arbitration scheme that allows them to negotiate for control of the bus.
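One common arbitration scheme is round-robin, which rotates the grant among requesting initiators so none is starved. The sketch below models a single grant decision in Python; real arbiters are clocked hardware state machines, and the two-initiator CPU/DMA setup is purely an assumed example.

```python
# Sketch of round-robin bus arbitration between multiple initiators
# (illustrative; real arbiters are clocked hardware state machines).

def round_robin_grant(requests, last_granted):
    """Grant the bus to the next requesting initiator after last_granted.

    requests: list of booleans, one per initiator.
    Returns the index of the granted initiator, or None if nobody requests.
    """
    n = len(requests)
    for i in range(1, n + 1):
        candidate = (last_granted + i) % n
        if requests[candidate]:
            return candidate
    return None

# CPU (index 0) and DMA (index 1) both request; the CPU was granted
# last time, so the DMA engine wins this round.
assert round_robin_grant([True, True], last_granted=0) == 1
# On the next cycle only the CPU requests, so it is granted.
assert round_robin_grant([True, False], last_granted=1) == 0
```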
In the early 2000s, as SoC designs grew more complex—containing more and more IP blocks and employing multiple initiators—it became common to use crossbar switch-based interconnect architectures (figure 2). Once again, each line in this diagram represents a multi-wire bus comprising data, address and control signals.
Fig. 2: Simple crossbar switch interconnect structure.
In this case, any of the initiators can talk to any of the targets. The switches route transactions as they pass from initiator to target and back again, and multiple transactions can be “in flight” at any time. Each switch has the ability to buffer transactions, so if many arrive at the same time, it can decide which has the higher priority.
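The buffer-and-prioritize behavior of a switch can be sketched with a priority queue. This is a simplified software analogy under assumed names; actual crossbar switches implement this in hardware with per-port FIFOs and arbitration logic.

```python
import heapq

# Sketch of a crossbar switch port that buffers in-flight transactions
# and forwards the highest-priority one first (illustrative only).

class SwitchPort:
    def __init__(self):
        self._buffer = []   # min-heap of (priority, seq, transaction)
        self._seq = 0       # tie-breaker preserves arrival order

    def accept(self, transaction, priority):
        # Lower number = higher priority, matching heapq's min-heap order.
        heapq.heappush(self._buffer, (priority, self._seq, transaction))
        self._seq += 1

    def forward(self):
        """Pop and return the highest-priority buffered transaction."""
        if not self._buffer:
            return None
        return heapq.heappop(self._buffer)[2]

# Two transactions arrive at the same time; the urgent one goes first.
port = SwitchPort()
port.accept("bulk DMA write", priority=5)
port.accept("CPU cache-line read", priority=1)
assert port.forward() == "CPU cache-line read"
assert port.forward() == "bulk DMA write"
```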
SoC designs continued to grow in size and complexity. SoCs circa the early 1990s might contain only a few tens of IP blocks, and the entire device might comprise only 20,000 to 50,000 logic gates and registers. Today, by comparison, an SoC can contain hundreds of IP blocks, where each includes hundreds of thousands, or sometimes millions, of logic gates and registers.
Over the same period, data buses grew in width from 8 to 16 to 32 to 64 bits and higher. In fact, the typical size of data transfers today is 64-byte (512-bit) cache lines, which quickly leads to routing congestion problems. Although the evolution of silicon chip processes has resulted in the shrinking of transistors by orders of magnitude, these problems are exacerbated because the widths of the wires on the chip have not decreased at the same rate.
To address these problems, today’s designers have adopted the concept of the network-on-chip. A simple NoC example is illustrated in figure 3. In this case, transactions involve packets of information being passed around. Each packet comprises a header, which reflects the destination address, and a body containing data, instructions, request type, etc.
Fig. 3: Simple network-on-chip interconnect architecture.
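The header/body split described above can be sketched as a small data structure plus a routing step that matches the header's destination address against an address map. The field names, port names, and address ranges are assumptions for illustration only.

```python
from dataclasses import dataclass

# Sketch of a NoC packet: a header carrying the destination address,
# and a body carrying the request type and payload (names illustrative).

@dataclass
class Packet:
    dest_addr: int        # header: where the packet is routed
    request: str          # body: e.g. "read" or "write"
    payload: bytes = b""  # body: data being carried, if any

def route(packet, address_map):
    """Pick the output port whose address range contains the destination."""
    for (base, limit), port in address_map.items():
        if base <= packet.dest_addr < limit:
            return port
    raise ValueError("unmapped destination address")

# Route a write packet toward the port serving the upper address range.
amap = {(0x0000_0000, 0x1000_0000): "port0",
        (0x8000_0000, 0x9000_0000): "port1"}
pkt = Packet(dest_addr=0x8000_0040, request="write", payload=b"\x01\x02")
assert route(pkt, amap) == "port1"
```

Because each switch only needs the header to make its routing decision, the body can be arbitrarily structured without affecting the network itself.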
Multiple packets of information can be “in flight” at any particular time, and, once again, the switches have the ability to buffer and prioritize transactions. Since not every initiator needs to communicate with every target, the architecture can reflect this, further minimizing interconnect requirements.
From one perspective, using a NoC greatly simplifies the design of an SoC, but this largely depends on who is developing the NoC. The term “socket” refers to the physical interface (e.g., data width) and communications protocol between an IP block and the NoC. The SoC industry has defined and adopted several socket protocols (OCP, APB, AHB, AXI, STBus, DTL, etc.).
In addition to different data widths, IP blocks in the same design may be clocked at different frequencies. Since an SoC design can involve hundreds of IP blocks, many of which come from third-party vendors, the IPs may use different socket protocols. To accommodate this diversity, it may be necessary to convert transactions between the initiator and target sockets.
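One simple case of such conversion is data-width adaptation. The sketch below splits one wide write into several narrower beats, as a bridge between a wide initiator socket and a narrow target socket might; it is a software analogy under assumed parameters, and real bridges also convert protocol signaling and cross clock domains.

```python
# Sketch of data-width conversion between sockets: one 64-bit (8-byte)
# write is split into two 32-bit (4-byte) beats for a narrower target
# socket (illustrative; real bridges also handle protocol and clocking).

def downsize(addr, data, from_bytes=8, to_bytes=4):
    """Split one wide write transaction into several narrower beats."""
    assert len(data) == from_bytes and from_bytes % to_bytes == 0
    return [(addr + off, data[off:off + to_bytes])
            for off in range(0, from_bytes, to_bytes)]

# An 8-byte write at 0x1000 becomes two 4-byte writes at 0x1000, 0x1004.
beats = downsize(0x1000, bytes(range(8)))
assert beats == [(0x1000, bytes([0, 1, 2, 3])),
                 (0x1004, bytes([4, 5, 6, 7]))]
```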
An in-house NoC intended to support multiple SoC projects requires configurability and flexibility across multiple dimensions. As a result, developing a NoC from the ground up can be as complicated and time-consuming as designing the rest of the SoC. Furthermore, the developers would then have two things to verify and debug—the NoC and the rest of the design.
The remedy is to use an off-the-shelf NoC solution like FlexNoC from Arteris. In this case, an intuitive interface allows designers to identify the IP blocks forming the architecture along with each block’s socket characteristics (width, protocol, frequency, etc.). The developers can also specify which initiators need to talk with which targets. At this point, it is effectively a “push-button” operation to generate the NoC.
Back to the question of when a design needs a network-on-chip. The intuitive answer may be that smaller designs are exempt. But in a recent discussion with Semico Research’s Rich Wawrzyniak, he confirmed what we already see among our customers: even in smaller designs in the industrial and IoT domains, users often face tens to hundreds of IP blocks that must be assembled and coordinated.
As shown in figure 4 below, Semico defines four categories of SoCs. Three of them – Basic SoCs with 100-200 discrete blocks and 1+ interconnects, Value Multicore SoCs with 200-275 blocks and 4+ complex interconnects, and Advanced Performance SoCs with >275 blocks and 5+ complex interconnects – are clear bull’s-eye targets for the automation of networks-on-chip. But even for commodity controllers, designers face 10 to 100 discrete blocks that need to interact.
Engineers tend to “just do the NoC themselves” for smaller designs. But more often than not, they soon realize that they should have called Arteris to take advantage of NoC automation.
Fig. 4: Semico’s four complexity categories of systems-on-chip.
At the beginning of this column, the question “When does my SoC design need a NoC?” was posed. The simple answer is that today’s increasingly sophisticated SoC designs always need a NoC to optimally achieve routing and performance goals. The easiest way to implement a state-of-the-art NoC is to use FlexNoC from Arteris.
Frank Schirrmeister is vice president of solutions and business development at Arteris IP.