The role of the network-on-chip in ensuring total system safety.
The MPSoC Forum, sponsored by IEEE and other industry associations, hosts an annual conference in beautiful places around the planet. It is dedicated to showcasing renowned academic and industry experts in multicore and multiprocessor architectures. The goal is to explore trends in system-on-chip (SoC) hardware and software architectures and applications. An additional purpose is to consider the next generation of ideas for enhancing performance, improving ease of use, and reducing costs. I like to think of this as the “Davos for chips.” The conference features excellent speakers, wonderful locations, exquisite food, and fine wine.
To celebrate its 20th anniversary, the forum has produced a two-volume book set that will be released on May 11. The books are published under the Wiley imprint and are available for pre-order on Amazon. The first volume covers architectures, the second applications. The first volume is divided into sections on processor architectures, memory architectures and technologies, and interconnect and interfaces. K. Charles Janac, president and CEO of Arteris IP, authored the first chapter in that third section, on network-on-chip (NoC) architecture and how it has enabled MPSoCs.
The chapter starts with the evolution from buses to crossbars to NoCs. Next is a useful overview of a typical approach to architecting and configuring a NoC. Because the NoC is the most configurable intellectual property (IP) in an SoC, getting the design to an optimal solution requires careful planning and refinement. The design evolves, not just in its logic but also in its topology.
Think of planning traffic lights for a small town. Engineers design the timing, the crosswalks, and related items. Planning the lights for a larger city is a bigger-scale problem, but engineers can still solve the planning and implementation issues systematically. The same applies to a NoC for a large SoC. Engineers plan and implement it from the logical and quality of service (QoS) perspective as well as from the physical layout perspective.
By the way, this book is a technical review, not a marketing pitch. Charlie is quite open that while NoCs share some concepts with “regular” communications networks, the analogy cannot be stretched too far. NoC design is still very much an activity for semiconductor designers, not general network designers.
NoCs do carry over some characteristics of networks, particularly in managing QoS and in-network debug. For QoS, performance can be managed not only statically but also dynamically, either by the NoC itself or by software through control and status registers. Debug is another obvious service, since the network-on-chip sees all data in transit. Probes to inspect data and monitor performance are generally available in NoC architectures and can output traces to industry-standard debug infrastructures such as CoreSight and MIPI.
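The static-versus-dynamic QoS idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not how any particular NoC product works: the `QosArbiter` class, its `priority_regs` and `grant_counts` fields, and the initiator names are all hypothetical stand-ins for per-initiator control and status registers.

```python
from typing import Dict, List, Optional


class QosArbiter:
    """Toy arbiter: the requester with the highest programmed priority wins."""

    def __init__(self, priorities: Dict[str, int]) -> None:
        # Hypothetical stand-in for per-initiator QoS control registers.
        self.priority_regs = dict(priorities)
        # Hypothetical status counters, one per initiator.
        self.grant_counts = {name: 0 for name in priorities}

    def write_priority(self, initiator: str, value: int) -> None:
        """Model a software write to a QoS control register."""
        if initiator not in self.priority_regs:
            raise KeyError(f"unknown initiator: {initiator}")
        self.priority_regs[initiator] = value

    def grant(self, requesters: List[str]) -> Optional[str]:
        """Pick one requester for this cycle, or None if nobody is asking."""
        if not requesters:
            return None
        winner = max(requesters, key=lambda name: self.priority_regs[name])
        self.grant_counts[winner] += 1
        return winner


arbiter = QosArbiter({"cpu": 3, "gpu": 1, "dma": 0})
first = arbiter.grant(["gpu", "cpu"])   # static setting: cpu outranks gpu
arbiter.write_priority("gpu", 7)        # software raises gpu priority at runtime
second = arbiter.grant(["gpu", "cpu"])  # gpu now outranks cpu
```

The point of the sketch is only that the same arbitration logic gives different results once software rewrites a register, which is the dynamic half of the QoS story. A real NoC would also handle bandwidth regulation, latency targets, and starvation avoidance.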
Support for functional safety and security within the NoC is essential, at the same level expected of any other IP. This means parity and error-correcting code (ECC) checks, duplicated logic or triple modular redundancy (TMR), and even logic built-in self-test (LBIST) to validate network correctness while the chip is operating. The NoC also plays a much larger role than other IPs in total system safety because it carries and “sees” all the dataflow on the chip. There are now ISO 26262 ASIL D SoC architectures that combine IPs with different levels of ASIL support around an ASIL D safety controller, which monitors all functions on the chip through the NoC during operation. Each function can be isolated, checked through LBIST, and then brought back online or left disconnected if it is malfunctioning, at which point the safety controller can communicate back to a central controller for mitigation.
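The isolate, check, and reconnect flow described above can be sketched as a few lines of control logic. This is a simplified illustration, not the book's or any vendor's implementation: `check_and_recover`, the `run_lbist` callback, the `escalations` list, and the function names are assumptions made up for the example.

```python
from enum import Enum
from typing import Callable, List


class FunctionState(Enum):
    ONLINE = "online"
    DISCONNECTED = "disconnected"


def check_and_recover(
    name: str,
    run_lbist: Callable[[str], bool],
    escalations: List[str],
) -> FunctionState:
    """Isolate one function, self-test it, then reconnect or leave it offline."""
    # Isolation is implicit in this toy model: the function is treated as
    # cut off from the NoC for as long as this call is running.
    passed = run_lbist(name)  # hypothetical LBIST hook, True means pass
    if passed:
        return FunctionState.ONLINE
    # Self-test failed: keep the function disconnected and record it so a
    # central controller can decide on mitigation.
    escalations.append(name)
    return FunctionState.DISCONNECTED


faulty = {"camera_isp"}  # pretend this block fails its self-test
escalations: List[str] = []
states = {
    name: check_and_recover(name, lambda f: f not in faulty, escalations)
    for name in ["cpu_cluster", "camera_isp"]
}
```

Here the healthy block comes back online, while the failing one stays disconnected and is reported upward, mirroring the sequence in the paragraph above.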
In security, the NoC can provide firewalls, similar to network firewalls, blocking or poisoning data in transit if it does not meet appropriate requirements.
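As a rough illustration of what such a firewall decides per transaction, here is a minimal sketch. The `Region` and `Transaction` types, the rule fields, and the choice of when to block versus poison are all assumptions for the example; real NoC firewalls define their own rule formats and responses.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    allowed_initiators: FrozenSet[str]
    secure_only: bool


@dataclass(frozen=True)
class Transaction:
    initiator: str
    address: int
    secure: bool


def filter_transaction(txn: Transaction, regions: List[Region]) -> str:
    """Return 'pass', 'block', or 'poison' for one transaction."""
    for region in regions:
        if region.base <= txn.address < region.base + region.size:
            if txn.initiator not in region.allowed_initiators:
                return "block"   # initiator has no rights to this region
            if region.secure_only and not txn.secure:
                return "poison"  # assumed policy: deliver data marked invalid
            return "pass"
    return "block"  # default deny for addresses no rule covers


regions = [Region(0x1000, 0x1000, frozenset({"cpu"}), secure_only=True)]
ok = filter_transaction(Transaction("cpu", 0x1800, True), regions)
denied = filter_transaction(Transaction("gpu", 0x1800, True), regions)
tainted = filter_transaction(Transaction("cpu", 0x1800, False), regions)
unmapped = filter_transaction(Transaction("cpu", 0x3000, True), regions)
```

The default-deny fallback is a design choice in this sketch. A permissive default would be simpler but would let traffic through to any address a rule forgot to cover.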
Cache coherence is not a topic just for CPU clusters. Many more IPs now need to share data with that cluster in a computer vision and AI world. Now GPU clusters and AI accelerator clusters are spread out over large SoCs. The NoC must also support cache coherence for IPs with coherent interfaces and IPs that do not have such interfaces. To support coherence management and a wide variety of protocol options, a coherent NoC uses a different protocol from the standard NoC, supporting a semantic superset of those other standards.
As the leading interconnect technology for large SoCs, demands on the NoC are not standing still. As floor plans become larger and more complex, customers want topology synthesis and more floor plan awareness. Also, clients are demanding more automation in tuning the physical implementation of the NoC to floor plan constraints.
There is a big push in safety toward ASIL D “fail-operational” mechanisms. Customers are actively working to go beyond error detection to error prediction, for example by checking transistor aging monitors.
Absent a significant boost from process technology, the continuing thirst for performance and intelligence must now be slaked by new architectures. This is seen especially in AI, with grids and rings of processing elements and the ability to broadcast weights in one clock tick or to aggregate reads similarly. All these needs are driving demand for innovative, flexible NoC architectures.
Also driven by AI and demand to support scalable architectures, cache coherence is going hierarchical. A single domain in the accelerator might grow to 2 or 4 domains in higher-end products connecting coherently as needed to the CPU cluster domain. Multi-die structures are already demanding inter-chiplet cache coherency using CXL, CCIX or other methods. Can multi-die NoCs be far behind?
Finally, what more can be done to monitor and protect the SoC from within the NoC? The network has a unique view of almost everything happening in the SoC. There should be more advancement to leverage this level of insight and control.
These books are worth every dollar for anyone wanting to be current with the state of the art in MPSoCs. Check them out!