What goes on between the sensor and the data center.
Recently, I attended the AI HW Summit in Santa Clara and Autosens in Brussels. Artificial intelligence and machine learning (AI/ML) were critical themes at both events, albeit from different angles. While AI/ML is a very popular buzzword these days, in all its good and bad ways, in discussions with customers and prospects it became clear that we need to be precise in defining what type of AI/ML we are talking about when discussing the requirements of networks-on-chip (NoCs).
To discuss where the actual processing is happening, I found it helpful to use a chart that shows what is going on between sensors that create the data, the devices we all love and use, the networks transmitting the data, and the data centers where a lot of the “heavy” computing takes place.
From sensors to data centers – AI/ML happens everywhere.
Sensors are the starting point of the AI/ML pipeline, and they collect raw data from the environment, which can be anything from temperature readings to images. At Autosens, in the context of automotive, this was all about RGB and thermal cameras, radar, and lidar. On-chip AI processing within sensors is a burgeoning concept where basic data preprocessing happens. For instance, IoT sensors utilize lightweight ML models to filter or process data, reducing the load and the amount of raw data to be transmitted. This local processing helps mitigate latency and preserve bandwidth. As discussed in some panels at Autosens, the automotive design chain needs to make some tough decisions about where computing happens and how to distribute it between zones and central computing as EE architectures evolve.
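To make this concrete, here is a minimal sketch of what sensor-side pre-filtering might look like, assuming a simple frame-differencing heuristic: only frames that differ noticeably from the previous one are sent upstream. The threshold, the transmit stub, and the toy data are illustrative assumptions, not taken from any particular sensor product.

```python
# Illustrative sketch of sensor-side pre-filtering: only frames that change
# enough get transmitted upstream. Threshold and helper names are assumptions.
import random

MOTION_THRESHOLD = 0.15  # assumed tuning parameter


def frame_delta(prev, curr):
    """Mean absolute difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)


def transmit(frame):
    # Stand-in for sending data upstream (e.g., over a vehicle network).
    print(f"transmitting frame of {len(frame)} samples")


def sensor_loop(frames):
    prev, sent = frames[0], 0
    for curr in frames[1:]:
        if frame_delta(prev, curr) > MOTION_THRESHOLD:
            transmit(curr)  # only significant changes leave the sensor
            sent += 1
        prev = curr
    return sent


# Toy data: a mostly static scene with an occasional "event" frame.
frames = [[0.5] * 64 for _ in range(100)]
for i in (20, 55, 80):  # inject three events
    frames[i] = [0.5 + random.uniform(0.2, 0.4) for _ in range(64)]

print(f"{sensor_loop(frames)} of {len(frames) - 1} frames transmitted")
```

Even this trivial filter cuts the transmitted data from roughly one hundred frames to a handful, which is the whole point of doing a little work next to the sensor.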
Edge devices are typically mobile phones, tablets, or other portable gadgets closer to the data source. In my view, a car is yet another device, albeit a pretty complex one, with its own “sensor to data center on wheels” computing distribution. The execution of AI/ML models on edge devices is crucial for applications that require real-time processing and low latency, like augmented reality (AR) and autonomous vehicles that cannot rely on “always on” connections. These devices deploy models optimized for on-device execution, allowing for quicker responses and enhanced privacy, as data doesn’t always have to reach a central server.
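As a toy illustration of what “optimized for on-device execution” can mean in practice, the sketch below applies naive 8-bit weight quantization to shrink the data a model carries onto the device. It is a conceptual example only; the helper names and numbers are my own assumptions, not any specific framework’s API.

```python
# Conceptual sketch of post-training 8-bit quantization: store weights as
# int8 plus one scale factor instead of 32-bit floats. Not production-grade.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale  # int8 values + scale


def dequantize(q, scale):
    return [v * scale for v in q]


weights = [0.42, -1.3, 0.07, 0.99, -0.55]
q, scale = quantize_int8(weights)
print("int8 values :", q)
print("recovered   :", [round(w, 2) for w in dequantize(q, scale)])
print("storage     : 4 bytes per float32 weight vs. 1 byte per int8 weight")
```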
Edge computing is an area where AI/ML may happen without the end user realizing it. The far edge is the infrastructure most distant from the cloud data center and closest to the users. It is suitable for applications that require more computing resources and power than edge devices offer but also need lower latency than cloud solutions can provide. Examples might include advanced analytics or inference models that are too heavy for edge devices yet latency-sensitive; the industry seems to be adopting the term “Edge AI” for the computing going on here. Notable examples include facial recognition and real-time traffic updates for semi-autonomous vehicles, connected devices, and smartphones.
Data centers and the cloud are the hubs of computing resources, providing unparalleled processing power and storage. They are ideal for training complex, resource-intensive AI/ML models and managing vast datasets. High-performance computing clusters in data centers can handle intricate tasks like training deep neural networks or running extensive simulations, which are not feasible on edge devices due to resource constraints. Generative AI originally resided here, often requiring unique acceleration, but we already see it moving to the device edge as “On-Device Generative AI,” as shown by Qualcomm.
When considering a comprehensive AI/ML ecosystem, the layers of AI/ML are intricately connected, creating a seamless workflow. For example, sensors might collect data and perform initial processing before sending it to edge devices for real-time inference. More detailed analysis then takes place at far- or near-edge computing resources before the data reaches data centers for deep insights and model (re-)training.
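A rough way to picture this division of labor is a simple routing rule that places a task according to its latency budget and compute demand. The tier names and thresholds below are illustrative assumptions for this blog, not an industry specification.

```python
# Illustrative sketch: pick a pipeline tier from latency budget and compute
# demand. Thresholds are assumptions chosen only to make the example readable.
def choose_tier(latency_budget_ms: float, compute_tops: float) -> str:
    if compute_tops < 0.01:
        return "sensor"       # lightweight filtering right next to the sensor
    if latency_budget_ms < 20 and compute_tops < 10:
        return "edge device"  # on-device inference (phone, vehicle)
    if latency_budget_ms < 100:
        return "far edge"     # "Edge AI" infrastructure near the user
    return "data center"      # training and heavy batch analytics


for task, budget_ms, tops in [
    ("radar pre-filter", 5, 0.001),
    ("AR object detection", 15, 4),
    ("facial recognition", 80, 40),
    ("model (re-)training", 10_000, 5_000),
]:
    print(f"{task:22s} -> {choose_tier(budget_ms, tops)}")
```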
As outlined above, AI/ML is happening everywhere, literally. However, as described, the resource requirements vary widely. NoCs come into play in three main areas here: (1) connecting the often very regular AI/ML subsystems, (2) de-risking the integration of all the various blocks on chips, and (3) connecting various silicon dies in a chiplet (die-to-die, D2D) scenario or various chips in a chip-to-chip (C2C) environment.
Networks-on-Chip (NoCs) as a critical enabler of AI/ML.
The first aspect – connecting AI/ML subsystems – is all about fast data movement, and for that, broad bit width, the ability to broadcast, and virtual channel functionality are critical. Some application domains are unique, as outlined in “Automotive AI Hardware: A New Breed.” In addition, the general bit-width requirements vary significantly between sensors, devices, and edges.
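To give a feel for the knobs involved, here is a minimal sketch of a generic link description capturing wide data paths, virtual channels, and broadcast support. The field names and values are hypothetical and do not reflect any particular vendor’s NoC configuration format.

```python
# Hypothetical NoC link descriptions: wide data paths for tensor traffic,
# virtual channels to separate traffic classes, broadcast for one-to-many
# distribution of weights/activations. Values are illustrative only.
from dataclasses import dataclass


@dataclass
class NocLink:
    name: str
    data_width_bits: int   # wide links keep AI/ML tensors moving
    virtual_channels: int  # traffic classes sharing one physical link
    broadcast: bool        # one-to-many delivery across the subsystem


links = [
    NocLink("sensor_frontend", data_width_bits=64,  virtual_channels=2, broadcast=False),
    NocLink("npu_mesh",        data_width_bits=512, virtual_channels=4, broadcast=True),
    NocLink("dram_backbone",   data_width_bits=256, virtual_channels=4, broadcast=False),
]

for link in links:
    print(f"{link.name:16s} {link.data_width_bits // 8:3d} B/cycle, "
          f"{link.virtual_channels} VCs, broadcast={link.broadcast}")
```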
The second aspect, connecting all the bits and pieces on a chip, is all about supporting the various protocols – I discussed them last month in “Design Complexity In The Golden Age Of Semiconductors.” Tenstorrent’s Jim Keller best described the customer concern about de-risking in a recent joint press release on Arteris’ FlexNoC and Ncore technology: “The Arteris team and IP solved our on-chip network problems so we can focus on building our next-generation AI and RISC-V CPU products.”
Finally, the connections between chiplets are the subject of controversial discussion across all application domains. The physical interfaces with competing PHYs (XSR, BOW, OHBI, AIB, and UCIe) and their digital controllers are at the forefront of that discussion. In the background, NoCs and “SuperNoCs” spanning multiple chiplets/chips must support the appropriate protocols. We are currently discussing Arm’s CHI C2C and other proposals. It will take the proverbial village of various companies to make the desired open chiplet ecosystem a reality.
AI/ML’s large universe of resource requirements makes it an ideal fuel for what we experience as a semiconductor renaissance today. NoCs will be a crucial enabler within the AI/ML clusters, connecting building blocks on-chip and connecting chiplets carrying AI/ML subsystems. Brave new future, here we come!