Virtualization: A Must-Have For Embedded AI In Automotive SoCs

Maximize the efficiency of AI accelerators and enable separation of tasks that require different ASIL levels.


Virtualization, the process of abstracting physical hardware by creating multiple virtual machines (VMs) with independent operating systems and tasks, has been a part of computing since the 1960s. Now, with the need to optimize the utilization of large AI and DSP blocks in automotive SoCs, along with the need for increased functional safety in autonomous driving, virtualization is coming to power- and area-sensitive embedded automotive systems.

There are two recent truths about artificial intelligence in automotive embedded systems. The first is that the urgency to add artificial intelligence to embedded designs continues to grow. This is driven by numerous factors – the rapid pace of AI research resulting in expanding use cases, the growing number of cameras and other sensors in automotive systems, the increasing image resolution of those cameras and sensors, the impressive capabilities of AI over previous solutions, and AI’s position in the hype cycle, which helps drive new research as well as business-case enthusiasm. The second truth is that the neural networks required to implement AI are computation-hungry algorithms requiring large amounts of silicon area for all the multiply-accumulate (MAC) units needed to reach L2+ and beyond performance levels. As neural processing units (NPUs) become a larger part of automotive SoCs, system designers want to make sure they are optimizing that resource.

Virtualization allows expensive hardware to be shared between multiple VMs by abstracting physical hardware functionality into software. Without virtualization, resources can only be assigned to one operating system at a time, which risks leaving capacity idle if a single operating system cannot fully utilize the physical hardware.

Virtual machines and hypervisors

A neural network VM is a virtual representation, or emulation, of a neural network accelerator with its own neural network computation resources, memory, and interfaces. Each virtual machine runs a single OS; each OS runs multiple software processes; each process has one or multiple threads. Virtual machines are isolated from each other spatially (memory protection prevents accesses across virtual machines) and temporally (each virtual machine has a predictable execution time).
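The two isolation properties can be made concrete with a small sketch. The `VMConfig` structure, field names, and address values below are all hypothetical, chosen only to illustrate the idea: spatial isolation means every memory access is checked against the VM's own region, and temporal isolation means each VM has a fixed execution budget.

```python
from dataclasses import dataclass

@dataclass
class VMConfig:
    name: str
    mem_base: int       # start of this VM's private memory region
    mem_size: int       # bytes reserved for this VM
    time_slice_us: int  # fixed execution budget per scheduling round

def access_allowed(vm: VMConfig, addr: int) -> bool:
    """Spatial isolation: an access is legal only inside the VM's own region."""
    return vm.mem_base <= addr < vm.mem_base + vm.mem_size

vm1 = VMConfig("VM1", mem_base=0x1000_0000, mem_size=0x0100_0000, time_slice_us=500)
vm2 = VMConfig("VM2", mem_base=0x2000_0000, mem_size=0x0100_0000, time_slice_us=500)

assert access_allowed(vm1, 0x1008_0000)       # within VM1's region
assert not access_allowed(vm1, vm2.mem_base)  # VM1 cannot touch VM2's memory
```

In real hardware this check is enforced by an MMU rather than software, but the invariant is the same: no address generated by one VM resolves into another VM's region.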

In figure 1, the bottom layer, labeled Processor HW, is the physical layer. Software called a hypervisor separates the virtual machines’ resources from the hardware. These small software layers enable multiple instances of virtual machines and operating systems to run alongside each other, each given its own slice of the underlying physical hardware. This prevents the VMs from interfering with each other.

For example, if two virtual machines are running on the same physical processor, as in figure 1, it is possible to have VM1 running a critical application while VM2 is not. The hypervisor software will guarantee that VM2 cannot access any memory of VM1 or affect the worst-case execution time of VM1. It will guarantee that if VM2 crashes, it will not impact VM1.
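One way a hypervisor can deliver the worst-case execution time guarantee described above is a fixed time-slice schedule. The sketch below is a simplified model, not any actual hypervisor's scheduler: each VM's slot begins at a deterministic offset, and the slot elapses whether or not its VM runs, so a crashed or misbehaving VM2 cannot delay VM1.

```python
def schedule(vms, rounds=2):
    """Fixed time-slice round-robin. Each VM's slot starts at a deterministic
    time; a dead VM's slot is skipped but still consumes its time budget."""
    timeline = []
    t = 0
    for _ in range(rounds):
        for name, slice_us, alive in vms:
            if alive:
                timeline.append((name, t))
            t += slice_us  # the slot elapses whether or not the VM ran
    return timeline

# VM2 has crashed; VM1's slots still start at exactly the same times.
plan = schedule([("VM1", 500, True), ("VM2", 500, False)])
assert plan == [("VM1", 0), ("VM1", 1000)]
```

A work-conserving scheduler could instead hand VM2's unused time to VM1, but that trades away the predictable worst-case timing that safety certification depends on.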

Fig. 1: Hypervisor software separates the physical hardware (processor HW) from virtual machines.

Virtualization and autonomous vehicles

As the demand for autonomous vehicles increases, so too does the need for ensuring the safety of those vehicles. The ISO 26262 standard defines functional safety (FuSa) requirements for detecting systematic and random (permanent and transient) faults for various Automotive Safety Integrity Levels (ASILs). Forward collision warning or engine control functions will require a higher ASIL level than, say, managing the car’s entertainment system. Different tasks can require different FuSa quality levels.

Virtualization can guarantee freedom from interference between applications and operating systems of different quality levels running on a shared resource. Separating tasks that require higher ASIL levels from tasks that don’t require an ASIL level makes the system more composable (modular), which simplifies the task of certifying your automotive system. Certification is required to verify your hardware has met the appropriate automotive standards. Each safety-critical VM and the hypervisor software will need to be certified. Non-safety-critical VMs do not need to be certified, but they must be guaranteed not to break certified software.

Virtualization also reduces downtime – critical in automotive – because multiple redundant virtual machines can run alongside each other. An additional benefit of virtualization for autonomous vehicles is the ability to recover from safety errors. If VM1 is running a safety-critical task and VM2 is not, then a fault on VM1 could be ‘recovered’ if VM1 takes over the resources of VM2 to prioritize the safety-critical task.
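That recovery scenario can be sketched as a resource reassignment. The dictionaries and the `recover` helper below are illustrative inventions, not a real hypervisor API; the point is only that the hypervisor reallocates a non-critical partition's cores so the safety-critical task keeps running.

```python
def recover(partitions, faulted):
    """On a fault in a safety-critical partition, take the cores of a
    non-critical partition so the critical task can keep running."""
    donor = next(p for p in partitions
                 if not p["critical"] and p is not faulted)
    faulted["cores"], donor["cores"] = donor["cores"], []
    return faulted

vm1 = {"name": "VM1", "critical": True,  "cores": []}          # lost its cores to a fault
vm2 = {"name": "VM2", "critical": False, "cores": [4, 5, 6, 7]}
recover([vm1, vm2], faulted=vm1)

assert vm1["cores"] == [4, 5, 6, 7]  # safety-critical task runs on reclaimed cores
assert vm2["cores"] == []            # non-critical VM is suspended
```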

Virtualization and NPUs

Fig. 2: Synopsys NPX6 NPU IP family supports virtualization in hardware and software.

The Synopsys ARC NPX6FS Neural Processing Unit (NPU) IP (figure 2) is a family of AI accelerators used by host processors to deliver the most area- and power-efficient neural network performance. The NPX6FS family processors can scale from 1K MACs to 96K MACs and are designed for automotive safety, supporting ASIL B / ASIL D quality levels. The NPX6FS includes built-in safety mechanisms including dual-core lockstep on memory controllers and interconnect control structures, error-detecting code (EDC) on registers, error-correcting code (ECC) on interconnects and memories, and watchdog timers.

As an accelerator, the host CPU dispatches VMs on the NPX, but tasks within a VM on multiple cores can synchronize without host interaction. A local hypervisor will run on the ARC NPX NPU. For spatial isolation, each VM on the ARC NPX will run in a dedicated guest physical address space. The NPX has L1 and L2 memory controllers that include MMUs, which protect the controllers’ load/store accesses. The NPX will need to be integrated with an IOMMU to protect L2 controller DMA accesses.
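The "dedicated guest physical address space" mechanism resembles stage-2 address translation: each VM sees its own addresses starting at zero, and a per-VM base/limit programmed by the hypervisor maps them onto disjoint system memory. The table and addresses below are purely illustrative, assuming a simple contiguous mapping rather than the page-granular translation a real MMU/IOMMU performs.

```python
# Hypothetical per-VM stage-2 windows: guest-physical 0..limit maps to
# base..base+limit in system-physical memory. Windows never overlap.
STAGE2 = {
    "VM1": {"base": 0x8000_0000, "limit": 0x0200_0000},
    "VM2": {"base": 0xA000_0000, "limit": 0x0200_0000},
}

def translate(vm: str, guest_pa: int) -> int:
    """Map a guest-physical address to system-physical, or fault."""
    entry = STAGE2[vm]
    if not 0 <= guest_pa < entry["limit"]:
        raise MemoryError(f"{vm}: guest-physical 0x{guest_pa:x} out of range")
    return entry["base"] + guest_pa

assert translate("VM1", 0x1000) == 0x8000_1000
assert translate("VM2", 0x1000) == 0xA000_1000  # same guest address, disjoint system memory
```

Because every access is forced through this mapping, a VM cannot even express an address inside another VM's window, which is what makes the spatial-isolation guarantee enforceable in hardware.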

The NPX6FS family can scale from performance levels as low as 1 TOPS to thousands of TOPS. These high levels of performance are important as ADAS application requirements are growing from hundreds of TOPS for L2 autonomy levels to thousands of TOPS needed for L3 and beyond. Let’s look at the NPX6-64KFS – a functional safety configuration with 64K, or 65,536, INT MACs per cycle. This provides a large pool of parallel computing resources for executing vision neural networks like CNNs or transformers.
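To see how a MAC count translates into TOPS, the usual back-of-the-envelope arithmetic counts each MAC as two operations (one multiply, one accumulate) per cycle and multiplies by the clock rate. The 1.3 GHz figure below is an assumed, illustrative clock, not a published specification; the achievable frequency depends on process node and implementation.

```python
macs_per_cycle = 64 * 1024  # NPX6-64KFS: 65,536 INT MACs per cycle
ops_per_mac = 2             # one multiply + one accumulate per MAC
clock_hz = 1.3e9            # illustrative clock only; actual frequency varies

tops = macs_per_cycle * ops_per_mac * clock_hz / 1e12
assert macs_per_cycle == 65_536
assert round(tops) == 170   # ~170 TOPS at this assumed clock
```

The same arithmetic scales linearly across the family, which is why multi-NPU configurations are what reach the thousands of TOPS cited for L3 and beyond.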

The NPX6-64K has sixteen 4K MAC cores for neural network computing (figure 3a). Each group of four cores (or each group of 16K MACs) is supported by a portion of the L2 closely-coupled memory (SRAM). Each group of eight cores (or 32K MACs) is supported by an L2 controller. (Configurations smaller than 32K have only one L2 controller.) IOMMUs are not part of the NPX6FS and will need to be integrated into the system design.

Fig. 3a: Example of an NPX6-64KFS with no partitioning. External IOMMUs are integrated with the NPX6-64KFS, which has sixteen 4K MAC cores for neural network computing.

While the NPX6-64KFS neural network processor is considered one engine (and is treated as such by the software development tools), the resources can be easily allocated to different virtual machines. We see in figure 3b a possible configuration for creating two partitions out of the NPX6-64KFS. In this instance, each partition has eight 4K MAC cores and is controlled by one L2 memory controller. One partition could be running a safety-critical VM and the other a non-safety-critical VM.
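The core-to-partition mapping can be modeled as a small allocation table. This sketch assumes the topology described above – sixteen 4K-MAC cores, with each group of eight cores served by one L2 controller – and the helper function and names are hypothetical, not part of the NPX software tools.

```python
CORES = list(range(16))  # sixteen 4K-MAC cores in an NPX6-64KFS

def make_partitions(split):
    """Assign disjoint, contiguous core groups to named partitions.
    Cores 0-7 are served by L2 controller 0, cores 8-15 by controller 1."""
    parts, start = {}, 0
    for name, n in split:
        cores = CORES[start:start + n]
        parts[name] = {"cores": cores,
                       "l2_controllers": sorted({c // 8 for c in cores})}
        start += n
    return parts

# The two-way split of figure 3b: each partition gets its own L2 controller.
two_way = make_partitions([("safety_vm", 8), ("non_safety_vm", 8)])
assert two_way["safety_vm"]["l2_controllers"] == [0]
assert two_way["non_safety_vm"]["l2_controllers"] == [1]
```

An asymmetric split such as 4/4/8 would leave the first two partitions sharing L2 controller 0 while the third uses controller 1 alone, which matches the three-partition arrangement discussed next.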

Fig. 3b: The NPX6-64KFS partitioning into two virtual machines.

We see in figure 3c that three partitions can be created out of the available physical resources of the NPX6-64KFS. In this case, the two smaller partitions would both be managed by one L2 controller. In this example, partition A could run a safety-critical VM while partition B runs a non-safety-critical VM, with both sharing the resources of one L2 controller. Partition C might run a non-safety-critical VM using the second L2 controller.

Fig. 3c: The NPX6-64KFS partitioning into three virtual machines.

While it seems an easy task to divide a large NPX6 into different partitions and virtual machines by core, even the single-core NPX6-1KFS or NPX6-4KFS can use virtualization to create multiple virtual machines. Local hypervisors handle the resource partitioning. Virtualization is a tool for creating flexible ASIL systems and an important feature of the NPX6FS family.

Virtualization has quickly become a must-have feature for embedded automotive solutions targeting autonomous vehicles. Adding virtualization can help maximize the effectiveness of the large blocks of DSP and AI accelerators needed for L3 and beyond autonomy. Virtualization can also improve the system’s ability to recover from faults and streamline the certification process by focusing certification effort only on the functional safety tasks. For future ADAS designs, it will be important to choose DSP and neural network processor IP that has designed-in hardware and software capabilities for virtualization.
