Power consumption, latency, and privacy concerns for NPUs included in application processors.
Always-sensing cameras are a relatively new way for users to interact with their smartphones, home appliances, and other consumer devices. Like always-listening, audio-based assistants such as Siri and Alexa, always-sensing cameras enable a seamless, more natural user experience. By continuously sampling and analyzing visual data, always-sensing enables a range of new use cases.
Always-sensing cameras require specialized processing because of the richness and complexity of the information they gather and the obvious privacy concerns with an always-on camera. Since ISPs and DSPs can’t deliver the necessary performance in a power-, area-, or privacy-friendly way, OEMs are turning to specialized AI processing subsystems, which include the camera, processor, and associated memory.
Despite continued improvements in energy-storage density, battery-powered devices demand greater power efficiency with each successive generation. Even wall-powered devices face scrutiny, with consumers, businesses, and governments all demanding lower power consumption. To succeed with users, the always-sensing subsystem must consume as little power and occupy as little silicon area as possible; smaller area is better both from a design perspective and because it directly lowers silicon cost. Area must also be evaluated at the subsystem level, including not just the processor but also its required memory. Because privacy and data security are essential, always-sensing systems must be architected to capture and process camera data securely, without storing it or exposing it to the rest of the system. Additionally, the always-sensing subsystem must work hand in hand with the device's security protocols to best protect user data.
So how can always-sensing be enabled in a power-, latency-, and privacy-friendly way?
AI processing is best done via NPUs (Neural Processing Units). While many existing Application Processors (APs) include NPUs, those NPUs aren't the ideal vehicle for always-sensing because of their relatively high power consumption and latency, along with privacy concerns.
Fig. 1: Simplified Application Processor.
A typical AP is a mix of heterogeneous computing cores, including CPUs, ISPs, GPUs/DSPs, and a general-purpose NPU, as shown in figure 1. Each processor in the AP is focused on specific compute types and processing loads. For example, a typical NPU in an AP might provide 5-10 TOPS of performance at a power efficiency of around 4 TOPS/W.
As discussed, always-sensing processing demands the lowest possible power consumption. Always-sensing neural networks are therefore purpose-built for small, highly specialized jobs, with workloads typically measured in GOPS (giga operations per second), one-thousandth of a TOPS (tera operations per second). While the NPU in an existing AP is fully capable of GOPS-level AI processing, it is the wrong choice for several reasons. First, because that NPU is not designed to process always-sensing networks power-efficiently, using it would reduce battery life. Second, low latency is essential for instant responses to user interaction; because the AP's NPU may be busy with other tasks, contention with those processes can increase latency and degrade the user experience. Finally, privacy concerns essentially preclude using the application processor: always-sensing camera data should be isolated from the rest of the system and neither stored on nor transmitted off the device, limiting data exposure and reducing the chances of a nefarious party accessing it.
The solution, then, is a dedicated subsystem built around a special-purpose NPU designed to process always-sensing networks with an absolute minimum of area, power, and latency: the Expedera LittleNPU. While the LittleNPU offers a comparatively modest 500 GOPS to 1 TOPS of capability, this level of processing is well matched to always-sensing needs. It uses Expedera's 18 TOPS/W Origin architecture, keeping power consumption at an absolute minimum and preserving battery life; we estimate that most always-sensing networks require only 10-20 mW of active power from the NPU. LittleNPUs are also ideal from a latency perspective: because they process only the always-sensing network(s), there is no contention with other workloads. Finally, always-sensing data is kept within the boundary of the LittleNPU subsystem, providing better user privacy and a smaller footprint for security implementation. Note that this approach does not by itself guarantee security; appropriate device-level hardware and software security measures should still be employed to protect this information.
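As a back-of-the-envelope illustration of the figures above, active NPU power can be approximated as workload divided by efficiency. The sketch below uses the efficiency numbers cited in the article (18 TOPS/W for the Origin architecture, roughly 4 TOPS/W for a typical AP NPU); the 200-GOPS workload and the function name are hypothetical examples, not Expedera specifications, and real silicon adds leakage, memory, and sensor power not modeled here.

```python
# Rough estimate: active NPU power ~= workload / efficiency.
# Hypothetical helper for illustration only; ignores leakage,
# memory, and sensor power.

def npu_power_mw(workload_gops: float, efficiency_tops_per_w: float) -> float:
    """Estimate active NPU power (mW) for a given workload (GOPS)."""
    workload_tops = workload_gops / 1000.0          # 1 TOPS = 1000 GOPS
    return workload_tops / efficiency_tops_per_w * 1000.0  # W -> mW

# An assumed 200-GOPS always-sensing network:
print(npu_power_mw(200, 18))  # ~11 mW on an 18 TOPS/W LittleNPU
print(npu_power_mw(200, 4))   # ~50 mW on a typical 4 TOPS/W AP NPU
```

Under these assumptions, the dedicated NPU lands inside the 10-20 mW range the article cites, while running the same workload on a 4 TOPS/W AP NPU would draw several times more power.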
Expedera’s Origin E1 LittleNPU IP can be implemented in multiple ways, as shown in figure 2 below. Initially, many OEMs choose a near-sensor processing architecture, using a dedicated chip that combines the LittleNPU with the camera in a single subsystem. Others may take the co-processor path, where the NPU is combined with an Image Signal Processor (ISP) on a discrete chip containing all necessary memory. Ultimately, Expedera foresees the LittleNPU being fully integrated into the AP, where the always-sensing subsystem occupies a dedicated area with its own power island, memory, and security features.
Fig. 2: Always-Sensing Subsystem Architectures: The LittleNPU is shown in red, and the camera sensor is shown as a black icon.
Always-sensing is a logical evolution of smartphones, security cameras, home appliances, and other consumer devices. However, to be successful, it must be architected to preserve battery life and privacy while delivering a compelling user experience.