To keep up with growing vision processing requirements, new solutions are needed.
Imagine these futuristic scenarios: you hold your phone up to your face, and it automatically recognizes you and unlocks, so you can access content. A sensor at your front door recognizes that you are not an intruder, no matter what the wind has done to your hair or whether your face is obscured by a scarf. How about an autonomous car that recognizes your driving style, so not only can you turn on the “self-drive” mode, but you feel comfortable doing so, because it drives just like you do? Or a camera that can recognize what you’re focusing on, despite adverse lighting conditions, so the camera can “see” as well as you can with your human eyes, and take a photograph to reflect this capability?
Of course, these are not outlandish possibilities that might happen in some mythical future. These applications are here already, in various forms. And these scenarios, along with countless others, require vision sensors combined with AI decision-making.
Current AI and vision application challenges
The image sensor market is growing, and new opportunities are emerging in the mobile, AR/VR headset, surveillance and automotive markets as the basic functionality of image sensors continues to be enhanced. Increasingly, applications in these markets require a mix of vision and AI to perform a wide range of advanced capabilities.
With the growth and sophistication of these applications come new challenges, including the need to process higher resolutions, run algorithms that handle more frames, and deliver higher performance at lower power.
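To put the resolution and frame-rate challenge in perspective, the short Python sketch below compares raw pixel throughput for a 1080p stream at 30 FPS against a 4K stream at 60 FPS. The numbers are illustrative arithmetic only, not measurements of any particular device.

```python
# Back-of-envelope comparison of raw pixel throughput (illustrative arithmetic only).
def pixels_per_second(width, height, fps):
    return width * height * fps

p1080_30 = pixels_per_second(1920, 1080, 30)  # ~62 Mpixel/s
p4k_60 = pixels_per_second(3840, 2160, 60)    # ~498 Mpixel/s

print(f"1080p30: {p1080_30 / 1e6:.0f} Mpixel/s")
print(f"4K60:    {p4k_60 / 1e6:.0f} Mpixel/s")
print(f"scale factor: {p4k_60 / p1080_30:.0f}x")  # 8x more pixels to process
```

Every per-pixel vision or AI operation scales by that same factor, which is why higher performance and better power efficiency are needed together.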
Mobile
In the mobile market, the use of video continues to grow. End users are creating more and longer videos, and applications are adding intelligent high dynamic range (HDR), bokeh and other effects, face detection, and object detection, all of which demand greater computational capacity. With the introduction of ARKit from Apple and ARCore from Google on the Android platform, more and more AR/VR-based applications are being written. These applications can combine vision-based simultaneous localization and mapping (SLAM) with AI on the mobile platform to provide a unique user experience, and even simple applications that use SLAM will require increased computational capacity.
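As a rough illustration of the kind of per-frame work SLAM entails, the following Python sketch uses OpenCV's ORB detector to track features between two consecutive camera frames, one small building block of a visual-SLAM front end. The frame file names are placeholders, and a production mobile pipeline would of course differ.

```python
# Minimal sketch of one visual-SLAM building block: ORB feature tracking
# between consecutive camera frames. Frame file paths are placeholders.
import cv2

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)

# Brute-force Hamming matching, since ORB produces binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# In a full SLAM pipeline the matched keypoints would feed pose estimation
# and mapping; this work repeats every frame, hence the compute demand.
print(f"tracked {len(matches)} feature correspondences")
```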
AR/VR Headsets
As this market comes of age, improving the user experience is becoming more critical. The need for lower latency is driving higher computational requirements for vision and AI. These headsets require on-device AI for object detection and recognition, gesture recognition, and eye tracking.
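A quick calculation shows why latency dominates the headset problem. Assuming a typical 90Hz display refresh (a figure assumed here for illustration, not taken from this article), the entire vision and AI workload has to fit within roughly 11 ms per frame:

```python
# Illustrative frame-time budget; 90Hz is a common headset refresh rate,
# assumed here for the arithmetic rather than quoted from this article.
refresh_hz = 90
frame_budget_ms = 1000 / refresh_hz  # ~11.1 ms per frame

# Eye tracking, gesture recognition, and rendering all have to fit inside this
# window on the device itself; a cloud round trip would often consume it alone.
print(f"{frame_budget_ms:.1f} ms per frame at {refresh_hz} Hz")
```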
Surveillance
Most of these cameras run at 1080p resolution, but future cameras will need up to 4K resolution at 60 frames per second (FPS). These cameras are becoming smarter, adding on-device AI for stranger and anomaly recognition. Surveillance cameras for commercial applications use on-device AI to tag video with identified people and to distinguish friend from foe in real time, avoiding the latency of performing these tasks in the cloud. This drives the need for higher performance in the SoCs used for surveillance.
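The sketch below outlines what such an on-device tagging loop can look like. OpenCV's built-in HOG pedestrian detector stands in for whatever neural network a real camera would run, and the RTSP stream URL is a placeholder; the point is that detection happens locally, with no cloud round trip in the critical path.

```python
# Sketch of an on-device person-tagging loop. OpenCV's classical HOG pedestrian
# detector is used here only as a stand-in for a camera's actual neural network.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("rtsp://camera.local/stream")  # placeholder video source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Inference runs on the device itself, so no cloud round trip is needed.
    boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    for (x, y, w, h) in boxes:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Tagged frames could be written to local storage or streamed onward here.
cap.release()
```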
Automotive
High-end cars have the potential to use up to 50 cameras each by 2025 [1]. With 360-degree sensors, driver alertness sensors, and the processing required to power autonomous driving, demand for vision and AI processing is skyrocketing. The automotive market demands reliability and requires operation over a wide temperature range for the lifetime of an automobile, which can be 20 years or longer. Imaging functionality will become very complex within the automotive ecosystem over the next 10 to 15 years, due to the wide range of data and variables involved in avoiding accidents.
A new solution is needed
The processing power and speed required for today's neural networks have barely kept up with application requirements, particularly in the field of vision. Until recently, neural network inferencing has predominantly been performed in the cloud, but this is problematic for the growing number of edge applications that require lower latency. As a result, the trend is moving toward on-device AI, and DSPs are proving an increasingly popular solution.
A successful vision and AI DSP solution must deliver higher performance for increasing resolutions and frame rates, stay within tight power budgets, and be supported by a complete AI software platform.
The Cadence Tensilica Vision Q6 DSP, the latest DSP for embedded vision and AI built on a new, faster processor architecture, meets all of these requirements.
Built on a deeper, 13-stage processor pipeline and a system architecture designed for use with large local memories, the Vision Q6 DSP achieves a 1.5GHz peak frequency and 1GHz typical frequency at 16nm, in the same floorplan area as its highly successful predecessor, the Vision P6 DSP. The Vision Q6 DSP also offers a complete AI software platform to meet market needs.
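Using only figures already quoted in this article, a back-of-envelope calculation shows how tight the per-pixel cycle budget becomes at those rates, and why a vision DSP cannot afford to handle pixels one at a time:

```python
# Rough cycle budget per pixel, combining the 1GHz typical clock quoted above
# with the 4K/60FPS target mentioned in the surveillance section.
clock_hz = 1_000_000_000        # 1GHz typical frequency
pixel_rate = 3840 * 2160 * 60   # 4K resolution at 60 FPS

cycles_per_pixel = clock_hz / pixel_rate
print(f"~{cycles_per_pixel:.1f} cycles available per pixel")  # roughly 2 cycles
```

Roughly two cycles per pixel leaves no room for scalar processing, which is one reason dedicated vision DSPs rely on processing many pixels in parallel and on keeping data in large local memories.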
It’s an exciting time for embedded vision and AI. Image processing requirements are increasing, driven by new experiences, higher resolutions, and more sensors. Meanwhile, on-device AI experiences are growing in number and complexity, and the underlying neural networks continue to evolve at a rapid pace. To meet these myriad requirements, the market needs a DSP capable of satisfying increasing vision and AI demands while meeting power-efficiency needs. Faster and more power-efficient than previous solutions, the Vision Q6 DSP is an ideal fit for this rapidly changing market.