Machine learning already plays a part in everyday life, but efficient inference will keep it moving forward.
Innovation in technology comes in many forms, from software and hardware to displays, all the way to the human-machine interface. Until now, interaction with smart devices has largely relied on our ability to manipulate the machines: to learn their language in order to input and extract the information we want. But with the rise of machine learning (ML), the onus of learning is shifting to the devices and systems themselves, allowing them to learn our preferences and anticipate our needs, all without being explicitly programmed.
ML already plays a significant role in our daily lives: when your smart TV recommends a show, for example, or your smartphone predicts what you're going to type. We are already accustomed to interacting with smartphones and voice-activated assistants, but the not-too-distant future holds a vision in which all devices are innately intelligent, in which machines truly understand our words and respond intelligently to resolve day-to-day tasks.
But the power of ML will not be limited to personal assistants and predictive text. ML has quickly moved from identifying cat pictures to solving real-world problems well beyond the mobile market, in areas such as healthcare, retail, automotive and servers. In fact, it will profoundly affect just about every area of our lives.
Today, however, the smartphone market is driving a large share of the innovation, and addressing privacy, security, bandwidth and latency concerns is a real priority. One way to do this is to move compute to the edge, allowing more inference to take place on the device itself. Keeping data on the device rather than sending it back and forth to the cloud reduces bandwidth usage, diminishes security and privacy risks, and all but eliminates latency issues.
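To make the idea concrete, here is a minimal sketch of on-device inference using TensorFlow Lite, one of the frameworks mentioned later in this post. The model file name and the dummy input frame are hypothetical placeholders; the point is that the data never leaves the device.

import numpy as np
import tensorflow as tf

# Load a quantized model bundled with the app; no network round trip needed.
interpreter = tf.lite.Interpreter(model_path="mobilenet_quant.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# A dummy frame standing in for real sensor data, shaped to the model's input.
frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()  # inference runs entirely on the device
scores = interpreter.get_tensor(output_details[0]["index"])

Because both the model and the data stay local, the only thing that might ever leave the device is the result, and even that is optional.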
This, however, requires highly efficient inference – something the Arm ML processor is designed to deliver.
Mobile performance
The ML processor is a brand-new design for the mobile and adjacent markets – such as smart cameras, AR/VR, drones, medical and consumer electronics – offering 4.6 TOP/s of performance at an efficiency of 3 TOPs/W. Additional compute and memory optimizations deliver further performance gains across different networks.
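To put those figures in perspective, 4.6 TOP/s at 3 TOPs/W works out to roughly 1.5 W of power draw, comfortably within the kind of sustained power budget a smartphone can support.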
The architecture consists of fixed-function engines, for the execution of convolution layers, and programmable layer engines, for executing non-convolution layers and implementing selected primitives and operators. A network control unit manages the overall execution and traversal of the network, while a DMA engine moves data in and out of main memory. On-board memory provides central storage for weights and feature maps, reducing traffic to external memory and, therefore, power.
Arm ML processor
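As a conceptual sketch of that division of labour, consider how a compiler might route layers to the two engine types. This is written in Python for readability, not in the processor's actual toolchain or API; the operator names and the schedule function are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    op_type: str

# Operator types handled by the fixed-function engines (illustrative).
CONV_OPS = {"conv2d", "depthwise_conv2d"}

def schedule(layers):
    """Route each layer to the engine type that executes it:
    convolutions to the fixed-function engines, everything else
    to the programmable layer engines."""
    return [
        (layer.name,
         "fixed-function engine" if layer.op_type in CONV_OPS
         else "programmable layer engine")
        for layer in layers
    ]

# A toy three-layer network.
net = [Layer("conv1", "conv2d"),
       Layer("pool1", "max_pool"),
       Layer("fc1", "fully_connected")]
print(schedule(net))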
Thanks to the presence of both fixed-function and programmable engines, the ML processor is powerful, efficient and flexible enough to adapt to future challenges, providing raw performance along with the versatility to execute different neural networks effectively.
Flexible, scalable, future-proof
To tackle the challenges of multiple markets, with performance requirements ranging from a few GOP/s for IoT to tens of TOP/s for servers, the ML processor is based on a new, scalable architecture.
The architecture can be scaled down to approximately 2 GOP/s of performance for IoT or embedded-level applications, or scaled up to 150 TOP/s for ADAS, 5G or server-class applications: a range spanning almost five orders of magnitude from a single design. These configurations can achieve many times the efficiency of existing solutions.
The architecture is compatible with existing Arm CPUs, GPUs and other IP, providing a complete heterogeneous system, and will also be accessible through popular ML frameworks such as TensorFlow, TensorFlow Lite, Caffe and Caffe2.
As more and more workloads move to ML, compute requirements will take a wide variety of forms. Many ML use cases already run on Arm, with our enhanced CPUs and GPUs providing a range of performance and efficiency levels. With the introduction of the Arm Machine Learning platform, we aim to extend that choice, providing a heterogeneous environment with the flexibility required to meet every use case, enabling intelligent systems at the edge… and perhaps even the personal assistant I dream of.