Implementing Low-Power Machine Learning In Smart IoT Applications


By Pieter van der Wolf and Dmitry Zakharov

Increasingly, machine learning (ML) is being used to build devices with advanced functionalities. These devices apply ML technology that has been trained to recognize certain complex patterns in data captured by one or more sensors, such as voice commands captured by a microphone, and then perform an appropriate action. For example,... » read more
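The sense-classify-act loop the excerpt describes can be summarized in a few lines. The sketch below is purely illustrative; every function name in it is an assumption for illustration, not an API from the article:

```python
# A minimal, purely illustrative sketch (all names assumed) of the
# sense -> classify -> act loop described above for a voice-command device.

def run_device(read_microphone, classify_command, perform_action):
    """Poll a sensor, run trained-ML pattern recognition, act on the result."""
    while True:
        audio = read_microphone()          # data captured by a sensor
        command = classify_command(audio)  # trained ML model recognizes a pattern
        if command is not None:
            perform_action(command)        # e.g. toggle a light, start playback
```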

Modeling AI Inference Performance


The metrics that matter to customers in AI inference are throughput/$ and/or throughput/watt for their model. One might assume throughput correlates with TOPS, but that assumption would be wrong. Examine the table below: the Nvidia Tesla T4 gets 7.4 inferences/TOPS, the Xavier AGX 15, and the InferX X1 34.5. And the InferX X1 does it with 1/10th to 1/20th of the DRAM bandwidth of the ... » read more
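To make the efficiency metric concrete, here is a minimal Python sketch. The inferences/TOPS figures are the ones quoted above; the rated-TOPS values are assumptions based on the vendors' public peak numbers, not figures from the article:

```python
# Rough illustration of: throughput = (inferences/TOPS) * rated TOPS.
# inferences/TOPS values come from the excerpt; rated TOPS are assumed.

def est_throughput(inferences_per_tops: float, rated_tops: float) -> float:
    """Estimated model throughput in inferences/sec."""
    return inferences_per_tops * rated_tops

chips = {
    "Nvidia Tesla T4": (7.4, 130.0),  # ~130 INT8 peak TOPS (assumed)
    "Xavier AGX":      (15.0, 32.0),  # ~32 INT8 peak TOPS (assumed)
    "InferX X1":       (34.5, 8.5),   # ~8.5 peak TOPS (assumed)
}

for name, (eff, tops) in chips.items():
    print(f"{name}: ~{est_throughput(eff, tops):.0f} inferences/sec")
```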

Making Sense Of ML Metrics


Steve Roddy, vice president of products for Arm’s Machine Learning Group, talks with Semiconductor Engineering about what different metrics actually mean, and why they can vary by individual applications and use cases. » read more

Advantages Of BFloat16 For AI Inference


Essentially all AI training is done in 32-bit floating point. But doing AI inference in 32-bit floating point is expensive, power-hungry, and slow. And quantizing models to 8-bit integer, which is the fastest and lowest-power option, is a major investment of money, scarce resources, and time. Now BFloat16 (BF16) offers an attractive balance for many users. BFloat16 offers essentially t... » read more
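One reason BF16 adoption is cheap: a BF16 value is simply the top 16 bits of an FP32 value, so it keeps FP32's 8-bit exponent (and therefore its dynamic range) while cutting the mantissa to 7 bits. A minimal NumPy sketch of that truncation (illustrative only; plain truncation, no rounding):

```python
# Truncate FP32 to BF16 by keeping only the top 16 bits of each value.
# Dynamic range is preserved; precision drops to ~2-3 decimal digits.
import numpy as np

def fp32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    """Zero the low 16 mantissa bits of FP32 values (result stored as fp32)."""
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & 0xFFFF0000).view(np.float32)

vals = np.array([3.14159265, 1e-38, 6.5e4], dtype=np.float32)
print(fp32_to_bf16_bits(vals))  # same order of magnitude, reduced precision
```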

VC Perspectives On An AI Summer


It’s been a busy summer for Applied Ventures. Our team has had many interactions in the startup and investing space, and added some new companies to our portfolio. I’ll be sharing highlights of these activities in a series of upcoming blogs, but first I’d like to reflect on current market developments in machine learning and how they are affecting VC investment patterns. Strategic inve... » read more

AI Inference Memory System Tradeoffs


When companies describe their AI inference chips, they typically quote TOPS but say little about their memory systems, which are equally important. What is TOPS? It stands for trillions (tera) of operations per second. It is primarily a measure of maximum achievable throughput, not a measure of actual throughput. Most operations are MACs (multiply/accumulates), so TOPS = (number of MAC units) x... » read more
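The formula is cut off above; the standard form counts each MAC as two operations (one multiply plus one add). A minimal sketch of that peak-TOPS arithmetic, with made-up MAC count and clock frequency:

```python
# Peak TOPS = num_macs * 2 ops/MAC * clock_hz / 1e12.
# The MAC count and clock below are illustrative assumptions.

def peak_tops(num_macs: int, clock_hz: float) -> float:
    """Maximum achievable (not actual) trillions of operations per second."""
    return num_macs * 2 * clock_hz / 1e12

print(peak_tops(num_macs=4096, clock_hz=1.0e9))  # 4096 MACs @ 1 GHz -> 8.192 TOPS
```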

TOPS, Memory, Throughput And Inference Efficiency


Dozens of companies have developed, or are developing, IP and chips for neural network inference. Almost every AI company quotes TOPS but offers little other information. What is TOPS? It stands for trillions (tera) of operations per second. It is primarily a measure of maximum achievable throughput, not a measure of actual throughput. Most operations are MACs (multiply/accumulates), so TOPS = (number of MAC... » read more
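The gap between rated and actual throughput that the excerpt points at can be expressed as a utilization factor. A small sketch, with an assumed utilization figure:

```python
# Delivered TOPS = rated peak TOPS scaled by achieved MAC utilization.
# The 25% utilization figure below is an assumption for illustration.

def effective_tops(rated_tops: float, utilization: float) -> float:
    """TOPS actually delivered when a real model keeps only a fraction of the MACs busy."""
    return rated_tops * utilization

print(effective_tops(rated_tops=100.0, utilization=0.25))  # -> 25.0 delivered TOPS
```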

Deep Learning Models With MATLAB And Cortex-A


Today, I’ve teamed up with Ram Cherukuri of MathWorks to provide an overview of the MathWorks toolchain for machine learning (ML) and the deployment of embedded ML inference on Arm Cortex-A using the Arm Compute Library. MathWorks enables engineers to get started quickly and makes machine learning accessible without requiring them to become experts. If you’re an algorithm engineer interested ... » read more

Do Large Batches Always Improve Neural Network Throughput?


Common benchmarks like ResNet-50 generally achieve much higher throughput with large batch sizes than with batch size = 1. For example, the Nvidia Tesla T4 delivers 4x the throughput at batch = 32 compared with batch = 1. Of course, larger batch sizes come with a tradeoff: latency increases, which may be undesirable in real-time applications. Why do larger batches increase throughput... » read more
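A toy cost model illustrates the mechanism: model weights are fetched from DRAM once per batch, so that fixed cost is amortized over more images while per-image compute stays constant. All numbers below are assumptions chosen for illustration; with these particular values the model happens to reproduce a roughly 4x gain from batch = 1 to batch = 32, echoing the T4 figure above:

```python
# Toy batching model: per-batch time = (one weight fetch) + (per-image compute).
# All constants are assumed illustrative values, not measured figures.

def throughput(batch: int,
               weight_bytes: float = 25e6,       # ~25 MB of weights (assumed)
               dram_bw: float = 25e9,            # 25 GB/s DRAM bandwidth (assumed)
               compute_s_per_img: float = 0.3e-3 # 0.3 ms pure compute/image (assumed)
               ) -> float:
    weight_load_s = weight_bytes / dram_bw       # paid once per batch
    total_s = weight_load_s + batch * compute_s_per_img
    return batch / total_s                       # images per second

for b in (1, 2, 8, 32):
    print(f"batch={b:2d}: {throughput(b):6.0f} img/s")
# batch= 1:    769 img/s ... batch=32:   3019 img/s  (~3.9x, diminishing returns)
```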

Machine Learning Drives High-Level Synthesis Boom


High-level synthesis (HLS) is experiencing a new wave of popularity, driven by its ability to handle machine-learning matrices and iterative design efforts. The obvious advantage of HLS is the productivity boost designers get from working in C, C++, and other high-level languages rather than RTL. The ability to design a layout that should work, and then easily modify it to test other confi... » read more
