Accelerating Inference of Convolutional Neural Networks Using In-memory Computing


Abstract: "In-memory computing (IMC) is a non-von Neumann paradigm that has recently established itself as a promising approach for energy-efficient, high throughput hardware for deep learning applications. One prominent application of IMC is that of performing matrix-vector multiplication in (1) time complexity by mapping the synaptic weights of a neural-network layer to the devices of an ... » read more

Getting Better Edge Performance & Efficiency From Acceleration-Aware ML Model Design


The advent of machine learning techniques has benefited greatly from the use of acceleration technology such as GPUs, TPUs and FPGAs. Indeed, without the use of acceleration technology, it’s likely that machine learning would have remained in the province of academia and not had the impact that it is having in our world today. Clearly, machine learning has become an important tool for solving... » read more

Challenges Of Edge AI Inference


Bringing convolutional neural networks (CNNs) to your industry—whether it be medical imaging, robotics, or some other vision application entirely—has the potential to enable new functionalities and reduce the compute requirements for existing workloads. This is because a single CNN can replace more computationally expensive image processing, denoising, and object detection algorithms. Howev... » read more

Why Reconfigurability Is Essential For AI Edge Inference Throughput


For a neural network to run at its fastest, the underlying hardware must run efficiently on all layers. Through the inference of any CNN—whether it be based on an architecture such as YOLO, ResNet, or Inception—the workload regularly shifts from being bottlenecked by memory to being bottlenecked by compute resources. You can think of each convolutional layer as its own mini-workload, and so... » read more

Maximizing Edge AI Performance


Inference of convolutional neural network models is algorithmically straightforward, but to get the fastest performance for your application there are a few pitfalls to keep in mind when deploying. A number of factors make efficient inference difficult, which we will first step through before diving into specific solutions to address and resolve each. By the end of this article, you will be arm... » read more

Xilinx AI Engines And Their Applications


This white paper explores the architecture, applications, and benefits of using Xilinx's new AI Engine for compute intensive applications like 5G cellular and machine learning DNN/CNN. 5G requires between five to 10 times higher compute density when compared with prior generations; AI Engines have been optimized for DSP, meeting both the throughput and compute requirements to deliver the hig... » read more

Performance Metrics For Convolutional Neural Network Accelerators


Across the industry, there are few benchmarks that customers and potential end users can employ to evaluate an inference acceleration solution end-to-end. Early on in this space, the performance of an accelerator was measured as a single number: TOPs. However, the limitations of using a single number has been covered in detail in the past by previous blogs. Nevertheless, if the method of cal... » read more

It’s Eternal Spring For AI


The field of Artificial Intelligence (AI) has had many ups and downs largely due to unrealistic expectations created by everyone involved including researchers, sponsors, developers, and even consumers. The “reemergence” of AI has lot to do with recent developments in supporting technologies and fields such as sensors, computing at macro and micro scales, communication networks and progre... » read more

Optimizing Power And Performance For Machine Learning At The Edge


While machine learning (ML) algorithms are popular for running on enterprise Cloud systems for training neural networks, AI/ML chipsets for edge devices are growing at a triple digit rate, according to Tractica “Deep Learning Chipsets” (Figure 1). Edge devices include automobiles, drones, and mobile devices that are all employing AI/ML to provide valuable functionality. Figure 1: Marke... » read more

Implementing Low-Power Machine Learning In Smart IoT Applications


By Pieter van der Wolf and Dmitry Zakharov Increasingly, machine learning (ML) is being used to build devices with advanced functionalities. These devices apply machine learning technology that has been trained to recognize certain complex patterns from data captured by one or more sensors, such as voice commands captured by a microphone, and then performs an appropriate action. For example,... » read more

← Older posts Newer posts →