Accelerating AI/ML Inferencing With GDDR6 DRAM


The origins of graphics double data rate (GDDR) memory can be traced to the rise of 3D gaming on PCs and consoles. The first graphics processing units (GPUs) packed single data rate (SDR) and double data rate (DDR) DRAM – the same solution used for CPU main memory. As gaming evolved, the demand for higher frame rates at ever higher resolutions drove the need for a graphics-workload-specific me... » read more

There’s More To Machine Learning Than CNNs


Neural networks – and convolutional neural networks (CNNs) in particular – have received an abundance of attention over the last few years, but they're not the only useful machine-learning structures. There are numerous other ways for machines to learn how to solve problems, leaving room for alternatives. “Neural networks can do all this really comple... » read more

Configuring AI Chips


Change is almost constant in AI systems. Vinay Mehta, technical product marketing manager at Flex Logix, talks about the need for flexible architectures to deal with continual modifications in algorithms, more complex convolutions, and unforeseen system interactions, as well as the ability to apply all of this over longer chip lifetimes. » read more

Making Sense Of New Edge-Inference Architectures


New edge-inference machine-learning architectures have been arriving at an astounding rate over the last year. Making sense of them all is a challenge. To begin with, not all ML architectures are alike. One of the complicating factors in understanding the different machine-learning architectures is the nomenclature used to describe them. You’ll see terms like “sea-of-MACs,” “systolic... » read more
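
As a loose intuition for one of those terms, here is a toy Python sketch of the "systolic" idea: a grid of MAC cells, each accumulating a partial sum as operand streams pass by. This is purely illustrative and an assumption on my part; a real systolic array pipelines operands between neighboring cells in hardware rather than looping in software.

```python
import numpy as np

# Toy "systolic-style" matrix multiply: at each time step t, every (i, j)
# MAC cell consumes one operand pair and adds the product to its accumulator.
def systolic_matmul(A, B):
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))
    for t in range(k):           # operands streaming through the array
        for i in range(m):
            for j in range(n):
                C[i, j] += A[i, t] * B[t, j]
    return C

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)
```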

Edge-Inference Architectures Proliferate


First of two parts; the second will dive into basic architectural characteristics. The last year has seen a vast array of announcements of new machine-learning (ML) architectures for edge inference. Unburdened by the need to support training, but tasked with delivering low latency, the devices exhibit extremely varied approaches to ML inference. “Architecture is changing both in the comp... » read more

Fast, Low-Power Inferencing


Power and performance are often treated as opposing goals. A system can be run really fast, but it will burn a lot of power; ease up on the accelerator and power consumption goes down, but so does performance. Optimizing for both at once is challenging. Inferencing algorithms for convolutional neural networks (CNNs) are compute-int... » read more
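
For a feel of the compute intensity behind that truncated sentence, here is a minimal Python sketch that counts the multiply-accumulate (MAC) operations in a single convolution layer. The layer shape is an illustrative assumption, not a figure from the article.

```python
# Rough MAC count for one 2D convolution layer (illustrative shape only).
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs = output pixels * input channels * output channels * kernel area."""
    return h_out * w_out * c_in * c_out * k * k

# A mid-network 3x3 layer with 64 input/output channels on a 56x56 feature map.
macs = conv_macs(h_out=56, w_out=56, c_in=64, c_out=64, k=3)
print(f"{macs / 1e6:.1f} million MACs for this one layer")  # ~115.6 million
```

Multiply that by the dozens of layers in a typical CNN and it becomes clear why every joule per MAC matters at the edge.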

Difficult Memory Choices In AI Systems


The number of memory choices and architectures is exploding, driven by the rapid evolution in AI and machine learning chips being designed for a wide range of very different end markets and systems. Models for some of these systems can range in size from 10 billion to 100 billion parameters, and they can vary greatly from one chip or application to the next. Neural network training and infer... » read more
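
As a back-of-envelope illustration of why those parameter counts force hard memory choices, here is a quick sketch of model weight footprint at common datatype widths. The bytes-per-parameter values are standard datatype sizes, not figures from the article.

```python
# Approximate weight storage for the model sizes mentioned above.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

for params in (10e9, 100e9):  # 10 billion and 100 billion parameters
    for dtype, nbytes in BYTES_PER_PARAM.items():
        gb = params * nbytes / 1e9
        print(f"{params / 1e9:.0f}B params @ {dtype}: {gb:,.0f} GB")
```

Even at int8, a 100-billion-parameter model needs on the order of 100 GB just for weights, far beyond what any single on-chip SRAM or conventional DRAM device holds.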

Faster Inferencing At The Edge


Cheng Wang, senior vice president of engineering at Flex Logix, talks about inferencing at the edge, some of the main considerations in designing and choosing an inferencing chip, why programmability and modularity are important, and how hardware-software co-design with algorithms can improve performance and power. » read more

Neural Networks Without Matrix Math


Speeding up AI systems typically means adding more processing elements and pruning the algorithms, but those aren't the only paths forward. Almost all commercial machine-learning applications depend on artificial neural networks, which are trained using large datasets with a back-propagation algorithm. The network first analyzes a training example, typically assign... » read more
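
The back-propagation loop that the excerpt starts to describe can be sketched in a few lines of numpy. This is the textbook matrix-math baseline the article contrasts against, not the alternative approach it goes on to describe; the toy data and layer sizes are assumptions for illustration.

```python
import numpy as np

# Minimal two-layer network trained by back-propagation on toy data.
rng = np.random.default_rng(0)
X = rng.standard_normal((32, 4))              # 32 training examples, 4 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0  # toy binary labels

W1 = rng.standard_normal((4, 8)) * 0.1
W2 = rng.standard_normal((8, 1)) * 0.1
lr = 0.1

for step in range(200):
    # Forward pass: the network analyzes the training examples.
    h = np.tanh(X @ W1)
    p = 1 / (1 + np.exp(-(h @ W2)))            # sigmoid output probability

    # Backward pass: propagate the output error back through each layer.
    grad_out = (p - y) / len(X)                # gradient for sigmoid + cross-entropy
    grad_W2 = h.T @ grad_out
    grad_h = (grad_out @ W2.T) * (1 - h ** 2)  # tanh derivative
    grad_W1 = X.T @ grad_h

    # Update: nudge every weight matrix against its gradient.
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
```

Every step here is a dense matrix multiply, which is exactly the operation the architectures in this article try to do without.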

AI Inference Acceleration


Geoff Tate, CEO of Flex Logix, talks about considerations in choosing an AI inference accelerator, how that fits in with other processing elements on a chip, what tradeoffs are involved in reducing latency, and which considerations matter most. » read more
