Shift Register-In-Memory Architecture


A new technical paper titled "Toward Single-Cell Multiple-Strategy Processing Shift Register Powered by Phase-Change Memory Materials" was published by researchers at Singapore University of Technology and Design and University of Cambridge. Abstract "Modern innovations are built on the foundation of computers. Compared to von Neumann architectures having separate storage and processing uni... » read more

Review of Tools & Techniques for DL Edge Inference


A new technical paper titled "Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review" was published in "Proceedings of the IEEE" by researchers at University of Missouri and Texas Tech University. Abstract: Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying thes... » read more

Heterogeneous Multi-Core HW Architectures With Fine-Grained Scheduling of Layer-Fused DNNs


A technical paper titled "Towards Heterogeneous Multi-core Accelerators Exploiting Fine-grained Scheduling of Layer-Fused Deep Neural Networks" was published by researchers at KU Leuven and TU Munich. Abstract "To keep up with the ever-growing performance demand of neural networks, specialized hardware (HW) accelerators are shifting towards multi-core and chiplet architectures. So far, thes... » read more

Choosing The Correct High-Bandwidth Memory


The number of options for how to build high-performance chips is growing, but the choices for attached memory have barely budged. To achieve maximum performance in automotive, consumer, and hyperscale computing, the choices come down to one or more flavors of DRAM, and the biggest tradeoff is cost versus speed. DRAM remains an essential component in any of these architectures, despite years ... » read more
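
The cost-versus-speed tradeoff is easiest to see in the raw bandwidth arithmetic: peak bandwidth is roughly bus width times per-pin data rate. The short calculation below uses representative, approximate figures; exact numbers vary by generation, speed bin, and vendor.

```python
def peak_bandwidth_gbps(bus_width_bits, data_rate_gbps_per_pin):
    """Peak bandwidth in GB/s = (bus width in bits * per-pin rate in Gb/s) / 8."""
    return bus_width_bits * data_rate_gbps_per_pin / 8

# Representative, approximate figures; real parts differ by generation and bin.
hbm2e_stack = peak_bandwidth_gbps(1024, 3.2)   # ~410 GB/s per stack
gddr6_chip  = peak_bandwidth_gbps(32, 16.0)    # ~64 GB/s per device
ddr5_dimm   = peak_bandwidth_gbps(64, 4.8)     # ~38 GB/s per channel

print(f"HBM2E stack : {hbm2e_stack:.0f} GB/s")
print(f"GDDR6 device: {gddr6_chip:.0f} GB/s")
print(f"DDR5 channel: {ddr5_dimm:.0f} GB/s")
```

The wide, stacked HBM interface wins on bandwidth per device by an order of magnitude, but 2.5D packaging and interposers make it the most expensive option, which is the crux of the tradeoff.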

Choosing The Right Memory At The Edge


As the amount of data produced by sensors in cars and phones continues to grow, more of that data needs to be processed locally. It takes too much time and power to send it all to the cloud. But choosing the right memory for a particular application requires a series of tradeoffs involving cost, bandwidth, and power, which can vary greatly by device, application, and even the data itself. Frank Fer... » read more

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng... » read more

Complex Tradeoffs In Inferencing Chips


Designing AI/ML inferencing chips is emerging as a huge challenge due to the variety of applications and the highly specific power and performance needs for each of them. Put simply, one size does not fit all, and not all applications can afford a custom design. For example, in retail store tracking, it's acceptable to have a 5% or 10% margin of error for customers passing by a certain aisle... » read more

Training An ML Model On An Intelligent Edge Device Using Less Than 256KB Memory


A new technical paper titled "On-Device Training Under 256KB Memory" was published by researchers at MIT and MIT-IBM Watson AI Lab. “Our study enables IoT devices to not only perform inference but also continuously update the AI models to newly collected data, paving the way for lifelong on-device learning. The low resource utilization makes deep learning more accessible and can have a bro... » read more

Simplifying AI Edge Deployment


Barrie Mullins, vice president of product at Flex Logix, explains how a programmable accelerator chip can simplify semiconductor design at the edge, where chips need to be both high-performance and low-power, yet developing everything from scratch is too expensive and time-consuming. Programmability allows these systems to stay current with changes in algorithms, which can affect everything f... » read more

Novel Processing-in-Pixel-in-Memory (P2M) Paradigm for Edge Intelligence (USC)


A new technical paper titled "A processing-in-pixel-in-memory paradigm for resource-constrained TinyML applications" was published by researchers at University of Southern California (USC). According to the paper, "we propose a novel Processing-in-Pixel-in-memory (P2M) paradigm, that customizes the pixel array by adding support for analog multi-channel, multi-bit convolution, batch normaliza... » read more
