HW Accelerator Architecture for MI Computation With Low Latency, Energy Efficient (MIT)


A new technical paper titled "Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time" was published by researchers at MIT. Find the technical paper here. "In this paper, we introduce a new hardware accelerator architecture for MI computation that features a low-latency, energy-efficient MI compute core and an optimized memory subsystem that provides sufficie... » read more

Using Sparseloop in Hardware Accelerator Design Flows (MIT)


A technical paper titled "Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling" was published by researchers at MIT and NVIDIA.  The paper won "Distinguished Artifact Award" at the MICRO 2022 conference. Find the technical paper here.  Published 2022.  Project website is here and github here. Abstract: "In recent years, many accelerators have been proposed to effici... » read more

Convolutional Neural Networks: Co-Design of Hardware Architecture and Compression Algorithm


Researchers at Soongsil University (Korea) published "A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration." Abstract: "Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry. The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction... » read more

Gemmini: Open-source, Full-Stack DNN Accelerator Generator (DAC Best Paper)


This technical paper titled "Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration" was published jointly by researchers at UC Berkeley and a co-author from MIT.  The research was partially funded by DARPA and won DAC 2021 Best Paper. The paper presents Gemmini, "an open-source, full-stack DNN accelerator generator for DNN workloads, enabling end-to-e... » read more

Low Power HW Accelerator for FP16 Matrix Multiplications For Tight Integration Within RISC-V Cores


This new technical paper titled "RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs" was published by researchers at University of Bologna and ETH Zurich. According to their abstract: "One of the key stumbling stones is the need for parallel floating-point operations, which are considered unaffordable on sub-100 mW extre... » read more

CFU Playground: Significant Speedups & Design Space Exploration Between CPU & Accelerator


Technical paper titled "CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (tinyML) Acceleration on FPGAs," from Google, Purdue University and Harvard University. Abstract "We present CFU Playground, a full-stack open-source framework that enables rapid and iterative design of machine learning (ML) accelerators for embedded ML systems. Our toolchain tightly integr... » read more

Customizable FPGA-Based Hardware Accelerator for Standard Convolution Processes Empowered with Quantization Applied to LiDAR Data


Abstract "In recent years there has been an increase in the number of research and developments in deep learning solutions for object detection applied to driverless vehicles. This application benefited from the growing trend felt in innovative perception solutions, such as LiDAR sensors. Currently, this is the preferred device to accomplish those tasks in autonomous vehicles. There is a bro... » read more

Reservoir Computing based on Mutually Injected Phase Modulated Semiconductor Lasers as a Monolithic Integrated Hardware Accelerator


Abstract: "In this paper we propose and numerically study a neuromorphic computing scheme that applies delay-based reservoir computing in a laser system consisting of two mutually coupled phase modulated lasers. The scheme can be monolithic integrated in a straightforward manner and alleviates the need for external optical injection, as the data can be directly applied on the on-chip phase m... » read more

Newer posts →