SW-HW Framework: Graphic Rendering on RISC-V GPUs (Georgia Tech, Cal Poly)


A new technical paper titled "Skybox: Open-Source Graphic Rendering on Programmable RISC-V GPUs" was published by researchers at Georgia Tech and California Polytechnic State University, San Luis Obispo. Abstract excerpt: "In this work, we present Skybox, a full-stack open-source GPU architecture with integrated software, compiler, hardware, and simulation environment, that enables end-to-end G... » read more

Hardware Accelerator For Fully Homomorphic Encryption


A technical paper titled "CraterLake: A Hardware Accelerator for Efficient Unbounded Computation on Encrypted Data" was published by researchers at MIT, IBM TJ Watson, SRI International, and the University of Michigan. "We present CraterLake, the first FHE accelerator that enables FHE programs of unbounded size (i.e., unbounded multiplicative depth). Such computations require very large cipherte... » read more

RISC-V Decoupled Vector Processing Unit (VPU) For HPC


A technical paper titled "Vitruvius+: An Area-Efficient RISC-V Decoupled Vector Coprocessor for High Performance Computing Applications" was published by researchers at Barcelona Supercomputing Center, Spain. "The maturity level of RISC-V and the availability of domain-specific instruction set extensions, like vector processing, make RISC-V a good candidate for supporting the integration of ... » read more

Neural Architecture & Hardware Accelerator Co-Design Framework (Princeton, Stanford)


A new technical paper titled "CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework" was published by researchers at Princeton University and Stanford University. "Recently, automated co-design of machine learning (ML) models and accelerator architectures has attracted significant attention from both the industry and academia. However, most co-design frameworks either... » read more

Low-Latency, Energy-Efficient HW Accelerator Architecture for MI Computation (MIT)


A new technical paper titled "Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time" was published by researchers at MIT. Find the technical paper here. "In this paper, we introduce a new hardware accelerator architecture for MI computation that features a low-latency, energy-efficient MI compute core and an optimized memory subsystem that provides sufficie... » read more

Using Sparseloop in Hardware Accelerator Design Flows (MIT)


A technical paper titled "Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling" was published by researchers at MIT and NVIDIA. The paper won the Distinguished Artifact Award at the MICRO 2022 conference. Find the technical paper here. Published 2022. Project website is here and GitHub here. Abstract: "In recent years, many accelerators have been proposed to effici... » read more

Convolutional Neural Networks: Co-Design of Hardware Architecture and Compression Algorithm


Researchers at Soongsil University (Korea) published "A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration." Abstract: "Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry. The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction... » read more

Gemmini: Open-source, Full-Stack DNN Accelerator Generator (DAC Best Paper)


A technical paper titled "Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration" was published jointly by researchers at UC Berkeley and a co-author from MIT. The research was partially funded by DARPA and won the DAC 2021 Best Paper award. The paper presents Gemmini, "an open-source, full-stack DNN accelerator generator for DNN workloads, enabling end-to-e... » read more

Low-Power HW Accelerator for FP16 Matrix Multiplications for Tight Integration Within RISC-V Cores


A new technical paper titled "RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs" was published by researchers at the University of Bologna and ETH Zurich. According to their abstract: "One of the key stumbling stones is the need for parallel floating-point operations, which are considered unaffordable on sub-100 mW extre... » read more

CFU Playground: Significant Speedups & Design Space Exploration Between CPU & Accelerator


A technical paper titled "CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (tinyML) Acceleration on FPGAs" was published by researchers at Google, Purdue University, and Harvard University. Abstract: "We present CFU Playground, a full-stack open-source framework that enables rapid and iterative design of machine learning (ML) accelerators for embedded ML systems. Our toolchain tightly integr... » read more
