FIR And Median Filter Accelerators In CodAL

5G is the latest generation of cellular networks using the 3rd Generation Partnership Project (3GPP) 5G New Radio air interface. Unlike previous generations of network (2G, 3G & 4G) which had a one-size-fits-all approach, 5G aims to address a wide range of very different applications. To flexibly support diverse quality of service requirements, network slicing is introduced to enable mul...

Hyperscale HW Optimized Neural Architecture Search (Google)

A new technical paper titled "Hyperscale Hardware Optimized Neural Architecture Search" was published by researchers at Google, Apple, and Waymo. "This paper introduces the first Hyperscale Hardware Optimized Neural Architecture Search (H2O-NAS) to automatically design accurate and performant machine learning models tailored to the underlying hardware architecture. H2O-NAS consists of three ...

FPGAs: Automated Framework For Architecture-Space Exploration of Approximate Accelerators

A technical paper titled "autoXFPGAs: An End-to-End Automated Exploration Framework for Approximate Accelerators in FPGA-Based Systems" was published (preprint) by researchers at TU Wien, Brno University of Technology, and NYUAD. Abstract "Generation and exploration of approximate circuits and accelerators has been a prominent research domain exploring energy-efficiency and/or performance...

Algorithm HW Framework That Minimizes Accuracy Degradation, Data Movement, And Energy Consumption Of DNN Accelerators (Georgia Tech)

This new research paper titled "An Algorithm-Hardware Co-design Framework to Overcome Imperfections of Mixed-signal DNN Accelerators" was published by researchers at Georgia Tech. According to the paper's abstract, "In recent years, processing in memory (PIM) based mixed-signal designs have been proposed as energy- and area-efficient solutions with ultra high throughput to accelerate DNN com...

A Framework For Ultra Low-Power Hardware Accelerators Using NNs For Embedded Time Series Classification

In embedded applications that use neural networks (NNs) for classification tasks, it is important to not only minimize the power consumption of the NN calculation, but of the whole system. Optimization approaches for individual parts exist, such as quantization of the NN or analog calculation of arithmetic operations. However, there is no holistic approach for a complete embedded system design ...

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

Abstract: "Recent work demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog domain in-situ matrix-vector multiplication—the intensive and key computation in deep neural networks (DNNs). One key problem is the weights that are signed values. However, in a ReRAM crossbar, weights are stored as conductance of...

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology

Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ...

SARA: Scaling a Reconfigurable Dataflow Accelerator

Yaqi Zhang, Nathan Zhang, Tian Zhao, Matt Vilim, Muhammad Shahbaz, Kunle Olukotun (Stanford) Abstract: "The need for speed in modern data-intensive workloads and the rise of “dark silicon” in the semiconductor industry are pushing for larger, faster, and more energy and area-efficient architectures, such as Reconfigurable Dataflow Accelerators (RDAs). Nevertheless, challenges remain in d...

Challenges In Developing A New Inferencing Chip

Cheng Wang, co-founder and senior vice president of software and engineering at Flex Logix, sat down with Semiconductor Engineering to explain the process of bringing an inferencing accelerator chip to market, from bring-up, programming and partitioning to tradeoffs involving speed and customization. SE: Edge inferencing chips are just starting to come to market. What challenges di...

NN-Baton: DNN Workload Orchestration & Chiplet Granularity Exploration for Multichip Accelerators

Abstract: "The revolution of machine learning poses an unprecedented demand for computation resources, urging more transistors on a single monolithic chip, which is not sustainable in the Post-Moore era. The multichip integration with small functional dies, called chiplets, can reduce the manufacturing cost, improve the fabrication yield, and achieve die-level reuse for different system scales...
