Chiplet-Based NPUs to Accelerate Vehicular AI Perception Workloads


A new technical paper titled "Performance Implications of Multi-Chiplet Neural Processing Units on Autonomous Driving Perception" was published by researchers at UC Irvine. Abstract "We study the application of emerging chiplet-based Neural Processing Units to accelerate vehicular AI perception workloads in constrained automotive settings. The motivation stems from how chiplets technology i... » read more

STCO for Dense Edge Architectures using 3D Integration and NVM (imec, et al.)


A new technical paper titled "System-Technology Co-Optimization for Dense Edge Architectures using 3D Integration and Non-Volatile Memory" was published by researchers at imec, INESC-ID, Université Libre de Bruxelles, et al. "In this paper, we present a system-technology co-optimization (STCO) framework that interfaces with workload-driven system scaling challenges and physical design-enab... » read more

Gate-All-Around: TCAD and DTCO Approach To Evaluate Power and Performance (imec, et al.)


A new technical paper titled "Exploring GAA-Nanosheet, Forksheet and GAA-Forksheet Architectures: a TCAD-DTCO Study at 90 nm & 120 nm Cell Height" was published by imec, Huawei Technologies and Global TCAD Solutions. Abstract "This study presents a Technology Computer Aided Design (TCAD) and comprehensive Design-Technology Co-Optimization (DTCO) approach to evaluate and enhance power an... » read more

Pooling CPU Memory for LLM Inference With Lower Latency and Higher Throughput (UC Berkeley)


A new technical paper titled "Pie: Pooling CPU Memory for LLM Inference" was published by researchers at UC Berkeley. Abstract "The rapid growth of LLMs has revolutionized natural language processing and AI analysis, but their increasing size and memory demands present significant challenges. A common solution is to spill over to CPU memory; however, traditional GPU-CPU memory swapping ofte... » read more

Backpropagation Algorithm On Neuromorphic Spiking HW (U. of Zurich, ETH Zurich, LANL)


A new technical paper titled "The backpropagation algorithm implemented on spiking neuromorphic hardware" was published by University of Zurich, ETH Zurich, Los Alamos National Laboratory, Royal Institution, London, et al. "This study presents a neuromorphic, spiking backpropagation algorithm based on synfire-gated dynamical information coordination and processing implemented on Intel’s Lo... » read more

Sustainable Hardware Specialization Through Reconfigurable Logic (NUS, Ghent Univ.)


A new technical paper titled "Sustainable Hardware Specialization" was published by researchers at National University of Singapore and Ghent University. "We explore sustainable hardware specialization through reconfigurable logic that has the potential to drastically reduce the environmental footprint compared to a sea of accelerators by amortizing its embodied footprint across multiple a... » read more

Distributed Shared Memory That Enlarges Effective Memory Capacity Through Intelligent Tiered DRAM and Storage Management (IIT)


A new technical paper titled "MegaMmap: Blurring the Boundary Between Memory and Storage for Data-Intensive Workloads" was published by researchers at Illinois Institute of Technology. "In this work, we propose MegaMmap: a software distributed shared memory (DSM) that enlarges effective memory capacity through intelligent tiered DRAM and storage management. MegaMmap provides workload-aware d... » read more

Dedicated 3D Accelerator Specifically Designed For Emerging Spiking Transformers


A new technical paper titled "Spiking Transformer Hardware Accelerators in 3D Integration" was published by researchers at UC Santa Barbara, Georgia Tech and Burapha University. "Recognizing the current lack of dedicated hardware support for spiking transformers, this paper presents the first work on 3D spiking transformer hardware architecture and design methodology. We present an architect... » read more

Non-Stateful Logic Gates in ReRAM (RWTH Aachen, FZJ)


A new technical paper titled "Experimental Verification and Evaluation of Non-Stateful Logic Gates in Resistive RAM" was published by researchers at RWTH Aachen University and Forschungszentrum Jülich GmbH (FZJ). Abstract "Resistively switching, non-volatile memory devices facilitate new logic paradigms by combining storage and processing elements. Several non-stateful concepts such as Sco... » read more

CXL-Based Heterogeneous Systems: How to Optimize and Future Directions (UCSD, Samsung, SK hynix)


A new technical paper titled "The Hitchhiker’s Guide to Programming and Optimizing CXL-Based Heterogeneous Systems" was published by researchers at UC San Diego, Samsung, and SK hynix. Abstract "We present a thorough analysis of the use of CXL-based heterogeneous systems. We built a cluster of server systems that combines different vendor's CPUs and various types of CXL devices. We further ... » read more
