Accelerator Architecture For In-Memory Computation of CNN Inferences Using Racetrack Memory


A new technical paper titled "Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems" was published by researchers at National University of Singapore, A*STAR, Chinese Academy of Sciences, and Hong Kong University of Science and Technology. Abstract "Deep neural networks generate and process large volumes of data, posing challe...

Stacking Persistent Embedded Memories Based On Oxide Transistors Upon GPGPU Platforms (Georgia Tech)


A new technical paper titled "CMOS+X: Stacking Persistent Embedded Memories based on Oxide Transistors upon GPGPU Platforms" was published by Georgia Tech. Abstract "In contemporary general-purpose graphics processing units (GPGPUs), the continued increase in raw arithmetic throughput is constrained by the capabilities of the register file (single-cycle) and last-level cache (high bandwidth...

All-In-One Analog AI Accelerator With CMO/HfOx ReRAM Integrated Into The BEOL (IBM Research-Europe)


A new technical paper titled "All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices" was published by researchers at IBM Research-Europe. Abstract "Analog in-memory computing is an emerging paradigm designed to efficiently accelerate deep neural network workloads. Recent advancements have focused on either inference or training accelera...

Review Paper: Wafer-Scale Accelerators Versus GPUs (UC Riverside)


A new technical paper titled "Performance, efficiency, and cost analysis of wafer-scale AI accelerators vs. single-chip GPUs" was published by researchers at UC Riverside. "This review compares wafer-scale AI accelerators and single-chip GPUs, examining performance, energy efficiency, and cost in high-performance AI applications. It highlights enabling technologies like TSMC’s chip-on-wafe...

HBM Roadmap: Next-Gen High-Bandwidth Memory Architectures (KAIST’s TERALAB)


A new technical paper titled "HBM Roadmap Ver 1.7 Workshop" was published by researchers at KAIST’s TERALAB. The 371-page paper provides an overview of next-generation HBM architectures based on current technology trends, as well as many technology insights. Find the technical paper here or here. Published June 2025. Advising Professor: Prof. Joungho Kim. Fig. 1: Thermal Manag...

PCM-Based IMC Technology: Overview Of Materials, Device Physics, Design and Fabrication (IBM Research-Europe)


A new technical paper titled "Phase-Change Memory for In-Memory Computing" was published by researchers at IBM Research-Europe. "We review the current state of phase-change materials, PCM device physics, and the design and fabrication of PCM-based IMC chips. We also provide an overview of the application landscape and offer insights into future developments," states the paper. Find the te...

Open-Source RISC-V Cores: Analysis Of Scalar and Superscalar Architectures And Out-Of-Order Machines


A new technical paper titled "Ramping Up Open-Source RISC-V Cores: Assessing the Energy Efficiency of Superscalar, Out-of-Order Execution" was published by researchers at ETH Zurich, Università di Bologna and Univ. Grenoble Alpes, Inria. Abstract "Open-source RISC-V cores are increasingly demanded in domains like automotive and space, where achieving high instructions per cycle (IPC) throu...

Hardware-Oriented Analysis of Multi-Head Latent Attention (MLA) in DeepSeek-V3 (KU Leuven)


A new technical paper titled "Hardware-Centric Analysis of DeepSeek's Multi-Head Latent Attention" was published by researchers at KU Leuven. Abstract "Multi-Head Latent Attention (MLA), introduced in DeepSeek-V2, improves the efficiency of large language models by projecting query, key, and value tensors into a compact latent space. This architectural change reduces the KV-cache size and s...

V-NAND PUFs (Seoul National University, SK hynix)


A new technical paper titled "Concealable physical unclonable functions using vertical NAND flash memory" was published by researchers at Seoul National University and SK hynix. The paper proposes "a concealable PUF using V-NAND flash memory by generating PUF data through weak Gate-Induced-Drain-Leakage (GIDL) erase." Find the technical paper here. June 2025. Park, SH., Koo, RH., Yang,...

Arithmetic Intensity In Decoding: A Hardware-Efficient Perspective (Princeton University)


A new technical paper titled "Hardware-Efficient Attention for Fast Decoding" was published by researchers at Princeton University. Abstract "LLM decoding is bottlenecked for large batches and long contexts by loading the key-value (KV) cache from high-bandwidth memory, which inflates per-token latency, while the sequential nature of decoding limits parallelism. We analyze the interplay amo...
