Mixed-Precision DL Inference, Co-Designed With HW Accelerator DPU (Intel)


A new technical paper titled "StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign" was published by Intel. Abstract "In this paper, we propose StruM, a novel structured mixed-precision-based deep learning inference method, co-designed with its associated hardware accelerator (DPU), to address the escalating computational and memory demands of deep learning worklo... » read more

Processing-Using-DRAM: Attaining High-Performance Via Dynamic Precision Bit-Serial Arithmetic (ETH Zurich, et al.)


A new technical paper titled "Proteus: Achieving High-Performance Processing-Using-DRAM via Dynamic Precision Bit-Serial Arithmetic" was published by researchers at ETH Zurich, Cambridge University, Universidad de Córdoba, Univ. of Illinois Urbana-Champaign and NVIDIA Research. Abstract "Processing-using-DRAM (PUD) is a paradigm where the analog operational properties of DRAM structures ... » read more

Apple CPU Attacks: SLAP and FLOP (Georgia Tech, Ruhr University Bochum)


Two technical papers were published by researchers at Georgia Tech and Ruhr University Bochum detailing CPU side-channel attack vulnerabilities on Apple devices that could reveal confidential data. FLOP: Breaking the Apple M3 CPU via False Load Output Predictions"  Authors: Jason Kim, Jalen Chuang, Daniel Genkin and Yuval Yarom 2025. "We present FLOP, another speculative execution att... » read more

Topology And Connection Architecture Of CXL Pooling Systems (Microsoft, Columbia)


A new technical paper titled "Octopus: Scalable Low-Cost CXL Memory Pooling" was published by researchers at University of Washington, Microsoft Azure and Columbia University. Abstract "Compute Express Link (CXL) is widely-supported interconnect standard that promises to enable memory disaggregation in data centers. CXL allows for memory pooling, which can be used to create a shared memory ... » read more

SRAM With Mixed Signal Logic With Noise Immunity in 3nm Nanosheet (IBM)


A new technical paper titled "SRAM and Mixed-Signal Logic With Noise Immunity in 3nm Nano-Sheet Technology" was published by researchers at IBM T. J. Watson Research Center and IBM. Abstract "A modular 4.26Mb SRAM based on a 82Kb/block structure with mixed signal logic is fabricated, characterized, and demonstrated with full functionality in a 3nm nanosheet (NS) technology. Designed macros ... » read more

Analog Accelerator For AI/ML Training Workloads Using Stochastic Gradient Descent (Imperial College London)


A new technical paper titled "Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent" was published by researchers at Imperial College London. Abstract "The rapid proliferation of AI models, coupled with growing demand for edge deployment, necessitates the development of AI hardware that is both high-performance and energy-efficient. In this paper, w... » read more

New Class Of Memory: Managed-Retention Memory or MRM (Microsoft Research)


A new technical paper titled "Managed-Retention Memory: A New Class of Memory for the AI Era" was published by researchers at Microsoft. Abstract "AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and read band... » read more

Impact of Extremely Low Temperatures On The 5nm SRAM Array Size and Performance


A new technical paper titled "Novel Trade-offs in 5 nm FinFET SRAM Arrays at Extremely Low Temperatures" was published by researchers at University of Stuttgart, IIT Kanpur, National Yang Ming Chiao Tung University, Khalifa University, and TU Munich. Abstract "Complementary metal–oxide–semiconductor (CMOS)-based computing promises drastic improvement in performance at extremely low temp... » read more

Design-Space Analysis of M3D FPGA With BEOL Configuration Memories (Georgia Tech, UCLA)


A new technical paper titled "Monolithic 3D FPGAs Utilizing Back-End-of-Line Configuration Memories" was published by researchers at Georgia Tech and UCLA. Abstract "This work presents a novel monolithic 3D (M3D) FPGA architecture that leverages stackable back-end-of-line (BEOL) transistors to implement configuration memory and pass gates, significantly improving area, latency, and power ef... » read more

Design Space for the Device-Circuit Codesign of NVM-Based CIM Accelerators (TSMC)


A new technical paper/mini-review titled "Assessing Design Space for the Device-Circuit Codesign of Nonvolatile Memory-Based Compute-in-Memory Accelerators" was published by researchers at TSMC and National Tsing Hua University. Abstract "Unprecedented penetration of artificial intelligence (AI) algorithms has brought about rapid innovations in electronic hardware, including new memory devi... » read more

← Older posts