Cradle-To-Grave Analysis Of The Carbon Footprint of AI Hardware (Google)


A new technical paper titled "Life-Cycle Emissions of AI Hardware: A Cradle-To-Grave Approach and Generational Trends" was published by researchers at Google. Abstract "Specialized hardware accelerators aid the rapid advancement of artificial intelligence (AI), and their efficiency impacts AI's environmental sustainability. This study presents the first publication of a comprehensive AI acc... » read more

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)


A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and University of Washington. Abstract "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention... » read more

Modeling and Simulation of NVM Technologies: Tutorial (TU Dortmund, TU Dresden, KIT, FAU)


A new technical paper titled "Modeling and Simulating Emerging Memory Technologies: A Tutorial" was published by researchers at TU Dortmund, TU Dresden, Karlsruhe Institute of Technology (KIT) and FAU Erlangen-Nürnberg. "This tutorial presents a simulation toolchain through four detailed case studies, showcasing its applicability to various domains of system design, including hybrid main-mem... » read more

Uncore Frequency Scaling For Energy Optimization In Heterogeneous Systems (UIC, Argonne)


A new technical paper titled "Exploring Uncore Frequency Scaling for Heterogeneous Computing" was published by researchers at University of Illinois Chicago and Argonne National Laboratory. Abstract "High-performance computing (HPC) systems are essential for scientific discovery and engineering innovation. However, their growing power demands pose significant challenges, particularly as sys... » read more

Maximizing Energy Efficiency in Subthreshold RISC-V Cores (NTNU)


A new technical paper titled "Optimizing Energy Efficiency in Subthreshold RISC-V Cores" was published by researchers at Norwegian University of Science and Technology (NTNU). Abstract "Our goal in this paper is to understand how to maximize energy efficiency when designing standard-ISA processor cores for subthreshold operation. We hence develop a custom subthreshold library and use it to ... » read more

Wafer-Scale Computing for LLMs (U. of Edinburgh, Microsoft)


A new technical paper titled "WaferLLM: A Wafer-Scale LLM Inference System" was published by researchers at University of Edinburgh and Microsoft Research. Abstract "Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh-based architecture with large distributed on-chip memory (tens of GB in total) and ultr... » read more

Power Delivery Challenges in 3D HI CIM Architectures for AI Accelerators (Georgia Tech)


A new technical paper titled "Co-Optimization of Power Delivery Network Design for 3D Heterogeneous Integration of RRAM-based Compute In-Memory Accelerators" was published by researchers at Georgia Tech. Abstract: "3D heterogeneous integration (3D HI) offers promising solutions for incorporating substantial embedded memory into cutting-edge analog compute-in-memory (CIM) AI accelerators, ad... » read more

Transistor Sizing Approach for OTA Circuits Using a Transformer Architecture


A new technical paper titled "Accelerating OTA Circuit Design: Transistor Sizing Based on a Transformer Model and Precomputed Lookup Tables" was published by University of Minnesota and Cadence. Abstract: "Device sizing is crucial for meeting performance specifications in operational transconductance amplifiers (OTAs), and this work proposes an automated sizing framework based on a transform... » read more

Optimization of the Inter-Chiplet Interconnect And The Chiplet Placement (ETH Zurich, U. of Bologna)


A new technical paper titled "PlaceIT: Placement-based Inter-Chiplet Interconnect Topologies" was published by researchers at ETH Zurich and University of Bologna. Abstract "2.5D integration technology is gaining traction as it copes with the exponentially growing design cost of modern integrated circuits. A crucial part of a 2.5D stacked chip is a low-latency and high-throughput inter-ch... » read more

Mixed-Precision DL Inference, Co-Designed With HW Accelerator DPU (Intel)


A new technical paper titled "StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign" was published by Intel. Abstract "In this paper, we propose StruM, a novel structured mixed-precision-based deep learning inference method, co-designed with its associated hardware accelerator (DPU), to address the escalating computational and memory demands of deep learning worklo... » read more
