Machine Learning-Based IR Drop Prediction Approach


A new technical paper titled "Estimating Voltage Drop: Models, Features and Data Representation Towards a Neural Surrogate" was published by researchers at KTH Royal Institute of Technology and Ericsson Research. ABSTRACT "Accurate estimation of voltage drop (IR drop) in modern Application-Specific Integrated Circuits (ASICs) is highly time and resource demanding, due to the growing complex... » read more

Design Space For The Device-Circuit Codesign Of NVM-Based CIM Accelerators (TSMC)


A new technical paper titled "Assessing Design Space for the Device-Circuit Codesign of Nonvolatile Memory-Based Compute-in-Memory Accelerators" was published by TSMC researchers. Abstract "Unprecedented penetration of artificial intelligence (AI) algorithms has brought about rapid innovations in electronic hardware, including new memory devices. Nonvolatile memory (NVM) devices offer one s... » read more

Cradle-To-Grave Analysis Of The Carbon Footprint Of AI Hardware (Google)


A new technical paper titled "Life-Cycle Emissions of AI Hardware: A Cradle-To-Grave Approach and Generational Trends" was published by researchers at Google. Abstract "Specialized hardware accelerators aid the rapid advancement of artificial intelligence (AI), and their efficiency impacts AI's environmental sustainability. This study presents the first publication of a comprehensive AI acc... » read more

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)


A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and University of Washington. Abstract "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention... » read more

Wafer-Scale Computing For LLMs (U. of Edinburgh, Microsoft)


A new technical paper titled "WaferLLM: A Wafer-Scale LLM Inference System" was published by researchers at University of Edinburgh and Microsoft Research. Abstract "Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh-based architecture with large distributed on-chip memory (tens of GB in total) and ultr... » read more

Potential of Wireless Interconnects For Improving Performance And Flexibility Of Multi-Chip AI Accelerators


A new technical paper titled "Exploring the Potential of Wireless-enabled Multi-Chip AI Accelerators" was published by researchers at Universitat Politecnica de Catalunya. Abstract "The insatiable appetite of Artificial Intelligence (AI) workloads for computing power is pushing the industry to develop faster and more efficient accelerators. The rigidity of custom hardware, however, conflict... » read more

Power Delivery Challenges in 3D HI CIM Architectures for AI Accelerators (Georgia Tech)


A new technical paper titled "Co-Optimization of Power Delivery Network Design for 3D Heterogeneous Integration of RRAM-based Compute In-Memory Accelerators" was published by researchers at Georgia Tech. Abstract: "3D heterogeneous integration (3D HI) offers promising solutions for incorporating substantial embedded memory into cutting-edge analog compute-in-memory (CIM) AI accelerators, ad... » read more

Mixed-Precision DL Inference, Co-Designed With HW Accelerator DPU (Intel)


A new technical paper titled "StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign" was published by Intel. Abstract "In this paper, we propose StruM, a novel structured mixed-precision-based deep learning inference method, co-designed with its associated hardware accelerator (DPU), to address the escalating computational and memory demands of deep learning worklo... » read more

DeepSeek: Improving Language Model Reasoning Capabilities Using Pure Reinforcement Learning


A new technical paper titled "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" was published by DeepSeek. Abstract: "We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates rema... » read more

Analog Accelerator For AI/ML Training Workloads Using Stochastic Gradient Descent (Imperial College London)


A new technical paper titled "Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent" was published by researchers at Imperial College London. Abstract "The rapid proliferation of AI models, coupled with growing demand for edge deployment, necessitates the development of AI hardware that is both high-performance and energy-efficient. In this paper, w... » read more
