Scalable And Energy-Efficient Solution for Hardware-Based ANNs (KAUST, NUS)


A new technical paper titled "Synaptic and neural behaviours in a standard silicon transistor" was published by researchers at KAUST and National University of Singapore. Abstract "Hardware implementations of artificial neural networks (ANNs)—the most advanced of which are made of millions of electronic neurons interconnected by hundreds of millions of electronic synapses—have achieved ... » read more

GPU Analysis Identifying Performance Bottlenecks That Cause Throughput Plateaus In Large-Batch Inference


A new technical paper titled "Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference" was published by researchers at Barcelona Supercomputing Center, Universitat Politecnica de Catalunya, and IBM Research. Abstract "Large language models have been widely adopted across different tasks, but their auto-regressive generation nature often leads to inefficient resource util... » read more

HW Implementation Of An ONN Coupled By A ReRAM Crossbar Array (IBM, TU Eindhoven)


A new technical paper titled "Hardware Implementation of Ring Oscillator Networks Coupled by BEOL Integrated ReRAM for Associative Memory Tasks" was published by researchers at IBM Research Europe and Eindhoven University of Technology. Abstract "We demonstrate the first hardware implementation of an oscillatory neural network (ONN) utilizing resistive memory (ReRAM) for coupling elements. ... » read more

Energy-Efficient Scalable Silicon Photonic Platform For AI Accelerator HW


A new technical paper titled "Large-Scale Integrated Photonic Device Platform for Energy-Efficient AI/ML Accelerators" was published by researchers at HP Labs, IIT Madras, Microsoft Research and University of Michigan. Abstract "The convergence of deep learning and Big Data has spurred significant interest in developing novel hardware that can run large artificial intelligence (AI) workload... » read more

The Optical Implementation of Backpropagation (Oxford, Lumai)


A technical paper titled "Training neural networks with end-to-end optical backpropagation" was published by researchers at University of Oxford and Lumai Ltd. Abstract "Optics is an exciting route for the next generation of computing hardware for machine learning, promising several orders of magnitude enhancement in both computational speed and energy efficiency. However, reaching the full... » read more

Machine Learning-Based IR Drop Prediction Approach


A new technical paper titled "Estimating Voltage Drop: Models, Features and Data Representation Towards a Neural Surrogate" was published by researchers at KTH Royal Institute of Technology and Ericsson Research. ABSTRACT "Accurate estimation of voltage drop (IR drop) in modern Application-Specific Integrated Circuits (ASICs) is highly time and resource demanding, due to the growing complex... » read more

Design Space For The Device-Circuit Codesign Of NVM-Based CIM Accelerators (TSMC)


A new technical paper titled "Assessing Design Space for the Device-Circuit Codesign of Nonvolatile Memory-Based Compute-in-Memory Accelerators" was published by TSMC researchers. Abstract "Unprecedented penetration of artificial intelligence (AI) algorithms has brought about rapid innovations in electronic hardware, including new memory devices. Nonvolatile memory (NVM) devices offer one s... » read more

Cradle-To-Grave Analysis Of The Carbon Footprint Of AI Hardware (Google)


A new technical paper titled "Life-Cycle Emissions of AI Hardware: A Cradle-To-Grave Approach and Generational Trends" was published by researchers at Google. Abstract "Specialized hardware accelerators aid the rapid advancement of artificial intelligence (AI), and their efficiency impacts AI's environmental sustainability. This study presents the first publication of a comprehensive AI acc... » read more

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)


A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and University of Washington. Abstract "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention... » read more

Wafer-Scale Computing for LLMs (U. of Edinburgh, Microsoft)


A new technical paper titled "WaferLLM: A Wafer-Scale LLM Inference System" was published by researchers at University of Edinburgh and Microsoft Research. Abstract "Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh-based architecture with large distributed on-chip memory (tens of GB in total) and ultr... » read more
