GPU Power Prediction Tool for AI Workloads (MIT, IBM)


A new technical paper, "EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads," was published by researchers at MIT and IBM Research. Abstract: "As AI workloads drive increases in datacenter power consumption, accurate GPU power estimation is critical for proactive power management. However, existing power models face a scalability bottleneck not in the modeling tec...

KAN Acceleration: Algorithm Hardware Co-Design Approach (Georgia Tech, National Tsing Hua Univ., TSMC)


A new technical paper titled "Hardware Acceleration of Kolmogorov-Arnold Network (KAN) in Large-Scale Systems" was published by researchers at Georgia Institute of Technology, National Tsing Hua University and TSMC. Abstract: "Recent developments have introduced Kolmogorov-Arnold Networks (KAN), an innovative architectural paradigm capable of replicating conventional deep neural network (DNN...

Machine Intelligence on Wireless Edge Networks with RF Analog Architecture (MIT, Duke)


A new technical paper titled "Machine Intelligence on Wireless Edge Networks" was published by researchers at MIT and Duke University. Abstract: "Deep neural network (DNN) inference on power-constrained edge devices is bottlenecked by costly weight storage and data movement. We introduce MIWEN, a radio-frequency (RF) analog architecture that "disaggregates" memory by streaming weights wirele...

DL Compiler for Efficiently Utilizing Inter-Core Connected AI Chips (UIUC, Microsoft)


A new technical paper titled "Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor" was published by researchers at UIUC and Microsoft Research. Abstract: "As AI chips incorporate numerous parallelized cores to scale deep learning (DL) computing, inter-core communication is enabled recently by employing high-bandwidth and low-latency interconnect links on th...

Insights From The AI Hardware & Edge AI Summit


By Ashish Darbari, Fabiana Muto, and Nicky Khodadad. In today's rapidly changing technology landscape, artificial intelligence (AI) is more than a buzzword. It is transforming businesses and societies. From advances in scalable AI methodology to urgent calls for sustainability, the AI Hardware & Edge AI Summit, recently held in London, sparked vibrant discussions that will determine the fu...

On-Device Speaker Identification For Digital Television (DTV)


In recent years, the way we interact with our TVs has changed. Multiple button presses to navigate an on-screen keyboard have been replaced with direct interaction through our voices. While this has resulted in significant improvements to the Digital Television (DTV) user experience, more can be done to provide immersive and engaging experiences. Imagine you say, “recommend me a film” or...

Improving ML-Based Device Modeling Using Variational Autoencoder Techniques


A technical paper titled “Improving Semiconductor Device Modeling for Electronic Design Automation by Machine Learning Techniques” was published by researchers at Commonwealth Scientific and Industrial Research Organisation (CSIRO), Peking University, National University of Singapore, and University of New South Wales. Abstract: "The semiconductors industry benefits greatly from the integ...

Photonic-Electronic SmartNIC With Fast and Energy-Efficient Photonic Computing Cores (MIT)


A technical paper titled “Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference” was published by researchers at Massachusetts Institute of Technology (MIT). Abstract: "The massive growth of machine learning-based applications and the end of Moore's law have created a pressing need to redesign computing platforms. We propose Lightning, the first ...

Low-Power Heterogeneous Compute Cluster For TinyML DNN Inference And On-Chip Training


A new technical paper titled "DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training" was published by researchers at University of Bologna and ETH Zurich. Abstract: "On-chip deep neural network (DNN) inference and training at the Extreme-Edge (TinyML) impose strict latency, throughput, accuracy, and flexibility requirements. Heterogeneous clus...

ISA and Microarchitecture Extensions Over Dense Matrix Engines to Support Flexible Structured Sparsity for CPUs (Georgia Tech, Intel Labs)


A technical paper titled "VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs" was published (preprint) by researchers at Georgia Tech and Intel Labs. Abstract: "Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible v...
