Arithmetic Intensity In Decoding: A Hardware-Efficient Perspective (Princeton University)


A new technical paper titled "Hardware-Efficient Attention for Fast Decoding" was published by researchers at Princeton University. Abstract "LLM decoding is bottlenecked for large batches and long contexts by loading the key-value (KV) cache from high-bandwidth memory, which inflates per-token latency, while the sequential nature of decoding limits parallelism. We analyze the interplay amo...

Roadmap for AI HW Development, With The Role of Photonic Chips In Supporting Future LLMs (CUHK, NUS, UIUC, Berkeley)


A new technical paper titled "What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips" was published by researchers at The Chinese University of Hong Kong, National University of Singapore, University of Illinois Urbana-Champaign and UC Berkeley. Abstract "Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, t...

Offline RL Framework That Dynamically Controls The GPU Clock And Server Fan Speed To Optimize Power Consumption And Computation Time (KAIST)


A new technical paper titled "Power Consumption Optimization of GPU Server With Offline Reinforcement Learning" was published by researchers at Korea Advanced Institute of Science and Technology (KAIST) and KT Research and Development Center. "Optimizing GPU server power consumption is complex due to the interdependence of various components. Conventional methods often involve trade-offs: in...

Overview Of 103 Research Papers On Automatic SEM Image Analysis Algorithms For Semiconductor Defect Inspection (KU Leuven, Imec)


A new technical paper titled "Scanning electron microscopy-based automatic defect inspection for semiconductor manufacturing: a systematic review" was published by researchers at KU Leuven and imec. "We identified, categorized, and discussed automatic defect inspection algorithms that analyze scanning electron microscopy (SEM) images for semiconductor manufacturing (SM). This is a topic of c...

Energy-Aware DL: The Interplay Between NN Efficiency And Hardware Constraints (Imperial College London, Cambridge)


A new technical paper titled "Energy-Aware Deep Learning on Resource-Constrained Hardware" was published by researchers at Imperial College London and University of Cambridge. Abstract "The use of deep learning (DL) on Internet of Things (IoT) and mobile devices offers numerous advantages over cloud-based processing. However, such devices face substantial energy constraints to prolong batte...

Cache Side-Channel Attacks On LLMs (MITRE, WPI)


A new technical paper titled "Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models" was published by researchers at MITRE and Worcester Polytechnic Institute. Abstract "Side-channel attacks on shared hardware resources increasingly threaten confidentiality, especially with the rise of Large Language Models (LLMs). In this work, we introduce Spill The...

Customizing An LLM For VHDL Design of High-Performance MPUs (IBM)


A new technical paper titled "Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors" was published by researchers at IBM. Abstract "The use of Large Language Models (LLMs) in hardware design has taken off in recent years, principally through its incorporation in tools that increase chip designer productivity. There has been considerable discussion about the ...

Embedded GPU: An Open-Source And Configurable RISC-V GPU Platform for TinyAI Devices (EPFL)


A new technical paper titled "e-GPU: An Open-Source and Configurable RISC-V Graphic Processing Unit for TinyAI Applications" was published by researchers at EPFL. Abstract "Graphics processing units (GPUs) excel at parallel processing, but remain largely unexplored in ultra-low-power edge devices (TinyAI) due to their power and area limitations, as well as the lack of suitable programming...

Safety Architecture and Approaches for Automotive SW And HW Including ASIL D And AI/ML (Mercedes-Benz, U. Of Washington)


A new technical paper titled "Key Safety Design Overview in AI-driven Autonomous Vehicles" was published by researchers at Mercedes-Benz Research and Development North America and University of Washington. Abstract "With the increasing presence of autonomous SAE level 3 and level 4, which incorporate artificial intelligence software, along with the complex technical challenges they present...

Main Applications And Corresponding Requirements For IMC With RRAM Devices


A new technical paper titled "Resistive Switching Random-Access Memory (RRAM): Applications and Requirements for Memory and Computing" was published by researchers at Politecnico di Milano, IUNET and Hewlett-Packard Labs. Abstract "In the information age, novel hardware solutions are urgently needed to efficiently store and process increasing amounts of data. In this scenario, memory device...
