GPUs: Bandit Based Framework To Dynamically Reduce Energy Consumption


A new technical paper titled "Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach" was published by researchers at Illinois Institute of Technology, Argonne National Lab and Emory University. Abstract "Energy consumption has become a critical design metric and a limiting factor in the development of future computing architectures, from small wearable devices to large-scale lea... » read more

Thermal Modeling For 2.5D And 3D Integrated Chiplets


A new technical paper titled "MFIT: Multi-Fidelity Thermal Modeling for 2.5D and 3D Multi-Chiplet Architectures" was published by researchers at University of Wisconsin–Madison, Washington State University, and University of Ulsan. Abstract: "Rapidly evolving artificial intelligence and machine learning applications require ever-increasing computational capabilities, while monolithic 2D d... » read more

PIO on Current HW Outperforms DMA Over a Range of Payload Sizes In A Number of Different Applications (ETH Zurich)


A new technical paper titled "Rethinking Programmed I/O for Fast Devices, Cheap Cores, and Coherent Interconnects" was published by researchers at ETH Zurich. Abstract: "Conventional wisdom holds that an efficient interface between an OS running on a CPU and a high-bandwidth I/O device should be based on Direct Memory Access (DMA), descriptor rings, and interrupts: DMA offloads transfers fr... » read more

Survey: HW SW Co-Design Approaches Tailored to LLMs


A new technical paper titled "A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models" was published by researchers at Duke University and Johns Hopkins University. Abstract "The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence, demonstrating remarkable capabilities in natural language proce... » read more

Strain Engineering in 2D FETs (UCSB)


A new technical paper titled "Strain engineering in 2D FETs: Physics, status, and prospects" was published by researchers at UC Santa Barbara. "In this work, we explore the physics and evaluate the merits of strain engineering in two-dimensional van der Waals semiconductor-based FETs (field-effect-transistors) using DFT (density functional theory) to determine the modulation of the channel m... » read more

TFETs: Design and Operation, Including Material Selection and Simulation Methods


A new technical paper titled "Multiscale Simulation and Machine Learning Facilitated Design of Two-Dimensional Nanomaterials-Based Tunnel Field-Effect Transistors: A Review" was published by researchers at University of Chicago and Argonne National Lab. Abstract "Traditional transistors based on complementary metal-oxide-semiconductor (CMOS) and metal-oxide-semiconductor field-effect transi... » read more

Multi-Node, Virtualized Neuromorphic Architecture


A new technical paper titled "NeuroVM: Dynamic Neuromorphic Hardware Virtualization" was published by researchers at Stanford University, UT Austin and Temsa Research & Development Center. Abstract "This paper introduces a novel approach in neuromorphic computing, integrating heterogeneous hardware nodes into a unified, massively parallel architecture. Our system transcends traditional ... » read more

Review Paper: Challenges Required To Bring the Energy Consumption Down in Microelectronics (Rice, UC Berkeley, Georgia Tech, Et al.)


A new review article titled "Roadmap on low-power electronics" by researchers at Rice University, UC Berkeley, Georgia Tech, TSMC, Intel, Harvard, et al. This roadmap to energy efficient electronics written by numerous collaborators covers materials, modeling, architectures, manufacturing, metrology and more. Find the technical paper here. September 2024. Ramamoorthy Ramesh, Sayeef Sal... » read more

Fine-Grained Functional Partitioning For Low Level SRAM Cache in 3D-IC designs (imec)


A new technical paper titled "Towards Fine-grained Partitioning of Low-level SRAM Caches for Emerging 3D-IC Designs" was published by researchers at imec. "We propose a partitioning of low-level (faster access) caches in 3D using an Array Under CMOS (AuC) technology paradigm. Our study focuses on partitioning and optimization of SRAM bit-cells and peripheral circuits, enabling heterogeneous ... » read more

Novel NorthPole Architecture Enables Low-Latency, High-Energy-Efficiency LLM inference (IBM Research)


A new technical paper titled "Breakthrough low-latency, high-energy-efficiency LLM inference performance using NorthPole" was published by researchers at IBM Research. At the IEEE High Performance Extreme Computing (HPEC) Virtual Conference in September 2024, new performance results for their AIU NorthPole AI inference accelerator chip were presented on a 3-billion-parameter Granite LLM. ... » read more

← Older posts Newer posts →