GPU Microarchitecture Integrating Dedicated Matrix Units At The Cluster Level (UC Berkeley)


A new technical paper titled "Virgo: Cluster-level Matrix Unit Integration in GPUs for Scalability and Energy Efficiency" was published by UC Berkeley. Abstract "Modern GPUs incorporate specialized matrix units such as Tensor Cores to accelerate GEMM operations central to deep learning workloads. However, existing matrix unit designs are tightly coupled to the SIMT core, limiting the size a...

Benefits Of The Ultra-Low Leakage Currents from IGZO TFTs For Neuromorphic Applications


A new technical paper titled "A tunable multi-timescale Indium-Gallium-Zinc-Oxide thin-film transistor neuron towards hybrid solutions for spiking neuromorphic applications" was published by researchers at imec, CSIC Universidad de Sevilla, and Sungkyunkwan University. Abstract "Spiking neural network algorithms require fine-tuned neuromorphic hardware to increase their effectiveness. Such ...

LLMs In The High-Level Synthesis Design Flow


A new technical paper titled "Are LLMs Any Good for High-Level Synthesis?" was published by researchers at the University of Arizona. Abstract "The increasing complexity and demand for faster, energy-efficient hardware designs necessitate innovative High-Level Synthesis (HLS) methodologies. This paper explores the potential of Large Language Models (LLMs) to streamline or replace the HLS proces...

Characterizing Three Supercomputers: Multi-GPU Interconnect Performance


A new technical paper titled "Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects" was published by researchers at Sapienza University of Rome, University of Trento, Vrije Universiteit Amsterdam, ETH Zurich, CINECA, University of Antwerp, IBM Research Europe, HPE Cray, and NVIDIA. Abstract "Multi-GPU nodes are increasingly common in the rapidly evolving landscape...

The Impact Of Simulation On The Carbon Footprint of Wafer Fab Equipment R&D


A new technical paper titled "Achieving Sustainability in the Semiconductor Industry: The Impact of Simulation and AI" was published by researchers at Lam Research. Abstract "Computational simulation has been used in the semiconductor industry since the 1950s to provide engineers and managers with a faster, more cost-effective method of designing semiconductors. With increased pressure in t...

Electrochemical RAM Cross-Point Arrays For An Analog DL Accelerator


A technical paper titled “Retention-aware zero-shifting technique for Tiki-Taka algorithm-based analog deep learning accelerator” was published by researchers at Pohang University of Science and Technology, Korea University, and Kyungpook National University. "We present the fabrication of 4 K-scale electrochemical random-access memory (ECRAM) cross-point arrays for analog neural network...

Survey of Energy Efficient PIM Processors


A new technical paper titled "Survey of Deep Learning Accelerators for Edge and Emerging Computing" was published by researchers at the University of Dayton and the Air Force Research Laboratory. Abstract "The unprecedented progress in artificial intelligence (AI), particularly in deep learning algorithms with ubiquitous internet connected smart devices, has created a high demand for AI compu...

MTJ-Based CRAM Array


A new technical paper titled "Experimental demonstration of magnetic tunnel junction-based computational random-access memory" was published by researchers at the University of Minnesota and the University of Arizona, Tucson. Abstract "The conventional computing paradigm struggles to fulfill the rapidly growing demands from emerging applications, especially those for machine intelligence because ...

Co-optimizing HW Architecture, Memory Footprint, Device Placement And Per-Chip Operator Scheduling (Georgia Tech, Microsoft)


A technical paper titled “Integrated Hardware Architecture and Device Placement Search” was published by researchers at Georgia Institute of Technology and Microsoft Research. Abstract: "Distributed execution of deep learning training involves a dynamic interplay between hardware accelerator architecture and device placement strategy. This is the first work to explore the co-optimization ...

6G And Beyond: Overall Vision And Survey of Research


A new 92-page technical paper titled "6G: The Intelligent Network of Everything -- A Comprehensive Vision, Survey, and Tutorial" was published by IEEE researchers at Finland's University of Oulu. Abstract "The global 6G vision has taken its shape after years of international research and development efforts. This work culminated in ITU-R's Recommendation on "IMT-2030 Framework". While the d...
