GPU Analysis Identifying Performance Bottlenecks That Cause Throughput Plateaus In Large-Batch Inference


A new technical paper titled "Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference" was published by researchers at Barcelona Supercomputing Center, Universitat Politecnica de Catalunya, and IBM Research. Abstract "Large language models have been widely adopted across different tasks, but their auto-regressive generation nature often leads to inefficient resource util... » read more

Silicon Reimagined: New Foundations For The Age of AI


The semiconductor industry is undergoing a pivotal transformation driven by the rise of artificial intelligence (AI) and the slowing of traditional Moore’s Law scaling. This comprehensive 42-page report highlights several key trends shaping the industry’s future: the push toward more specialized architectures tailored for specific workloads, particularly in AI; the critic...

Application Of External CFD Modeling In Data Center Design


Rising IT densities and AI workloads demand smarter heat management and equipment placement. The paper "Application of External CFD Modeling in Data Center Design" explores how external computational fluid dynamics (CFD) modeling provides crucial insights by resolving airflow patterns around buildings. Why choose external CFD modeling? Recommended by The Green Grid, it helps predict: ...

Workload-Specific Data Movements Across AI Workloads in Multi-Chiplet AI Accelerators


A new technical paper titled "Communication Characterization of AI Workloads for Large-scale Multi-chiplet Accelerators" was published by researchers at Universitat Politecnica de Catalunya. Abstract "Next-generation artificial intelligence (AI) workloads are posing challenges of scalability and robustness in terms of execution time due to their intrinsic evolving data-intensive characteris... » read more

Scheduling Multi-Model AI Workloads On Heterogeneous MCM Accelerators (UC Irvine)


A technical paper titled "SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators" was published by researchers at the University of California, Irvine. Abstract: "Emerging multi-model workloads with heavy models like recent large language models significantly increased the compute and memory demands on hardware. To address such increasing demands, designin...
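The scheduling problem the abstract points at can be made concrete with a toy example. The snippet below is a plain longest-task-first list scheduler, not SCAR's algorithm; the task sizes and per-chiplet throughputs are invented, and it only illustrates why heterogeneous chiplet speeds make placement decisions matter for makespan.

# Toy list scheduler for heterogeneous chiplets (an illustration, not SCAR).
def greedy_schedule(task_flops, chiplet_tflops):
    finish = [0.0] * len(chiplet_tflops)            # current finish time per chiplet (seconds)
    placement = {}
    # Longest task first; each task goes to the chiplet that would finish it earliest,
    # which accounts for the chiplets' different speeds, not just their current load.
    for tid, flops in sorted(enumerate(task_flops), key=lambda kv: -kv[1]):
        best = min(range(len(finish)),
                   key=lambda c: finish[c] + flops / (chiplet_tflops[c] * 1e12))
        finish[best] += flops / (chiplet_tflops[best] * 1e12)
        placement[tid] = best
    return placement, max(finish)                   # assignment and resulting makespan

# Example: five model-inference tasks on an MCM with two fast and two slow chiplets (assumed).
tasks = [8e12, 3e12, 3e12, 1e12, 1e12]              # FLOPs per task
chiplets = [100.0, 100.0, 25.0, 25.0]               # peak TFLOPS per chiplet
plan, makespan = greedy_schedule(tasks, chiplets)
print(plan, f"makespan ~{makespan * 1e3:.1f} ms")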

Ferroelectric Memory-Based IMC for ML Workloads


A new technical paper titled "Ferroelectric capacitors and field-effect transistors as in-memory computing elements for machine learning workloads" was published by researchers at Purdue University. Abstract "This study discusses the feasibility of Ferroelectric Capacitors (FeCaps) and Ferroelectric Field-Effect Transistors (FeFETs) as In-Memory Computing (IMC) elements to accelerate mach... » read more

Chiplets For Generative AI Workloads: Challenges In Both HW And SW


A new technical paper titled "Challenges and Opportunities to Enable Large-Scale Computing via Heterogeneous Chiplets" was published by researchers at University of Pittsburgh, Lightelligence, and Meta. Abstract "Fast-evolving artificial intelligence (AI) algorithms such as large language models have been driving the ever-increasing computing demands in today's data centers. Heterogeneous c... » read more