Systematic Analysis of CPU-Induced Slowdowns in Multi-GPU LLM Inference (Georgia Tech)

By Technical Paper Link - 27 Mar, 2026

A new technical paper, "Characterizing CPU-Induced Slowdowns in Multi-GPU LLM Inference," was published by the Georgia Institute of Technology. Abstract: "Large-scale machine learning workloads increasingly rely on multi-GPU systems, yet their performance is often limited by an overlooked component: the CPU. Through a detailed study of modern large language model (LLM) inference and servin...

Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers

By Technical Paper Link - 02 Jul, 2021

Harini Muthukrishnan (U of Michigan); David Nellans, Daniel Lustig (NVIDIA); Jeffrey A. Fessler, Thomas Wenisch (U of Michigan). Abstract: "Despite continuing research into inter-GPU communication mechanisms, extracting performance from multi-GPU systems remains a significant challenge. Inter-GPU communication via bulk DMA-based transfers exposes data transfer latency on the GPU’s critical...

tag: multi-GPU

