
Optimizing End-to-End Communication And Workload Partitioning In MCM Accelerators (Georgia Tech)


A new technical paper titled “MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules” was published by researchers at Georgia Tech.

Abstract
“Increasing AI computing demands and slowing transistor scaling have led to the advent of Multi-Chip-Module (MCMs) based accelerators. MCMs enable cost-effective scalability, higher yield, and modular reuse by partitioning large chips into smaller chiplets. However, MCMs come at an increased communication cost, which requires critical analysis and optimization. This paper makes three main contributions: (i) an end-to-end, off-chip congestion-aware and packaging-adaptive analytical framework for detailed analysis, (ii) hardware software co-optimization incorporating diagonal links, on-chip redistribution, and non-uniform workload partitioning to optimize the framework, and (iii) using metaheuristics (genetic algorithms, GA) and mixed integer quadratic programming (MIQP) to solve the optimized framework. Experimental results demonstrate significant performance improvements for CNNs and Vision Transformers, showcasing up to 1.58x and 2.7x EdP (Energy delay Product) improvement using GA and MIQP, respectively.”

Find the technical paper here. April 2025.

arXiv:2505.00041
Authors: Ritik Raj, Shengjie Lin, William Won, Tushar Krishna


