A technical paper titled “Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets” was published by researchers at the University of California, Irvine.
“To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators. We develop an advanced scheduling framework for heterogeneous MCM accelerators that comprehensively consider complex heterogeneity and inter-chiplet pipelining. Our experiments using our framework on GPT-2 and ResNet-50 models on a 4-chiplet system have shown up to 2.2x and 1.9x increase in throughput and energy efficiency, compared to a monolithic accelerator with an optimized output-stationary dataflow.”
Find the technical paper on arXiv (arXiv:2312.09401). Published December 2023 (preprint).
Odema, Mohanad, Hyoukjun Kwon, and Mohammad Abdullah Al Faruque. “Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets.” arXiv preprint arXiv:2312.09401 (2023).
Related Reading
Chiplets: 2023 (EBook)
What chiplets are, what they are being used for today, and what they will be used for in the future.
Preparing For Commercial Chiplets
An expert discussion on what’s missing, what changes are underway, and why chiplets are increasingly necessary.