A technical paper titled “Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets” was published by researchers at University of California Irvine.
Abstract:
"To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based...
» read more