A new technical paper titled "Stratum: System-Hardware Co-Design with Tiered Monolithic 3D-Stackable DRAM for Efficient MoE Serving" was published by researchers at UC San Diego, Georgia Tech, University of Illinois Urbana-Champaign and Illinois Institute of Technology.
Abstract
"As Large Language Models (LLMs) continue to evolve, Mixture of Experts (MoE) architecture has emerged as a preva...