System-HW Co-Design Approach Combines Mono3D DRAM, NMP, and GPU Acceleration (UCSD, Georgia Tech, UIUC, Illinois Tech)


A new technical paper titled "Stratum: System-Hardware Co-Design with Tiered Monolithic 3D-Stackable DRAM for Efficient MoE Serving" was published by researchers at UC San Diego, Georgia Tech, University of Illinois Urbana-Champaign and Illinois Institute of Technology. Abstract "As Large Language Models (LLMs) continue to evolve, Mixture of Experts (MoE) architecture has emerged as a preva... » read more