Optimizing LLM Training Under GPU Memory Constraints (Argonne, RIT)


A new technical paper titled "MLP-Offload: Multi-Level, Multi-Path Offloading for LLM Pre-training to Break the GPU Memory Wall" was published by researchers at Argonne National Laboratory and Rochester Institute of Technology. Abstract: "Training LLMs larger than the aggregated memory of multiple GPUs is increasingly necessary due to the faster growth of LLM sizes compared to GPU memory. To...

Freeing Up Near-Memory Capacity For Cache Using Compression Techniques In A Flat Hybrid-Memory Architecture


A technical paper titled “HMComp: Extending Near-Memory Capacity using Compression in Hybrid Memory” was published by researchers at Chalmers University of Technology and ZeroPoint Technologies. Abstract: "Hybrid memories, especially combining a first-tier near memory using High-Bandwidth Memory (HBM) and a second-tier far memory using DRAM, can realize a large and low cost, high-bandwi...

Taking Steps Toward Hybrid Memory


What is the memory subsystem of the future, and how do we get there? Since our Hybrid Memory research program began, Rambus Labs and its industry partners and collaborators have made significant progress under the banner of the OpenPOWER and OpenCAPI Foundations, an open development community based on the POWER microprocessor (mP) architecture. Rambus Labs is using the Wistron POWER9 systems’ Ope...