Reliability Extension Architecture For Cost-Effective HBM (RPI, ScaleFlux, IBM TJ Watson)


A new technical paper titled "Making Strong Error-Correcting Codes Work Effectively for HBM in AI Inference" was published by researchers at Rensselaer Polytechnic Institute, ScaleFlux, and IBM T.J. Watson Research Center.

Abstract: "LLM inference is increasingly memory bound, and HBM cost per GB dominates system cost. Current HBM stacks include short on-die ECC that tightens binning, raise..."