Efficiently Process Large RM Datasets In Underlying Memory Pool, Disaggregated Over CXL (KAIST)


A technical paper titled "Failure Tolerant Training with Persistent Memory Disaggregation over CXL" was published (preprint) by researchers at KAIST and Panmnesia. "TRAININGCXL can efficiently process large-scale recommendation datasets in the pool of disaggregated memory while making training fault tolerant with low overhead," states the paper. Find the technical paper here. or here (IEE... » read more

Interop Shift Left: Using Pre-Silicon Simulation for Emerging Standards


By Martin James, Gary Dick, and Arif Khan, Cadence with Suhas Pai and Brian Rea, Intel The Compute Express Link™ (CXL™) 2.0 specification, released in 2020, accompanies the latest PCI Express (PCIe) 5.0 specification to provide a path to high-bandwidth, cache-coherent, low-latency transport for many high-bandwidth applications such as artificial intelligence, machine learning, ... » read more

Von Neumann Is Struggling


In an era dominated by machine learning, the von Neumann architecture is struggling to stay relevant. The world has changed from being control-centric to one that is data-centric, pushing processor architectures to evolve. Venture money is flooding into domain-specific architectures (DSA), but traditional processors also are evolving. For many markets, they continue to provide an effective s... » read more

An Increasingly Complicated Relationship With Memory


The relationship between a processor and its memory used to be quite simple, but in modern SoCs there are multiple heterogeneous processors and accelerators, each needing a different means of accessing memory for maximum efficiency. Compromises are being made in order to preserve the unified programming model of the past, but the pressures are increasing for some fundamental changes. It does... » read more

Data Centers Turn To New Memories


DRAM extensions and alternatives are starting to show up inside of data centers as the volume of data being processed, stored and accessed continues to skyrocket. This is having a big impact on the architecture of data centers, where the goal now is to move processing much closer to the data and to reduce latency everywhere. Memory has always been a key piece of the Von Neumann compute archi... » read more