From data centers to endpoints, the demand for more memory is reshaping traditional architectures.
Memory is an integral component of every computer system, from the smartphones in our pockets to the giant data centers powering the world’s leading-edge AI applications. As AI continues to grow in reach and complexity, the demand for more memory, from the data center to endpoints, is reshaping the industry’s requirements and its traditional approaches to memory architecture.
According to OpenAI, the amount of compute used in the largest AI training runs has increased at a rate of roughly 10X per year since 2012. One compelling example of this voracious need for more memory is OpenAI’s own ChatGPT, the most talked-about large language model (LLM) of the year. When ChatGPT was first released to the public in November 2022, its underlying GPT-3 model was built using 175 billion parameters. GPT-4, released just a few months later, is reported to use upwards of 1.5 trillion parameters. That is staggering growth in a very short period of time, and it depends on the continued evolution of the memory technologies used to process these massive amounts of data.
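To put those parameter counts in memory terms, here is a rough back-of-the-envelope sketch (our own illustration, not a figure from the article) assuming 16-bit (2-byte) parameters; holding the weights alone, before gradients, optimizer state, or activations, already runs to hundreds of gigabytes or more.

```python
# Back-of-the-envelope sketch: memory needed just to hold model weights,
# assuming 2 bytes (16 bits) per parameter. Training needs considerably more
# memory for gradients, optimizer state, and activations.
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the memory footprint of the model weights in gigabytes."""
    return num_params * bytes_per_param / 1e9

print(f"GPT-3  (175B parameters): {weight_memory_gb(175e9):,.0f} GB")   # ~350 GB
print(f"GPT-4 (~1.5T parameters): {weight_memory_gb(1.5e12):,.0f} GB")  # ~3,000 GB
```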
As AI applications evolve and become more complex, more advanced models, larger data sets and massive data processing needs require lower-latency, higher-bandwidth memory, as well as increased storage and more powerful CPU computing capabilities. Let’s now take a look at the memory technologies making AI happen.
HBM3 and GDDR6 are two memory technologies crucial to supporting the development of AI training and inference. HBM3, based on a high-performance 2.5D/3D memory architecture, offers high bandwidth and low power consumption for data transmission between memory and processing units. HBM3 also delivers excellent latency and a compact footprint, making it a superior choice for AI training hardware in the heart of the data center.
GDDR6 is a high-performance memory technology that offers high bandwidth and low latency, and it is less complex to implement than HBM3. The excellent price-performance of GDDR6 memory, built on time-tested manufacturing processes, makes it a great choice for AI inference applications, particularly as they move to the edge and into smart endpoints.
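As a rough illustration of the bandwidth difference between the two technologies, the sketch below computes peak per-device bandwidth from pin data rate and interface width. The 6.4 Gb/s and 16 Gb/s pin rates and the 1024-bit and 32-bit interface widths are commonly cited figures for HBM3 stacks and GDDR6 devices, not numbers taken from this article.

```python
# Minimal sketch: peak bandwidth = data rate per pin * interface width in bytes.
# Device figures below are commonly cited values, used for illustration only.
def peak_bandwidth_gbs(data_rate_gbps: float, width_bits: int) -> float:
    """Peak memory interface bandwidth in GB/s."""
    return data_rate_gbps * width_bits / 8

# One HBM3 stack: 1024-bit interface at 6.4 Gb/s per pin
print(f"HBM3 stack:   {peak_bandwidth_gbs(6.4, 1024):.1f} GB/s")  # ~819 GB/s
# One GDDR6 device: 32-bit interface at 16 Gb/s per pin
print(f"GDDR6 device: {peak_bandwidth_gbs(16.0, 32):.1f} GB/s")   # 64 GB/s
```

At the system level, GPU-based designs typically narrow this gap by placing many GDDR6 devices on a wide memory bus, which is part of what keeps GDDR6 less complex to implement than a 2.5D HBM3 solution.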
Another critical technology enabling AI is server CPU main memory. CPUs handle system management, as well as accessing and transforming the data fed to the training accelerators, playing a key role in keeping the demanding training pipeline filled. DDR5 provides higher data transmission rates and greater capacity than the previous-generation DDR4, supporting faster and more efficient data processing. DDR5 DRAM at 4800 MT/s is used in the latest generation of server CPUs and will scale to 8000 MT/s and beyond to serve many future generations.
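Here is a quick sketch of how those MT/s figures translate into peak channel bandwidth, assuming a standard 64-bit-wide DDR5 channel (an assumption for illustration; the article itself only quotes transfer rates).

```python
# Sketch: channel bandwidth = transfers per second * bytes per transfer,
# assuming a standard 64-bit (8-byte) wide DDR5 channel.
def ddr5_channel_bandwidth_gbs(megatransfers_per_sec: float, width_bits: int = 64) -> float:
    """Peak DDR channel bandwidth in GB/s."""
    return megatransfers_per_sec * 1e6 * (width_bits / 8) / 1e9

print(f"DDR5-4800: {ddr5_channel_bandwidth_gbs(4800):.1f} GB/s per channel")  # 38.4 GB/s
print(f"DDR5-8000: {ddr5_channel_bandwidth_gbs(8000):.1f} GB/s per channel")  # 64.0 GB/s
```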
Linked to server main memory is Compute Express Link (CXL), an open-standard cache-coherent interconnect between processors, accelerators, and memory devices. With promised features like memory pooling and switching, CXL will enable the deployment of new memory tiers that bridge the latency gap between main memory and SSD storage. These new memory tiers will add bandwidth and capacity, increase efficiency, and lower total cost of ownership (TCO), all critical elements for AI applications.
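To make that latency gap concrete, here is an illustrative ordering of the tiers that CXL-attached memory is expected to slot into. The latency values are rough order-of-magnitude assumptions for discussion, not measurements or figures from the article.

```python
# Illustrative memory/storage tiers, ordered by approximate load-to-use latency.
# Values are rough order-of-magnitude assumptions, not measured figures.
memory_tiers = [
    ("CPU-attached DDR5 DRAM", "~100 ns"),
    ("CXL-attached memory",    "several hundred ns"),
    ("NVMe SSD storage",       "~100 us and up"),
]

for tier, latency in memory_tiers:
    print(f"{tier:<26} {latency}")
```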
These are some of the key memory technologies the industry will rely on to take AI application performance to even greater levels in the future. Last month, Rambus Fellow Dr. Steve Woo hosted a panel at the AI Hardware & Edge AI Summit on the topic of “Memory Challenges for Next-Generation AI/ML Computing.” If you’re interested in reading more about the challenges and opportunities facing the memory industry when it comes to AI, check out his blog recap of the discussion at the AI Hardware Summit.
Rambus DDR5 memory interface chips, memory interface IP, and interconnect IP all provide the speed and capacity required for demanding AI workloads now and in the future. With a broad security IP portfolio, Rambus also enables cutting-edge security for hardware-based AI accelerators. As the industry continues to evolve, Rambus expertise in memory interface chips and in interface and security IP solutions can contribute greatly to the evolution of high-performance hardware for demanding AI workloads.