Memory For AI At The Edge


Inferencing at the edge has very different needs than training large language models or large-scale inferencing in AI data centers. Many edge devices run on a battery. They're price-sensitive, and they are constrained by the physical area of the device. As a result, the amount of memory that can be packed into these devices is also limited. Steve Woo, Rambus fellow and distinguished inventor, t... » read more

AI Inference Needs A Mix-And-Match Memory Strategy


AI inference is no longer a single workload that can be served efficiently by a single type of accelerator or memory. From fast chat replies to 10M token codebases, inference spans wildly diverse workloads with very different limits on latency, bandwidth, capacity, and compute, as the figure below demonstrates.1 Source: Meta1 The AI inference spectrum of workloads includes: Inter... » read more

GDDR7 Momentum Accelerates As A Key Solution For AI Inference


The AI hardware landscape continues to evolve at a breakneck speed, and memory technology is rapidly becoming a defining differentiator for the next generation of GPUs and AI inference accelerators. When NVIDIA introduced Rubin CPX, its new class of GPU tailored for massive context inference, it underscored a new industry reality: memory throughput and efficiency are now just as critical as ra... » read more

HBM Leads The Way To Defect-Free Bumps


High-bandwidth memory stands at the forefront of multiple technology developments as a critical enabler of AI, but it is one of the most difficult modules to manufacture. Leading HBM device makers and foundries must simultaneously handle multi-layer chip stacking, die warpage, and shorter product lifecycles that are shrinking from two years down to just one. But perhaps the most formidable c... » read more

LPDDR6: Not Just For Mobile Anymore


LPDDR memory has been almost synonymous with mobile devices, but starting with the new LPDDR6 specification released in July 2025 by JEDEC, it will begin showing up inside of data centers, as well, early next year. The key factors in various flavors of DRAM are bandwidth, capacity, and cost. HBM is the fastest, but it's also expensive, and it requires a 2.5D or 3.5D packaging approach. GDDR is ... » read more

GDDR7 Tackles Massive-Context AI Inference


The AI hardware landscape is evolving at breakneck speed, and memory technology is at the heart of this transformation. NVIDIA’s recent announcement of Rubin CPX, a new class of GPU purpose-built for massive-context inference, underscores this trend. Rubin CPX is designed to tackle workloads that require reasoning across millions of tokens. Use cases include long-form generative video, comple... » read more

The Evolution of DRAM


DRAM has been around since 1966, but today it's still the same basic 1T 1C bit cell architecture. Yet changes are coming as DRAM is called upon to store and retrieve more data faster. Steve Woo, distinguished inventor and fellow at Rambus, talks about how DRAM works, why there are different flavors, the impact of cooling new solutions in denser configurations, and ongoing issues involving the s... » read more

The Best DRAMs For Artificial Intelligence


Artificial intelligence (AI) involves intense computing and tons of data. The computing may be performed by CPUs, GPUs, or dedicated accelerators, and while the data travels through DRAM on its way to the processor, the best DRAM type for this purpose depends on the type of system that is performing the training or inference. The memory challenge facing engineering teams today is how to keep... » read more

Choosing The Right Memory Solution For AI Accelerators


To meet the increasing demands of AI workloads, memory solutions must deliver ever-increasing performance in bandwidth, capacity, and efficiency. From the training of massive large language models (LLMs) to efficient inference on endpoint devices, choosing the right memory technology is critical for chip designers. This blog explores three leading memory solutions—HBM, LPDDR, and GDDR—and t... » read more

GDDR7 Memory Supercharges AI Inference


GDDR7 is the state-of-the-art graphics memory solution with a performance roadmap of up to 48 Gigatransfers per second (GT/s) and memory throughput of 192 GB/s per GDDR7 memory device. The next generation of GPUs and accelerators for AI inference will use GDDR7 memory to provide the memory bandwidth needed for these demanding workloads. AI is two applications: training and inference. With tr... » read more

← Older posts