HBM3E: All About Bandwidth

LLM training is driving the need for memory subsystems that support high data rates.


The rapid rise in size and sophistication of AI/ML training models requires increasingly powerful hardware deployed in the data center and at the network edge. This growth in complexity and data stresses the existing infrastructure, driving the need for new and innovative processor architectures and associated memory subsystems. For example, even GPT-3 at 175 billion parameters is stressing the bandwidth, capacity, training time, and power of the most advanced GPUs on the market.

To this end, Cadence has demonstrated our HBM3E memory subsystem running at 12.4Gbps at nominal voltages, demonstrating the PHY’s robustness and performance margin. The production version of our latest HBM3E PHY supports DRAM speeds of up to 10.4Gbps per pin, or 1.33TB/s of bandwidth per DRAM device. This speed represents a >1.6X bandwidth increase over the previous generation, making it ideal for LLM training.

Recall that HBM3E is a 3D stacked DRAM with a 1024-bit wide data bus (16 64-bit channels). While this wide bus enables high data transfer rates, routing these signals requires 2.5D interposer technology, such as silicon interposers, RDL interposers, or silicon bridges, capable of routing close to 2000 signals (data and control).
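The per-device bandwidth follows directly from the per-pin data rate and the 1024-bit bus width. A minimal back-of-the-envelope sketch (the 10.4Gbps figure is from this article; the 6.4Gbps comparison rate assumes the previous generation is HBM3 at its maximum standard speed):

```python
# Illustrative HBM bandwidth math: per-pin data rate x bus width -> TB/s.
BUS_WIDTH_BITS = 1024  # 16 channels x 64 bits per HBM3/HBM3E device


def device_bandwidth_tbps(pin_rate_gbps: float) -> float:
    """Peak bandwidth per DRAM device in TB/s for a given per-pin data rate."""
    bits_per_second = pin_rate_gbps * 1e9 * BUS_WIDTH_BITS
    return bits_per_second / 8 / 1e12  # bits -> bytes -> terabytes


hbm3e = device_bandwidth_tbps(10.4)  # ~1.33 TB/s, matching the article
hbm3 = device_bandwidth_tbps(6.4)    # assumed HBM3 max rate, ~0.82 TB/s
print(f"HBM3E: {hbm3e:.2f} TB/s ({hbm3e / hbm3:.3f}x HBM3)")
```

Since the bus width is unchanged between generations, the >1.6X bandwidth gain comes entirely from the per-pin speed increase (10.4/6.4 = 1.625).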

The interposer design is critical for the system to operate at these data rates. Cadence provides 2.5D reference designs, including the interposer and package, as part of our standard IP package. As demonstrated in our test silicon, these designs give customers confidence they will meet their memory bandwidth requirements. The reference design is also a good starting point, helping to reduce development time and risk. Our expert SI/PI and system engineers work closely with customers to analyze their channels to ensure the best system performance.

Even as HBM3E delivers the highest memory bandwidth available today, the industry keeps pushing forward. JEDEC recently announced that HBM4, the next version of the HBM DRAM standard, is nearing completion. JEDEC calls HBM4 an “evolutionary step beyond the currently published HBM3 standard” and notes that HBM4 “enhancements are vital for applications that require efficient handling of large datasets and complex calculations.” HBM4 will support AI training applications, high-performance computing (HPC), and high-end graphics cards.


