China GenAI: Who Will Fill The Vacuum?


China and the U.S. are locked in a titanic battle over tariffs. The U.S. makes the world’s best AI accelerators, from Nvidia, AMD, Google, and AWS, among others. But the U.S. worries China could deploy these for military purposes, so it imposed strict export controls that left China with second-best AI accelerators. These export controls have been further tightened as part of tarif... » read more

Algorithm–HW Co-Design Framework for Accelerating Attention in Large-Context Scenarios (Cornell)


A new technical paper titled "LongSight: Compute-Enabled Memory to Accelerate Large-Context LLMs via Sparse Attention" was published by researchers at Cornell University. Abstract "Large input context windows in transformer-based LLMs help minimize hallucinations and improve output accuracy and personalization. However, as the context window grows, the attention phase increasingly dominates... » read more

LPDDR6: Not Just For Mobile Anymore


LPDDR memory has been almost synonymous with mobile devices, but starting with the new LPDDR6 specification released in July 2025 by JEDEC, it will also begin showing up inside data centers early next year. The key factors in various flavors of DRAM are bandwidth, capacity, and cost. HBM is the fastest, but it's also expensive, and it requires a 2.5D or 3.5D packaging approach. GDDR is ... » read more

Critical Factors For Storing Data In DRAM


DRAM is becoming more complicated to develop, and more difficult to manage inside AI data centers. In the past, latency, bandwidth, and capacity were the primary considerations. But as the amount of data that needs to be processed, moved, and stored continues to rise, a whole new set of factors is emerging. Steven Woo, fellow and distinguished inventor at Rambus, talks about latency under load,... » read more

3D Stacked HBM and Accelerators for LLMs: Heat Management and PDN (Georgia Tech, SK Hynix)


A new technical paper titled "3D Stacked HBM and Compute Accelerators for LLM: Optimizing Thermal Management and Power Delivery Efficiency" was published by a researcher from Georgia Institute of Technology and SK Hynix. Abstract "Advanced packaging is becoming essential for designing hardware accelerators for large language models (LLMs). Different architectures such as 2.5D integration of... » read more

Heterogeneous System With Specialized HW For Disaggregated LLM Inference (Princeton Univ., Univ. of Washington)


A new technical paper titled "SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference" was published by researchers at Princeton University and University of Washington. Abstract "Large Language Models (LLMs) have gained popularity in recent years, driving up the demand for inference. LLM inference is composed of two phases with distinct characteristics: a compute-boun... » read more

Interconnect Innovations In High Bandwidth Memory: Part 2


By Damon Tsai, Woo Young Han, and Tim Kryman Interconnect technology in high bandwidth memory (HBM) is at a fork in the road. One direction leads to tried-and-true microbump technology, and the other leads to a compelling alternative, hybrid bonding. Both technologies are evolving to address the stringent requirements of next generation HBM in pursuit of increased I/O density supporting high... » read more

On-Package Memory With UCIe To Improve Bandwidth Density And Power Efficiency (AMD, Intel Corp.)


A new technical paper titled "On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach" was published by researchers at Intel Corporation and AMD. Abstract "Emerging computing applications such as Artificial Intelligence (AI) are facing a memory wall with existing on-package memory solutions that are unable to meet ... » read more

HBM4 Memory: Break Through to Greater Bandwidth


Delivering unrivaled memory bandwidth in a compact, high-capacity footprint has made HBM the memory of choice for AI training. HBM4 is the fourth major generation of the HBM standard, with new power management and RAS features. The Rambus HBM4 Controller provides industry-leading performance of up to 10.0 Gb/s, enabling a memory throughput of over 2.5 TB/s for training systems, generative AI and oth... » read more

3D-Stacked HBM Architecture Susceptibility To Thermal Attacks (NC A&T State, New Mexico State)


A new technical paper titled "On the Thermal Vulnerability of 3D-Stacked High-Bandwidth Memory Architectures" was published by researchers at North Carolina A&T State University and New Mexico State University. Abstract "3D-stacked High Bandwidth Memory (HBM) architectures provide high-performance memory interactions to address the well-known performance challenge, namely the memory wal... » read more
