Redefining XPU Memory For AI Data Centers Through Custom HBM4: Part 1


This is the first of a three-part series on HBM4 and gives an overview of the HBM standard. Part 2 will provide insights on HBM implementation challenges, and Part 3 will introduce the concept of a custom HBM implementation. Relentless growth in data consumption: Recent advances in deep learning have had a transformative effect on artificial intelligence (AI) and the ever-increasing volume of ...

UMI: Extending Chiplet Interconnect Standards To Deal With The Memory Wall


With the Open Compute Project (OCP) Summit upon us, it’s an appropriate time to talk about chiplet interconnect (in fact, the 2024 OCP Summit has a whole day dedicated to the multi-die topic, on October 17). Of particular interest is the Bunch of Wires (BoW) interconnect specification, which continues to evolve. At OCP there will be an update and working group looking at version 2.1 of BoW. (...

DRAM Cache for GPUs With SCM And High Bandwidth


A new technical paper titled "Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory" was published by researchers at POSTECH and Soongsil University. Abstract: "We propose overcoming the memory capacity limitation of GPUs with high-capacity Storage-Class Memory (SCM) and DRAM cache. By significantly increasing the memory capacity with SCM, the GPU can capture a larger fraction o...

Designing AI Hardware To Deal With Increasingly Challenging Memory Wall (UC Berkeley)


A new technical paper titled "AI and Memory Wall" was published by researchers at UC Berkeley, ICSI, and LBNL. Abstract: "The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs. However, the main performance bottleneck is increasingly shifting to memo...

Balancing Memory And Coherence: Navigating Modern Chip Architectures


In the intricate world of modern chip architectures, the "memory wall" – the limitations that external DRAM accesses impose on performance and power consumption, as memory access improves more slowly than the ability to compute data – has emerged as a pivotal challenge. Architects must strike a delicate balance between leveraging local data reuse and managing external memory accesses. While caches are critical for op...

CNN Hardware Architecture With Weights Generator Module That Alleviates Impact Of The Memory Wall


A technical paper titled “Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation” was published by researchers at Samsung AI Center and University of Cambridge. Abstract: "The unprecedented accuracy of convolutional neural networks (CNNs) across a broad range of AI tasks has led to their widespread deployment in mobile and embedded settings. In a pursuit for high...

Moving Data And Computing Closer Together


The speed of processors has increased to the point where they often are no longer the performance bottleneck for many systems. It's now about data access. Moving data around costs both time and power, and developers are looking for ways to reduce the distances that data has to move. That means bringing data and memory nearer to each other. “Hard drives didn't have enough data flow to cr...

What’s Next For High Bandwidth Memory


A surge in data is driving the need for new IC package types with more and faster memory in high-end systems. But there are a multitude of challenges on the memory, packaging and other fronts. In systems, for example, data moves back and forth between the processor and DRAM, which is the main memory for most chips. But at times this exchange causes latency and power consumption, sometimes re...

What’s Next In Advanced Packaging


Packaging houses are readying the next wave of advanced IC packages, hoping to gain a bigger foothold in the race to develop next-generation chip designs. At a recent event, ASE, Leti/STMicroelectronics, TSMC and others described some of their new and advanced IC packaging technologies, which involve various product categories, such as 2.5D, 3D and fan-out. Some new packaging technologies ar...

What’s the Right Path For Scaling?


The growing challenges of traditional chip scaling at advanced nodes are prompting the industry to take a harder look at different options for future devices. Scaling is still on the list, with the industry laying plans for 5nm and beyond. But less conventional approaches are becoming more viable and gaining traction, as well, including advanced packaging and in-memory computing. Some option...
