HBM2e Offers Solid Path For AI Accelerators

AI processor performance is rapidly growing, making memory architecture choice more important.


Today's AI processors are so fast that they are constantly waiting for data from memory. With the status quo, memory is simply not fast enough to unleash the full performance of these rapidly advancing AI processors. In short, AI processor performance is growing quickly and memory is not keeping up. This creates a bottleneck, what Rambus calls the "AI Memory Gap," that needs to be addressed.

Industry experts are all asking the same question: Which memory architectures are best suited for AI? The answer is not simple. Tech leaders are exploring many different paths, in part because no one yet knows the right answer, but the issue is important enough that memory experts across the industry are investigating all of them.

The memory issues facing AI accelerators include bandwidth, access time, and energy consumption on AI processor chips. In both training and inference, convolutional neural networks (CNNs) and deep neural networks (DNNs) pair a processing engine with memory. When data arrives at the DNN engine, it is read from memory; the DNN engine computes on it and then writes the result back to memory. That memory holds the key weights, coefficients, and parameters, so without question it plays a central role in the AI process.
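To make that access pattern concrete, here is a minimal, illustrative sketch of the read-compute-write loop described above. The dictionary standing in for DRAM, the layer names, and the numpy-based compute are hypothetical stand-ins for illustration only, not any accelerator's actual API.

import numpy as np

# Illustrative stand-in for off-chip memory holding weights and activations.
# In a real accelerator these would live in DRAM (DDR, GDDR6, or HBM2e).
memory = {
    "weights_layer0": np.random.randn(256, 256).astype(np.float32),
    "activations_in": np.random.randn(1, 256).astype(np.float32),
}

def dnn_layer_step(mem, weight_key, input_key, output_key):
    """One layer of the read-compute-write cycle."""
    # 1. Read: fetch weights and input activations from memory.
    weights = mem[weight_key]
    activations = mem[input_key]

    # 2. Compute: the DNN engine performs the matrix multiply (plus ReLU here).
    result = np.maximum(activations @ weights, 0.0)

    # 3. Write: store the output activations back to memory for the next layer.
    mem[output_key] = result

dnn_layer_step(memory, "weights_layer0", "activations_in", "activations_out")
print(memory["activations_out"].shape)  # (1, 256)

Every layer repeats this cycle, which is why the speed and energy cost of the read and write steps set the ceiling on overall accelerator performance.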

Bandwidth, access time, energy consumption, capacity, and performance are all affected by whether memory sits on-chip or off-chip, far from the AI processor. If it is on-chip, the AI processor can access it and process the data extremely quickly. When memory is off-chip, accessing it takes roughly two to six times longer.

At the same time, off-chip access consumes considerable power. That is why it is critical to have an efficient memory solution with acceptable tradeoffs among cost, power, performance, and area relative to the DNN engine in today's AI chips.

There are different types of memories on the market, from conventional double data rate (DDR) DRAM to more exotic varieties such as resistive RAM (ReRAM) and magnetoresistive RAM (MRAM). Each has its own value proposition, with its own pros and cons.

To close this memory gap in the short term, two contenders have dominated recent discussion: GDDR6 and high bandwidth memory generation 2 enhanced (HBM2e). GDDR6 is certainly well suited to a number of key applications, including networking, graphics, automotive ADAS, and some AI workloads.

HBM2e, however, brings more to the AI party, especially for AI processors. It offers a better total cost of ownership, better power, performance, and area (PPA), and faster time to market and time to revenue.

For starters, on performance and bandwidth, HBM2e offers considerably better access times than DDR4, DDR5, and even GDDR6 because it connects to the processor die to die in a 2.5D package (see Figure 1).


Figure 1 – HBM2e connected die to die with SoC or ASIC in 2.5D packaging.
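A quick back-of-envelope comparison shows why the wide die-to-die interface matters. The interface widths and per-pin data rates below are representative figures for HBM2e and GDDR6, not the guaranteed specs of any particular device.

# Back-of-envelope bandwidth comparison (representative figures, not the
# specs of any particular device): bandwidth = interface width x data rate.

def bandwidth_gb_per_s(width_bits, data_rate_gbps_per_pin):
    """Peak bandwidth in GB/s for a memory interface."""
    return width_bits * data_rate_gbps_per_pin / 8

# One HBM2e stack: 1024-bit interface at roughly 3.2 Gb/s per pin.
hbm2e_stack = bandwidth_gb_per_s(1024, 3.2)   # ~410 GB/s

# One GDDR6 device: 32-bit interface at up to 16 Gb/s per pin.
gddr6_device = bandwidth_gb_per_s(32, 16.0)   # 64 GB/s

print(f"HBM2e stack:  {hbm2e_stack:.0f} GB/s")
print(f"GDDR6 device: {gddr6_device:.0f} GB/s")
print(f"GDDR6 devices to match one stack: {hbm2e_stack / gddr6_device:.1f}")

Matching a single HBM2e stack with GDDR6 means ganging several devices together, each with its own board routing and I/O power, which leads directly into the area and power discussion below.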

What about area? Traditional DRAMs sit off chip, which means they require a large number of I/Os, more board space, and a bigger chip periphery. HBM2e, by contrast, connects die to die within a single package. Although HBM itself is a very wide die-to-die interface, the board-level I/O requirement drops substantially, so HBM2e delivers board-space, size, and area benefits.

Now, let's talk about power. As noted earlier, when data moves between the DNN engine and memory, the data path runs from the DNN processor all the way off chip. In the case of GDDR6, that consumes considerable power.

With HBM2e, the die-to-die connection keeps the memory close to the processor, so power consumption is considerably reduced. Overall, HBM delivers better power consumption, performance, and area than the alternative memories.
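A rough sketch of the arithmetic helps illustrate the point: interface energy scales with the number of bytes moved times the energy per bit, so a shorter in-package link pays off on every access. The picojoule-per-bit values and the traffic figure below are assumed, illustrative numbers chosen only to show the relationship, not measured data for any product.

# Rough illustration: total I/O energy = bytes moved x 8 x energy per bit.
# The pJ/bit values below are illustrative assumptions for comparison only.

ENERGY_PJ_PER_BIT = {
    "HBM2e (die to die, in package)": 4.0,   # assumed: short interposer links
    "GDDR6 (off chip, on the PCB)": 7.5,     # assumed: long board-level traces
}

def io_energy_joules(bytes_moved, pj_per_bit):
    """Interface energy for moving a given amount of data."""
    return bytes_moved * 8 * pj_per_bit * 1e-12

bytes_per_inference = 500e6  # assumed: 500 MB of weight/activation traffic

for name, pj in ENERGY_PJ_PER_BIT.items():
    energy = io_energy_joules(bytes_per_inference, pj)
    print(f"{name}: {energy * 1e3:.1f} mJ per inference")

Whatever the exact figures, the ratio is what matters: every bit that stays inside the package instead of crossing the board is energy the DNN engine gets back.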

With HBM2e, the industry is well positioned to make further AI inroads. The current HBM generation is widely available from leading memory suppliers such as Samsung and SK Hynix, and it meets the total cost of ownership (TCO) and time-to-market/time-to-revenue requirements that are critical to the overall AI application success formula.


