How Is Your HBM Memory?

Getting to the cloud means a greater reliance on memory.

The seemingly countless web-connected applications used every day (social media, streaming video, games, etc.) are driving not only the need to store a tremendous amount of data, but also the need to access this data with as little delay as possible. Add the growing number of connected devices (IoT), and you can see why changes are needed in the data center, particularly in the way memory is accessed.

Getting to the cloud means greater reliance on memory. Servers are typically configured with DRAM (DIMMs) for low-latency memory access, flash storage with higher latency, and finally hard-disk drives (HDDs) with the slowest access but the lowest cost per bit (for storing pictures and videos). Last summer, companies began to announce that they were upgrading DIMM memory from DDR3 to DDR4. The upgrade both improved bandwidth, with a maximum speed of up to 3.2Gbps per pin, and reduced power by about 25 percent; both are big concerns for data centers.

The move to DDR4 is certainly helping, but server and SoC manufacturers are also looking at alternative memory architectures to improve bandwidth and reduce latency and power consumption. Given these requirements, there has been increased interest in high-bandwidth memory (HBM), a JEDEC standard for a stacked DRAM memory solution with a very wide data interface. HBM was originally introduced as a graphics memory replacement, but its advantages as an ‘additional tier’ of memory in the server memory architecture are becoming apparent.

HBM DRAM increases memory bandwidth by providing a very wide, 1024-bit interface to the SoC. The maximum speed for HBM2 is 2Gbits/s per pin, for a total bandwidth of 256Gbytes/s. Although the bit rate is similar to that of DDR3 at 2.1Gbps, which accounts for the bulk of memory used in DIMMs today, the eight 128-bit channels give HBM about 15 times more bandwidth than a standard 64-bit DIMM.
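
The arithmetic behind these figures can be checked with a short sketch. It assumes peak bandwidth is simply interface width times per-pin data rate, and it assumes a standard 64-bit DDR3 DIMM data bus, a width the text does not state. The function name `peak_bandwidth_gbytes` is made up for this illustration.

```python
def peak_bandwidth_gbytes(width_bits: int, gbits_per_pin: float) -> float:
    """Peak bandwidth in GB/s: interface width times per-pin rate, over 8 bits per byte."""
    return width_bits * gbits_per_pin / 8


# HBM2 figures from the text: eight 128-bit channels at 2 Gbit/s per pin.
hbm2 = peak_bandwidth_gbytes(8 * 128, 2.0)

# Assumption: a 64-bit DDR3 DIMM data bus running at 2.1 Gbit/s per pin.
ddr3 = peak_bandwidth_gbytes(64, 2.1)

print(f"HBM2 stack: {hbm2:.0f} GB/s")
print(f"DDR3 DIMM:  {ddr3:.1f} GB/s")
print(f"Ratio:      {hbm2 / ddr3:.1f}x")
```

These are theoretical peaks; sustained bandwidth in a real system would be lower.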

Another advantage of HBM is lower power per bit. Implementing HBM requires 2.5D technology, in which the HBM is connected to an SoC via a silicon or organic interposer. The very short, controlled channel between the memory and the SoC requires less drive from the memory interface, reducing power compared with DIMM interfaces. In addition, because the interface is wide, you can achieve very high bandwidth at a lower frequency (2Gbps, as previously mentioned), which also contributes to the power savings.
 
It’s important to understand that HBM implementations also have their challenges. 2.5D technology adds manufacturing complexity, and the silicon interposer adds cost. Many expensive components are mounted on the interposer, including the SoC and multiple HBM die stacks, so good yield is critical to making the system cost-effective. There is also the challenge of routing thousands of signals (data + control + power/ground) through the interposer to the SoC for each HBM stack used.
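
To see why yield matters so much, consider a rough sketch in which the package is good only if every part on the interposer is good. It assumes defects are independent, so the yields multiply. Every number below is hypothetical and chosen purely for illustration; none comes from this article or from any vendor.

```python
from math import prod


def assembly_yield(component_yields):
    """Chance that all parts are good, assuming independent defects (yields multiply)."""
    return prod(component_yields)


# Hypothetical yields, for illustration only:
# one SoC, four HBM stacks, the interposer, and the assembly step.
parts = [0.95] + [0.98] * 4 + [0.97, 0.99]

print(f"Assembled-package yield: {assembly_yield(parts):.3f}")
```

Even with every individual yield at 95 percent or better, the combined figure drops noticeably, and a single bad part can scrap the other expensive parts mounted with it.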
 
Even with these challenges, having, for example, four HBM memory stacks, each delivering 256Gbytes/s, very close to the CPU provides a significant increase in both memory density (up to 8GB per HBM stack) and bandwidth compared with existing architectures.
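
As a quick sketch of what that four-stack example adds up to, the totals can be computed directly. The 256 GB/s per-stack figure is from the text; the 8 GB per-stack capacity is an assumption about how to read the density figure above. The helper `package_totals` is a made-up name for this illustration.

```python
def package_totals(stacks: int, gbytes_per_s_per_stack: int, gbytes_per_stack: int):
    """Return (aggregate peak bandwidth in GB/s, total capacity in GB) for identical stacks."""
    return stacks * gbytes_per_s_per_stack, stacks * gbytes_per_stack


# Four stacks at 256 GB/s each (from the text) and 8 GB each (assumed capacity).
total_bandwidth, total_capacity = package_totals(4, 256, 8)

print(f"Aggregate peak bandwidth: {total_bandwidth} GB/s")
print(f"Total capacity:           {total_capacity} GB")
```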

How is your server memory?