High-bandwidth memory is gaining significant traction, but poses unique challenges for PHY, chip and subsystem design.
HBM DRAM is currently used in graphics, high-performance computing (HPC), server, networking and client applications. HBM, says JEDEC HBM Task Group Chairman Barry Wagner, provides a “compelling solution” to reduce the IO power and memory footprint of the most demanding applications. Recent examples of second-generation HBM (HBM2) deployment include NVIDIA’s Quadro GP100 GPU, which is paired with 16GB of stacked (through-silicon via) ECC HBM2 for 720GB/s of bandwidth, and Intel’s Lake Crest deep-learning processor, which packs 32GB of HBM2.
As system designers look to move higher bandwidth closer to the CPU, HBM2 offers an opportunity to significantly expand memory capacity and maximize local DRAM storage for higher throughput in the data center. Indeed, the HBM DRAM architecture increases system memory bandwidth by providing a 1,024-bit-wide interface to the SoC. More specifically, second-generation HBM runs each pin at up to 2Gbits/s, for a total bandwidth of 256Gbytes/s per stack. Although the per-pin bit rate is similar to DDR3 at 2.1Gbps, HBM’s eight 128-bit channels provide approximately 15x the bandwidth of a standard 64-bit DDR3 interface.
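The arithmetic behind these figures is straightforward. The short sketch below reproduces the quoted numbers; the DDR3 comparison point (a 64-bit DIMM at DDR3-2133 rates) is an assumption chosen to match the 2.1Gbps figure above.

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.

HBM2_BUS_WIDTH_BITS = 1024   # eight independent 128-bit channels
HBM2_PIN_RATE_GBPS = 2.0     # per-pin data rate, Gbit/s

DDR3_BUS_WIDTH_BITS = 64     # assumed: a standard 64-bit DDR3 DIMM interface
DDR3_PIN_RATE_GBPS = 2.1     # assumed: DDR3-2133 per-pin data rate

def bandwidth_gbytes_per_s(width_bits: int, pin_rate_gbps: float) -> float:
    """Aggregate bandwidth in GB/s: width (bits) x rate (Gbit/s) / 8 bits-per-byte."""
    return width_bits * pin_rate_gbps / 8

hbm2 = bandwidth_gbytes_per_s(HBM2_BUS_WIDTH_BITS, HBM2_PIN_RATE_GBPS)
ddr3 = bandwidth_gbytes_per_s(DDR3_BUS_WIDTH_BITS, DDR3_PIN_RATE_GBPS)

print(f"HBM2 stack: {hbm2:.0f} GB/s")     # 256 GB/s
print(f"DDR3 DIMM:  {ddr3:.1f} GB/s")     # 16.8 GB/s
print(f"Ratio:      ~{hbm2 / ddr3:.0f}x") # ~15x
```

In other words, the roughly 15x advantage comes almost entirely from interface width, not clock speed, which is exactly why HBM's design challenges center on routing so many signals rather than on raw signaling rate.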
The HBM2 architecture presents engineers with several unique PHY, chip and subsystem design challenges. For example, as noted above, the HBM interface is 1,024 bits wide. However, once power, ground and other necessary signaling are taken into account, the interconnect count is closer to 1,700. Driving such high signal counts affects a wide range of considerations, including power and ground distribution, signal integrity and cross-talk interference.
Interposer designs – which connect HBM modules to the SoC – are also quite challenging to implement, as signal length needs to be minimized to reduce drive strength requirements and power consumption for the PHY. Similarly, cross-talk should be meticulously analyzed and mitigated, while impedance and ground return paths must be carefully considered to maintain signal integrity and meet timing and eye margin requirements for the HBM PHY interconnect.
Despite these challenges, an HBM2 system can be effectively designed with robust timing, temperature and voltage margins for high-volume production. However, building a robust, dependable PHY requires careful interposer layout, extensive chip and package simulation and analysis, as well as solid power distribution. The interposer can be designed with a reasonable number of signal layers when utilizing small but achievable signal widths, and further analysis shows that side-guard grounding signals are not required. Adequate eye margins can be achieved when careful routing keeps the signal lengths reasonable and multiple signal layer transitions are avoided.
In conclusion, the implementation of 2.5D technology in HBM2 systems adds numerous manufacturing complexities for engineering teams. Nevertheless, HBM continues to gain significant traction in the graphics, server and networking markets due to the real-world benefits of placing increased memory bandwidth and density closer to the CPU.