GDDR6 Memory For Life On The Edge

Combining real-time interactivity and AI/ML inferencing for ultra-low latency applications like cloud gaming.


With the torrid growth in data traffic, it is unsurprising that the number of hyperscale data centers has grown apace. According to analysts at the Synergy Research Group, in July of this year there were 541 hyperscale data centers worldwide. That represents a doubling in the number since 2015. Even more striking, there are an additional 176 in the pipeline, so the breakneck growth in hyperscale data centers continues unabated.

A number of factors beyond raw data traffic are driving growth at the core of the network. Workloads like AI/ML training are absolutely voracious in their demand for data and bandwidth. AI/ML training is growing at 10X annually, with the largest training models surpassing 10 billion parameters in 2019 and blowing through the 100 billion mark this year. Further, there is the ongoing megatrend of business applications moving from on-premises enterprise data centers to the cloud.

While action at the network core is white hot, there's arguably even more going on at the edge, though it is far harder to track given the multitude of applications and implementations. IDC predicts that by 2023, edge networks will represent 60% of all deployed cloud infrastructure. While the applications are many, underlying them all is one critical factor: latency.

Reflecting on how fast we transitioned from entertainment on disc to the world of streaming, it is absolutely amazing that we now routinely stream 4K TV and movies to our displays both large and small. But that technical achievement is child's play compared to making cloud (streaming) gaming work at scale. Streaming games demand that the delay between when a player inputs an action and when it appears on their screen ("finger-to-photon" time) be imperceptible.

The opportunity for companies rolling out cloud gaming services is tapping into the nearly one billion people worldwide playing online games. But unlike a traditional online game that employs local hardware to run the game while exchanging a relatively light batch of data with a gaming server, with streaming everything runs in the cloud. The client can be any display with a network connection and an input device.

To make it work, the gaming service providers run the games on enterprise-class servers with high-end graphics cards, run them at higher frame rates (to deliver 60 fps to the player), use highly efficient video encoders, and, you guessed it, use AI/ML inferencing. Specifically, AI/ML analyzes the network path from cloud to player and back in real time, and then adjusts the network as necessary to maintain quality of service.
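The post doesn't spell out how such a real-time adjustment loop is built, but conceptually it is a feedback control problem: measure the cloud-to-player path, then trade stream quality against latency. Below is a minimal Python sketch of that idea; the probe function, thresholds, and bitrate bounds are illustrative assumptions, not any provider's actual implementation.

```python
# Illustrative sketch only: monitor round-trip latency on the cloud-to-player
# path and nudge the encoder bitrate to stay within an assumed latency budget.
import random
import statistics
import time

TARGET_RTT_MS = 30.0      # assumed budget for the network leg of finger-to-photon time
MAX_BITRATE_MBPS = 35.0   # assumed encoder ceiling for a high-quality stream
MIN_BITRATE_MBPS = 5.0

def measure_rtt_ms() -> float:
    """Placeholder for a real path probe (e.g. in-band timestamps); returns a random sample here."""
    return random.uniform(15.0, 60.0)

def adjust_bitrate(current_mbps: float, recent_rtts: list) -> float:
    """Back off when the path looks congested; recover slowly when headroom returns."""
    p95 = statistics.quantiles(recent_rtts, n=20)[-1]  # rough 95th-percentile latency
    if p95 > TARGET_RTT_MS:
        return max(MIN_BITRATE_MBPS, current_mbps * 0.8)
    return min(MAX_BITRATE_MBPS, current_mbps * 1.05)

if __name__ == "__main__":
    bitrate = 20.0
    window = []
    for _ in range(10):                # ten control iterations, just for illustration
        window.append(measure_rtt_ms())
        window = window[-20:]          # sliding window of recent samples
        if len(window) >= 5:
            bitrate = adjust_bitrate(bitrate, window)
        time.sleep(0.05)
    print(f"settled bitrate: {bitrate:.1f} Mbps")
```

In a real service, the measurement and adjustment would of course run continuously and feed an inference model rather than the fixed thresholds used in this toy loop.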

Streaming games is just one of many applications that will combine real-time interactivity (requiring ultra-low latency) and AI/ML inferencing. In fact, AI/ML inferencing will be increasingly ubiquitous across industries and applications. The result is more computing power moving closer to the user, which fuels the growth in edge infrastructure predicted by IDC.

Of course, the evolution of inferencing models running at the edge will parallel the rapid growth in AI/ML training, requiring increasingly powerful processing and more memory bandwidth and capacity. But given the breadth of edge deployments, solutions must meet the necessary performance requirements at an economical price point. GDDR6 memory fits the bill well.

GDDR6 memory delivers over 2.5X the per-device bandwidth of the fastest LPDDR5 or DDR4 memories. At Rambus, we've demonstrated in silicon GDDR6 operation at an 18 Gbps data rate, which translates to 72 GB/s of bandwidth per DRAM over a 32-bit wide interface. Building on a manufacturing base that supplies memory for tens of millions of graphics cards per quarter, and using time-tested production methods, GDDR6 delivers best-in-class performance at a very competitive price point.
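As a quick check of the arithmetic behind that 72 GB/s figure, using only the data rate and interface width quoted above:

```python
# GDDR6 per-device bandwidth: per-pin data rate x interface width, converted from bits to bytes.
DATA_RATE_GBPS = 18      # per-pin data rate demonstrated in silicon (as cited above)
BUS_WIDTH_BITS = 32      # width of a single GDDR6 device interface

bandwidth_gb_per_s = DATA_RATE_GBPS * BUS_WIDTH_BITS / 8   # 18 * 32 / 8 = 72
print(f"{bandwidth_gb_per_s:.0f} GB/s per DRAM")            # prints "72 GB/s per DRAM"
```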

The key to unlocking the performance of GDDR6 memory is mastering the signal and power integrity (SI/PI) challenges of operation at very high data rates. Rambus helps SoC designers tackle this challenge. With over 30 years of leadership in high-speed signaling, we've literally written the book on best practices for high-speed SI/PI design.

The Rambus GDDR6 interface solution benefits from this extensive design history. Further, our solution is silicon-proven and consists of an integrated and verified PHY and digital controller. We back this up with PCB and package design support, as well as reference designs to aid customers with implementation in SoCs and accelerate time to market.

With the rapid advancements in AI/ML, life on the edge is getting very interesting. The demands for computing power and memory bandwidth are headed up at a rapid clip. With GDDR6 memory, you get a solution that delivers breakthrough performance at the right price for the next wave of edge AI/ML inferencing designs.

Additional Resources:
White Paper: From Data Center to End Device: AI/ML Inferencing with GDDR6
White Paper: HBM2E and GDDR6: Memory Solutions for AI
Webinar: GDDR6 and HBM2E Memory Solutions for AI
Website: Rambus GDDR6 PHY and Rambus GDDR6 Controller
Product Briefs: GDDR6 PHY and GDDR6 Controller
Solution Brief: Rambus GDDR6 Interface


