DRAM Remains The Status Quo

Latency will remain a serious memory challenge for at least the next couple of years.


By Frank Ferro
No one will argue that the “post-PC” era is here. Tablet shipments are expected to pass laptops by the end of this year, and desktops by the end of 2015. Add in the nearly 1 billion smartphone shipments projected for 2013, and you would think the DRAM industry would take notice of this volume.

DRAM manufacturers do care about this segment of the market, but that fact is not obvious from their roadmaps. The reality is that DRAM specifications continue to be driven by the PC and server markets, and it does not look like this will change anytime soon. This was one of my key takeaways after attending MemCon in Santa Clara two weeks ago.

Although unit shipments for mobile devices are higher than for PCs, they represent only about 13% of overall memory shipments, said Martin Lund, senior vice president at Cadence, in his keynote address. This number would justify maintaining the current DRAM roadmap—at least for now. The problem, although not a new one, is that other mobile and embedded products have to live within the memory constraints set by the PC industry, which is focused on ‘cost per bit.’

Making Lemonade. Reducing the cost per bit is a good thing if you are making a PC or a server, but many embedded designers would put lower latency and lower power consumption much higher on their wish lists. In addition, embedded designs typically need less memory capacity. However, for cost reasons, designers must choose a DRAM based on the lowest-price node, regardless of the memory capacity their design actually needs. Power consumption is being addressed to some degree as adoption of low-power DRAM (LPDDR) increases, but low power is still not a fundamental design criterion for DRAM. Latency, however, is a particular problem. CPU processing efficiency is greatly reduced because the processor has to wait for DRAM. And according to Bob Brennan, senior vice president at Samsung Semiconductor, DRAM latency has remained constant over the last 10 years. Increasing the processor speed, or adding a second processor, will not help if the CPUs are spending most of their time waiting for memory.
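The effect Brennan describes can be sketched with a simple back-of-the-envelope stall model. All the numbers below are hypothetical, chosen only to illustrate why a faster core does not help when DRAM latency dominates:

```python
# Illustrative only: a rough model of how fixed DRAM latency caps CPU throughput.
# None of these figures come from the article; they are representative guesses.

def effective_cpi(base_cpi, miss_rate, miss_penalty_cycles):
    """Average cycles per instruction once DRAM stall cycles are included."""
    return base_cpi + miss_rate * miss_penalty_cycles

# Hypothetical core: 1.0 CPI when hitting in cache, 2% of instructions
# miss all caches, and a DRAM access costs about 200 core cycles.
cpi = effective_cpi(base_cpi=1.0, miss_rate=0.02, miss_penalty_cycles=200)
print(cpi)  # 5.0: the core spends roughly 80% of its cycles waiting on memory

# Doubling the core's raw speed only shrinks the compute term; the stall
# term (in core cycles) grows with clock rate, so the gain is marginal.
faster = effective_cpi(base_cpi=0.5, miss_rate=0.02, miss_penalty_cycles=200)
print(faster)  # 4.5: barely better, because DRAM, not the core, is the limit
```

The point of the sketch is that the stall term dwarfs the compute term, which is why constant DRAM latency erases most of the benefit of a faster or second processor.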

To address latency, we continue to see cache sizes growing, along with an increase in the number of caches (L3 and L4). More cache memory reduces latency, but at the expense of larger die sizes, and increased design complexity (schedule and cost). In addition to the cache, customers I work with are looking for latency reduction in the on-chip network (it seems now more than ever). Fast and efficient connections from the CPU to DRAM are critical, and the problem only gets worse with competition for DRAM from other heterogeneous processors. Parallel connections (sending address and data simultaneously) offer the lowest latency (zero in theory), while serial connections (address, then data) offer higher speeds with reduced wire count while introducing some latency. In addition to the connection topology between the CPU and DRAM, the network needs to have an advanced QoS algorithm to ensure that CPU traffic is not stalled or blocked entirely from memory due to data from other processors or I/O.
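The QoS idea in the paragraph above can be sketched as a minimal priority arbiter for a shared DRAM port. This is a hypothetical illustration, not any vendor's interconnect API; the class, master names, and priority values are all invented for the example:

```python
# Minimal sketch of priority-based QoS arbitration at a shared DRAM port.
# Lower priority value wins; a sequence counter keeps arrival order for ties.
import heapq

class QosArbiter:
    def __init__(self):
        self._queue = []  # entries are (priority, arrival_seq, master)
        self._seq = 0

    def submit(self, master, priority):
        """Queue a memory request from a master with a QoS priority."""
        heapq.heappush(self._queue, (priority, self._seq, master))
        self._seq += 1

    def grant(self):
        """Grant DRAM access to the highest-priority pending request."""
        return heapq.heappop(self._queue)[2]

arb = QosArbiter()
arb.submit("gpu", priority=2)  # bulk traffic, latency-tolerant
arb.submit("cpu", priority=0)  # latency-critical CPU read
arb.submit("dma", priority=1)
print(arb.grant())  # cpu: CPU traffic is not stalled behind bulk transfers
```

A real interconnect QoS scheme is far richer (bandwidth regulation, aging to prevent starvation of low-priority masters), but the core mechanism is the same: CPU requests jump ahead of bulk traffic so the processor is not blocked from memory.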

Feels like a Band-Aid. Another solution that addresses the need for more efficient DRAM in embedded mobile products is Wide I/O. Wide I/O 2 offers very efficient power consumption for a given bandwidth. The challenge with Wide I/O, which was reiterated by many MemCon speakers and during the panel discussion, is the manufacturing cost, reliability and the business model. For this technology to be widely adopted (no pun intended), there needs to be a major company willing to take the lead in order to drive volume and reliability up, and manufacturing cost down. With the consolidation in the applications processor market, it is not clear if and when such a driver will emerge. I was somewhat surprised to hear that most panelists believe the hybrid memory cube (HMC) will be in volume production before Wide I/O. HMC is also a stacked-die solution, with a logic layer offering better bandwidth, lower power and lower latency. As a stacked-die TSV (through-silicon via) solution, HMC will certainly face many of the cost and manufacturing challenges confronting Wide I/O.

The bottom line is that for the next two years (at least), architecting your SoC around the current DDR and LPDDR roadmap is the only practical choice. LPDDR3/4 provides a good bandwidth/power node, allowing for incremental improvements in SoC power consumption and performance while avoiding some of the manufacturing risks. For the DRAM industry to start focusing on memory architectures optimized for latency and power consumption, we will have to wait for the embedded markets to clearly overtake PCs and servers in memory consumption. Until then, the focus must be on system architectures that improve data-flow efficiency in the SoC by using optimized network connections between key processing cores and memory, along with advanced QoS algorithms that manage traffic flow to maximize DRAM utilization.

—Frank Ferro is director of product marketing at Sonics.