Tricky Tradeoffs For LPDDR5

New memory offers better performance, but that still doesn’t make the choice simple.

LPDDR5 is slated as the next-gen memory for AI technology, autonomous driving, 5G networks, advanced displays, and leading-edge camera applications, and it is expected to compete with GDDR6 for these applications. But like all next-gen applications, balancing power, performance, and area concerns against new technology options is not straightforward.

These are interesting times in the memory industry. There is a new standard on the low-power side with LPDDR5, and there will soon be a new standard in the standard DDR category as well, with DDR5.

The DDR spectrum consists of three major categories:

  • Standard DDR, which is mainly for the enterprise market. DDR4 is the most popular entry in this category.
  • LPDDR for mobile, where LPDDR4 and LPDDR4x are the popular standards. LPDDR5 was released earlier this year.
  • GDDR and HBM for graphics and high-performance applications.

On the LPDDR front, smartphones and related applications have typically been using the LPDDR4 and 4x standards for high performance.

“These provide a lot of opportunities for the SoC to take advantage of the low power features that these DRAMs provide,” said Vadhiraj Sankaranarayanan, technical marketing manager for interface memory at Synopsys. “If there is a frequency change that is required during runtime, or if the DRAMs have to be placed in a deep-sleep state, especially when the system is idle, the SoC can bring the power down because most of the applications that use LPDDR memories are very, very sensitive to power.”

Every standard tries to outperform all the predecessors in that category, and LPDDR5 is no exception. “LPDDR4 and 4x run up to 4,267 Mbps,” Sankaranarayanan said. Newer applications, as well as existing ones, require higher memory bandwidth because the number of cores in the CPU is increasing, so memory bandwidth requirements can be expected to keep rising. LPDDR5 addresses that by letting the data run at rates up to 6,400 Mbps. This is quite a jump compared with the maximum speed of today’s popular standards, LPDDR4 and 4x.

But just because the data rate is higher doesn’t mean a design should automatically migrate there.

“LPDDR5 is talking about 5Gbps-plus range for the LP, so it’s giving quite a bit more bandwidth than DDR4,” said Frank Ferro, vice president of marketing at Rambus. “But, it’s still slower than GDDR6. So then it becomes a tradeoff of power and cost. If you can get away with a system that’s running, say, with the DRAM that’s in the 5- to 6-Gbps range, then LPDDR5 is a good solution. If that still means you’ve got to put down 5, 6 or 7 of these on a board versus putting down two GDDRs, that’s the tradeoff that has to be made.”
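The device-count tradeoff Ferro describes comes down to simple arithmetic: peak bandwidth is the per-pin data rate times the bus width, and the number of devices is the target bandwidth divided by that figure, rounded up. The following is a minimal sketch only. The 32-bit device widths, the 16 Gbps GDDR6 pin rate, and the 120 GB/s target are illustrative assumptions, not figures from the article.

```python
import math


def peak_bandwidth_gbytes_per_s(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbps) * number of data pins / 8."""
    return pin_rate_gbps * bus_width_bits / 8


def devices_needed(target_gbytes_per_s: float, device_gbytes_per_s: float) -> int:
    """Smallest whole number of devices whose combined peak meets the target."""
    return math.ceil(target_gbytes_per_s / device_gbytes_per_s)


# Assumed configurations, for illustration only:
#   LPDDR5 at 6.4 Gbps per pin (6,400 Mbps, as quoted) on an assumed 32-bit device
#   GDDR6 at an assumed 16 Gbps per pin on an assumed 32-bit device
LPDDR5_BW = peak_bandwidth_gbytes_per_s(6.4, 32)
GDDR6_BW = peak_bandwidth_gbytes_per_s(16.0, 32)

TARGET_GBYTES_PER_S = 120.0  # hypothetical system requirement

if __name__ == "__main__":
    for name, bw in (("LPDDR5", LPDDR5_BW), ("GDDR6", GDDR6_BW)):
        count = devices_needed(TARGET_GBYTES_PER_S, bw)
        print(f"{name}: {bw:.1f} GB/s per device, {count} devices for the target")
```

Under these assumptions the target takes five LPDDR5 devices or two GDDR6 devices, which is the shape of the board-level choice described above. Real systems rarely sustain peak bandwidth, so the counts are a lower bound.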

The timing of when to migrate memory to the next version of a standard is almost always about running out of bandwidth and then staying in the power envelope, rather than needing lower power, said Ferro. “That’s the challenge we’re often given. [An engineering team] will say, ‘I can only burn this much power.’ For example, if you have an AI card that is a 75-watt PCIe card, that’s the power budget; the design team must find a memory system to fit in that budget. You have to balance 50% of the budget that’s going to go to the CPU, while the other 50% of that goes to the memory subsystem. So whatever memory fits in that box, whether it’s LPDDR4, whether it’s GDDR6, whether it’s HBM, in addition to the cost, that’s the memory that’s chosen.”
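The budget-driven selection Ferro outlines can be sketched as a filter: keep the options that fit the memory share of the card budget and meet the bandwidth requirement, then take the cheapest. Only the 75-watt budget and the 50% split come from the quote. The `MemoryOption` type, the option names, and every bandwidth, power, and cost number below are hypothetical placeholders, not measured data for any real memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MemoryOption:
    name: str
    bandwidth_gbytes_per_s: float
    power_watts: float
    relative_cost: float


def pick_memory(
    options: Sequence[MemoryOption],
    card_budget_watts: float,
    memory_share: float,
    required_gbytes_per_s: float,
) -> Optional[MemoryOption]:
    """Return the cheapest option that fits the memory power budget and
    meets the bandwidth requirement, or None if nothing fits."""
    memory_budget = card_budget_watts * memory_share
    feasible = [
        o for o in options
        if o.power_watts <= memory_budget
        and o.bandwidth_gbytes_per_s >= required_gbytes_per_s
    ]
    return min(feasible, key=lambda o: o.relative_cost) if feasible else None


# Placeholder options: all numbers are invented for illustration.
OPTIONS = (
    MemoryOption("memory A", bandwidth_gbytes_per_s=80.0, power_watts=20.0, relative_cost=1.0),
    MemoryOption("memory B", bandwidth_gbytes_per_s=150.0, power_watts=30.0, relative_cost=2.0),
    MemoryOption("memory C", bandwidth_gbytes_per_s=400.0, power_watts=45.0, relative_cost=4.0),
)

if __name__ == "__main__":
    # 75 W card with half the budget for memory, as in the quote: 37.5 W.
    choice = pick_memory(OPTIONS, 75.0, 0.5, required_gbytes_per_s=100.0)
    print(choice.name if choice else "nothing fits the box")
```

With these placeholders, memory A lacks the bandwidth and memory C exceeds the 37.5-watt share, so memory B is chosen. That mirrors the point that the memory is whatever fits in the box.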

Design teams often stay one additional generation with the existing memory because they’ve managed to optimize the system a bit more. But here, it’s still a balancing act in terms of when to change. Most times that comes down to performance, although occasionally it’s for density benefits, Ferro said.

Besides the ubiquitous smartphone applications, emerging applications such as artificial intelligence and automotive find LPDDR memories highly attractive because of the performance they provide, Sankaranarayanan said, in addition to their low power features.

LPDDR5, besides running at a higher data rate, also allows room for density expansion. LPDDR5 SDRAMs can support up to 32 Gbits per channel, whereas LPDDR4 SDRAMs could support up to 16 Gbits per channel. Beyond the higher speed and capacity, LPDDR5 introduces a number of other features.

With LPDDR5, SoC timing closure becomes simpler, because the controller would be working at a lower frequency. Additionally, there are quite a few features in LPDDR5 when it comes to the reliability of the channel and how to make the LPDDR5 channel more robust.

“LPDDR5 introduces a lot of features to increase and enhance the reliability on the channel by supporting decision feedback equalization in the DRAMs, and by introducing link error correcting code, which offers protection against single-bit errors on the channel,” said Sankaranarayanan. “There’s also a low power feature for dynamic voltage and frequency scaling. LPDDR4 and 4x allow dynamic frequency scaling between up to two frequencies, because the DRAMs could store the settings for up to two frequency set points. LPDDR5 enhances that to three set points. It also brings in the concept of dynamic voltage scaling, by letting the DRAMs operate at a 0.5 volt I/O voltage when they run at higher frequencies and higher data rates. When the controller wants to throttle the DRAMs because of the need to save power, or if the load is light, then you can bring down the frequency or the data rate. Then the I/O voltage also can be scaled down to 0.3 volts.”
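The combination of three frequency set points with two I/O voltages can be illustrated with a small model. The 0.5 volt and 0.3 volt levels and the count of three set points come from the quote. The data rate assigned to each set point, the choice of which set point runs at 0.3 volts, and the use of the generic CMOS approximation that switching power scales with V² times frequency are assumptions for illustration, not values from the LPDDR5 specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetPoint:
    data_rate_mbps: int
    io_voltage: float


# Three set points, sorted from slowest to fastest.
# The data rates and the voltage pairing are assumed for this sketch.
SET_POINTS = (
    SetPoint(data_rate_mbps=1600, io_voltage=0.3),
    SetPoint(data_rate_mbps=3200, io_voltage=0.5),
    SetPoint(data_rate_mbps=6400, io_voltage=0.5),
)


def choose_set_point(required_mbps: int) -> SetPoint:
    """Pick the slowest set point that still meets the required data rate;
    fall back to the fastest one if the demand exceeds every set point."""
    for sp in SET_POINTS:
        if sp.data_rate_mbps >= required_mbps:
            return sp
    return SET_POINTS[-1]


def relative_io_power(sp: SetPoint, reference: SetPoint = SET_POINTS[-1]) -> float:
    """I/O switching power relative to the reference set point, using the
    rough approximation P ~ V^2 * f (ignores termination and static power)."""
    voltage_ratio = sp.io_voltage / reference.io_voltage
    rate_ratio = sp.data_rate_mbps / reference.data_rate_mbps
    return voltage_ratio ** 2 * rate_ratio


if __name__ == "__main__":
    for demand in (1000, 3000, 6000):
        sp = choose_set_point(demand)
        print(f"demand {demand} Mbps -> {sp.data_rate_mbps} Mbps at "
              f"{sp.io_voltage} V, relative I/O power {relative_io_power(sp):.2f}")
```

In this toy model, dropping from the fastest set point to the slowest cuts I/O switching power to roughly 9% of the reference, with most of the extra saving beyond the frequency reduction coming from the voltage step.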

When designing with LPDDR5, some of the challenges relate to the higher speed when it comes to the PHY and the controller. “It’s a new protocol as far as the controller is concerned,” he said. “It has to support the new command set, the new timing parameters, etc. As far as the PHY is concerned, it’s a lot higher speed. So from the design point of view, it is getting more and more challenging just as with supporting a newer protocol.”

Embedded options
Depending upon the application, there is the option of embedding the memory, as well.

“When you’re dealing with memory, there is so much you have to partition,” said Farzad Zarrinfar, managing director of the IP Division at Mentor, a Siemens Business. “You can put more memory inside your chip. That gives you the best situation from a bandwidth point of view, as well as from a power consumption point of view, because you don’t need to drive a high-capacitive load to the outside world. Also, the cost is lower because you don’t need the high pin count communication to the outside world.”

Once again, there are tradeoffs. Internal memory needs to use the process technology for the rest of the chip. External memory can use the process optimized for the memory. That makes partitioning critical.

“Optimum power is the number one priority [for embedded memory] more than 70% to 80% of the time,” said Zarrinfar. “At the same time, you cannot ignore other parameters such as area and the speed that you get. When dealing with PPA, having a memory compiler in place makes a lot of sense. Here, the engineering team can run the analysis and do a what-if analysis to look at all of the solutions that meet their target speed. You can talk about power optimization or area optimization, but in a generic case, you’ve got to clock the device at a certain speed.”


Fig. 1: Shmoo plot for memory IP. Source: Mentor, a Siemens Business

Fig. 1 shows the range of frequencies and the range of voltages over which the device operates. “People pay attention to this, and say, ‘I need to go 10% faster,’ for example,” Zarrinfar said. “One methodology that can be used is to overdrive the memory to get the gain of speed desired. The engineering team may get a low-power memory, but if they desperately need the speed for certain corners, they can overdrive or even underdrive the technology to meet their needs. As such, having full understanding of the ranges that the device will work at is very important. In older technology, such as 40nm, it was acceptable to just look at three corners — typical-typical, fast-fast and slow-slow. With the move down to 40nm, 28nm and 22nm and beyond, we need to look at five corners — typical-typical, fast-fast, slow-slow, fast-slow and slow-fast. This complicates matters.”
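A shmoo plot of this kind is a pass/fail grid over voltage and frequency, and the five-corner requirement means a point only counts as passing if it passes at every corner. The toy model below illustrates the idea and the overdrive question of how much voltage a 10% speed gain costs. The linear speed model, the threshold, the per-corner speed factors, and the voltage and frequency grids are all invented for this sketch and do not describe any real memory IP.

```python
# Relative speed of each process corner in percent of typical.
# The five corner names follow the quote; the factors are hypothetical.
CORNER_SPEED_PCT = {"TT": 100, "FF": 110, "SS": 90, "FS": 95, "SF": 95}

VOLTAGES_MV = (700, 800, 900, 1000, 1100)
FREQS_MHZ = (400, 600, 800, 1000, 1200)


def fmax_mhz(voltage_mv: int, corner: str) -> int:
    """Toy model: maximum frequency grows linearly with voltage above an
    assumed 400 mV threshold, scaled by the corner speed factor."""
    typical = 2 * (voltage_mv - 400)
    return typical * CORNER_SPEED_PCT[corner] // 100


def passes_all_corners(voltage_mv: int, freq_mhz: int) -> bool:
    """A (voltage, frequency) point passes only if every corner passes."""
    return all(freq_mhz <= fmax_mhz(voltage_mv, c) for c in CORNER_SPEED_PCT)


def shmoo_row(voltage_mv: int, freqs_mhz=FREQS_MHZ) -> str:
    """One row of the shmoo grid: 'P' for pass, '.' for fail."""
    return "".join("P" if passes_all_corners(voltage_mv, f) else "." for f in freqs_mhz)


def min_voltage_for(freq_mhz: int, voltages_mv=VOLTAGES_MV):
    """Lowest grid voltage at which the frequency passes at all corners,
    or None if no voltage on the grid is enough."""
    for v in sorted(voltages_mv):
        if passes_all_corners(v, freq_mhz):
            return v
    return None


if __name__ == "__main__":
    for v in reversed(VOLTAGES_MV):  # highest voltage on the top row
        print(f"{v:>5} mV  {shmoo_row(v)}")
    print("1000 MHz needs", min_voltage_for(1000), "mV")
    print("1100 MHz needs", min_voltage_for(1100), "mV")
```

In this model the slow-slow corner sets the limit, so 1000 MHz passes at 1000 mV while a 10% faster 1100 MHz target needs the next voltage step up. That is the overdrive tradeoff the shmoo plot makes visible.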

The biggest design difference now is that with LPDDR5, it’s not necessarily in an embedded environment.

“You’ve now got longer traces, and LP originally was not designed to go on PCBs over very long distances,” said Rambus’ Ferro. “If you’re using an AI application, you have to look at signal terminations and your overall signal integrity, which was not necessarily a challenge in previous LP generations.”


