More Memory And Processor Tradeoffs

Why power, performance and area are becoming increasingly difficult to balance.


Creating a new chip architecture is becoming an increasingly complex series of tradeoffs about memories and processing elements, but the benefits are not always obvious when those tradeoffs are being made.

This used to be a fairly straightforward exercise when there was one processor, on-chip SRAM and off-chip DRAM. Fast forward to 7/5nm, where chips are being developed for AI, mobile phones and servers, and there are hundreds or even thousands of processing elements connected on a single die or in a complex package. Many of these chips contain a mix of processor types: small FPGAs, embedded FPGAs, DSPs, and processing cores of various sizes based on Arm, MIPS, RISC-V or x86 ISAs. There are also AI cores such as tensor processing units, as well as some custom processing circuitry.

Working alongside all of those are multiple types of memories that are equally complex. There are various 2D and 3D versions of RAM, NAND flash and NOR flash, as well as phase-change memory (3D XPoint) and high-performance external memory such as HBM (stacked DRAM dies) and GDDR6. Add to that list MRAM, ReRAM, and emerging technologies such as FRAM and STT-MRAM. Not all of these will achieve critical mass, but at this point it’s too early to tell which ones will survive, let alone replace DRAM, SRAM and various flash technologies.

Just reading off a spec sheet these days is nightmarish, but it gets worse. Different memories also behave differently under different operating conditions, such as high heat or extreme cold, and depending on how many times they are accessed. There are also variables to consider such as simulated lifespan, which is a function of how many read-write cycles a certain memory type is predicted to execute reliably under expected ambient conditions and use models. That could change depending upon the environment (think about memory in an industrial operation or a car, for example), and it could vary by workload. Add to that list factors such as error recovery, resiliency and min/max voltage.
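As a rough illustration of how lifespan follows from read-write cycles and workload, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the function name, the parameters, the ideal wear-leveling model, and the `derating` factor standing in for harsh environments. None of the numbers come from a real datasheet.

```python
def estimated_lifetime_years(endurance_cycles, capacity_bytes,
                             bytes_written_per_day,
                             write_amplification=1.0, derating=1.0):
    """Estimate the years until a memory wears out.

    Assumes writes are spread evenly across the whole device (ideal
    wear leveling). `derating` is a hypothetical 0..1 factor that
    shrinks the rated endurance for hot or otherwise harsh conditions.
    """
    if bytes_written_per_day <= 0:
        raise ValueError("bytes_written_per_day must be positive")
    if not 0 < derating <= 1:
        raise ValueError("derating must be in (0, 1]")

    # Total bytes the device can absorb before reaching its cycle limit.
    total_writable = endurance_cycles * capacity_bytes * derating
    # Bytes physically written per day, including internal overhead.
    effective_daily = bytes_written_per_day * write_amplification
    return total_writable / effective_daily / 365.0


# Illustrative values only: 3,000 rated cycles, 1 GB device, 10 GB/day.
benign = estimated_lifetime_years(3000, 1e9, 10e9)
harsh = estimated_lifetime_years(3000, 1e9, 10e9, derating=0.5)
print(f"benign: {benign:.2f} years, harsh: {harsh:.2f} years")
```

Even in this simplified model, the same part lasts half as long when its assumed endurance is halved, and a heavier workload shortens the estimate in direct proportion. That is why the environment and the use model have to be settled before a memory is chosen.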

The third leg of the PPA stool is area. Memory is a space hog. In some chips, memory accounts for 60% or more of the total area on a die. Being able to get data in and out of memory quickly enough has a significant impact on the performance and power requirements of a system. Add in more margin for things like resiliency and error recovery, and the entire system slows down and requires more power to drive signals. Move data too far—which could be from a processor at one end of a large chip to memory at the other end of the die—or drive signals through wires that are too thin, and the power requirements increase while performance decreases.

Put all of these pieces together, and the choices become even more challenging to keep track of. Reduce the voltage and some memories behave differently. Reduce it too far and certain memories won’t work as well or as long. Add in peak current changes based upon unexpected or planned use models, and suddenly this begins to look like a fine balancing act where any last-minute change can disrupt the whole operation and send the entire design team back to the drawing board. And that doesn’t even begin to take into account errors caused by process and other types of variation, or new architectures such as in-memory computing or near-memory computing.

So where is the silver lining in all of this? When these systems are architected correctly, performance can increase dramatically and power can be reduced significantly. Improvements of 1,000X or even more are very possible for customized designs and well-mapped workloads.

But getting there will require a lot of very complex tradeoffs that go well beyond anything design teams have encountered in the past. And from here it only becomes more complex as the industry begins looking at new transistor and memory structures at 5/3nm, new materials, and very different physical and electrostatic properties. More coffee, anyone?

