Non-Volatile Memory Tradeoffs Intensify

Why NVM is becoming so application-specific and what the different options are.


Non-volatile memory is becoming more complicated at advanced nodes, where price, speed, power and utilization are feeding into some very application-specific tradeoffs about where to place that memory.

NVM can be embedded into a chip, or it can be moved off chip with various types of interconnect technology. But that decision is more complicated than it might first appear. It depends on the process node, the voltage, the type of NVM and what’s being stored in it, as well as the overall chip or system budget.

“The highest-performing processors use the smallest process geometries, which in turn will place the highest demand on NVM,” said Steven Woo, fellow and distinguished inventor at Rambus. “Some of the challenges for NVM are the relative difficulty in scaling the capacity at smaller geometries, and the need to implement higher voltages to program the cells. More die area may be needed to support capacities required by the additional processing cores at finer process geometries, and additional manufacturing cost may be required to support higher voltages.”

This has turned into a balancing act between the power/performance improvements of smaller geometries and how much memory can be embedded cost-effectively.

“In non-volatile memory, as you go below 40nm, the cost of embedding it becomes extremely high,” said Raphael Mehrbians, vice president and general manager of the Memory Products Division at Adesto Technologies. “As a result, you may end up using a lot more SRAM internally, but then delegate the NVM to an outside device. But when you do that, the challenge becomes having enough bandwidth performance to be able to execute out of that in a power-efficient way.”

Some MCU companies are moving to external memory instead of internal memory, and using higher-performance octal NVM to be able to do that, Mehrbians said.

One of the advantages of doing this is simplicity. “So bare NAND devices are trending into solid state drives and storage cards, with fewer requirements for devices (storage controllers excepted) that need to interface directly with the bare NAND,” said Marc Greenberg, group director of product marketing at Cadence.

“With increased diversity of devices on the SPI (Serial Peripheral Interface) bus, and the proliferation of higher-speed SPI interfaces (specifically quad SPI, octal SPI and xSPI), it’s starting to become a very interesting bus, going beyond booting to executing directly from this bus,” said Greenberg.
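To see why wider SPI variants make execute-in-place more plausible, the peak raw transfer rate can be estimated from the number of data lines, the clock rate, and whether data moves on both clock edges. The sketch below is a back-of-the-envelope illustration only: the clock rates are assumed values rather than figures from any datasheet, and the function name is hypothetical. Command, address and dummy-cycle overhead is ignored, so real throughput will be lower.

```python
def peak_bandwidth_mb_per_s(data_lines: int, clock_mhz: float, ddr: bool = False) -> float:
    """Peak raw transfer rate in megabytes per second.

    Each data-carrying clock edge moves `data_lines` bits; DDR uses both edges.
    Command/address phases and dummy cycles are ignored.
    """
    transfers_per_clock = 2 if ddr else 1
    bits_per_second = data_lines * clock_mhz * 1e6 * transfers_per_clock
    return bits_per_second / 8 / 1e6


# Illustrative configurations. The clock rates are assumptions.
configs = {
    "single SPI": (1, 50.0, False),
    "quad SPI": (4, 100.0, False),
    "octal SPI (DDR)": (8, 200.0, True),
}

for name, (lines, mhz, ddr) in configs.items():
    print(f"{name}: {peak_bandwidth_mb_per_s(lines, mhz, ddr):.2f} MB/s")
```

Under these assumed clocks, the octal DDR case moves 64 times as much data per second as the single-line case, which is the kind of gap that separates a boot-only bus from one a processor can fetch instructions over.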

This approach is evident in a microcontroller line from NXP that does not have on-chip embedded memory.

“They’ve been extending this family over the last two years, and all of the products rely on external memory rather than on chip memory,” said Gideon Intrater, CTO at Adesto. “By doing that, they actually end up with designs that are more price-competitive than other vendors and they still deliver more performance than their competitors because of the lower power.”

NXP has been able to reach 600 MHz, while other vendors that taped out with the same Arm core only hit 400 MHz. Those MCUs deliver 50% higher frequency because they use a process that doesn’t have flash memory on it. Even at 40nm, there are differences between the versions of the process that do and do not support embedded flash. Below that it becomes more and more difficult. Sometimes it’s just cost, but sometimes it’s also the capabilities of the process, Intrater said.

“In the past you could build an MCU on older process technologies that have easily accommodated non-volatile memory, but the demands on the IoT devices in the edge of the network are increasing all the time,” he said. “You want to do AI at the edge, you want to make smarter nodes, so you need to deliver more performance. And delivering that performance in an old process requires a lot of power, so in order to match the requirements of the power and the performance, people are moving to designs at 28nm — and even below for the edge — and in this process node there’s just not any non-volatile memory.”

It’s a different way of looking at the problem. “If you look at a flash memory/NVM, in a typical design, 99% of the time it’s in sleep mode,” said Mehrbians. “You’re downloading the firmware into an external memory or into an SRAM. Then you execute from there. Therefore, most of these memories are designed to be lowest power when they’re in sleep mode, but once you’re executing out of it, then that is not the best design anymore. You want a design that is lowest power in active mode.”
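Mehrbians’ point can be put in numbers with a simple duty-cycle model: average power is the active power weighted by the fraction of time spent active, plus the sleep power weighted by the rest. The two devices below are hypothetical and every figure is invented for illustration. The only input taken from the article is the 99% sleep share.

```python
def average_power_uw(active_fraction: float, active_uw: float, sleep_uw: float) -> float:
    """Time-weighted average power in microwatts."""
    return active_fraction * active_uw + (1.0 - active_fraction) * sleep_uw


# Hypothetical parts; all power figures are made up for illustration.
SLEEP_OPTIMIZED = {"active_uw": 20000.0, "sleep_uw": 1.0}
ACTIVE_OPTIMIZED = {"active_uw": 10000.0, "sleep_uw": 200.0}

# 1% active matches the "99% in sleep mode" case; 50% stands in for a
# workload that executes directly out of the memory.
for active_fraction in (0.01, 0.50):
    sleep_opt = average_power_uw(active_fraction, **SLEEP_OPTIMIZED)
    active_opt = average_power_uw(active_fraction, **ACTIVE_OPTIMIZED)
    better = "sleep-optimized" if sleep_opt < active_opt else "active-optimized"
    print(f"active {active_fraction:.0%}: "
          f"sleep-optimized {sleep_opt:.1f} uW, "
          f"active-optimized {active_opt:.1f} uW -> {better} wins")
```

With these assumed figures the sleep-optimized part wins when the memory is idle 99% of the time, and the ranking flips once the memory is busy for a large share of the time.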

Making tradeoffs
This doesn’t work in all cases, and the tradeoffs between keeping NVM on-chip or moving it off-chip are application-dependent.

“There are certain applications that still need large amounts of NVM, and if you’re talking about 4 Mbytes and above, then you’re talking about NVM not being such a viable option to embed,” said Krishna Balachandran, product marketing manager for NVM IP at Synopsys. “It’s expensive to embed. In that situation, it makes sense to use either discrete memory for NVM, like a flash chip (NOR flash typically), or embedded flash. However, embedded flash is why it’s expensive, because an embedded flash process involves extra masks, extra steps, and the mature embedded flash process lags the latest and most advanced logic process.”

While lots of silicon is running in logic processes at 10/7nm, with users now talking about moving to 5nm, embedded flash is mainstream on 40nm and 28nm. “Below that you have nothing,” said Balachandran. “There is no finFET embedded flash process at the moment, and this is because it’s very expensive to develop, it’s very expensive to deploy, and to obtain good yields. It’s just not easy to do it technically. That’s where the cost angle stems from. At 40nm, there is an embedded flash process. It’s mature, it works just fine from a technology point of view. The issue is that all the extra steps and masks that must be put in add cost to the silicon. Also, in an NVM, there is extra testing required. You have to bake the chip. You have to heat it to a high temperature to make sure that the memory doesn’t just go away, that it really is non-volatile. You have to put it through these stringent tests to mimic the aging of the memory, including mimicking high-temperature harsh surroundings. It might be to ensure that the data is retained over a long period of time. For a commercial application, typically the spec is 10 years. If it’s for automotive, the spec is 15 years. So much depends on the application.”
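A short high-temperature bake is commonly related to a multi-year retention target through the Arrhenius model, in which the acceleration factor is exp[(Ea/k)(1/T_use − 1/T_bake)], with temperatures in kelvin. The article does not say which model or conditions any vendor uses, so the following is a sketch under stated assumptions: the activation energy, use temperature and bake temperature are illustrative values, and the function names are hypothetical. Only the 10-year and 15-year targets come from the text.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K
HOURS_PER_YEAR = 8760                # 365 days


def acceleration_factor(ea_ev: float, use_temp_c: float, bake_temp_c: float) -> float:
    """Arrhenius acceleration factor between use and bake temperatures."""
    use_k = use_temp_c + 273.15
    bake_k = bake_temp_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / bake_k))


def bake_hours(target_years: float, ea_ev: float, use_temp_c: float, bake_temp_c: float) -> float:
    """Bake time that corresponds to `target_years` at the use temperature."""
    factor = acceleration_factor(ea_ev, use_temp_c, bake_temp_c)
    return target_years * HOURS_PER_YEAR / factor


# Assumed values, for illustration only.
EA_EV = 1.1          # activation energy of the retention-loss mechanism
USE_TEMP_C = 55.0    # field operating temperature
BAKE_TEMP_C = 150.0  # oven temperature

for label, years in (("commercial", 10), ("automotive", 15)):
    hours = bake_hours(years, EA_EV, USE_TEMP_C, BAKE_TEMP_C)
    print(f"{label}: {years} years -> about {hours:.1f} bake hours")
```

The result is highly sensitive to the assumed activation energy, which is one reason the required test conditions depend so heavily on the cell technology and the application.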

Engineering teams embed NVM to reduce power or chip count. Those are the same reasons for embedding SRAM or any other type of memory onto a chip, Balachandran said.

Dissecting NVM
Fundamentally, there are two types of NVM. Multi-time programmable (MTP) NVM can be programmed many times. One-time programmable (OTP) NVM can be programmed once.

Some MTP NVM can be manufactured in a standard CMOS process with no extra steps or masks, which means it can continue to be scaled. But the MTP version requires a floating gate, like a flash cell.

“This means there is a charge that is trapped on a floating gate,” Balachandran said. “Then there’s the regular gate and the transistor. When you erase it, you remove the charge from the floating gate. Also, this floating gate requires a thicker oxide, and not all processes offer that. This is why MTP scaling basically stopped at 40nm and 28nm. Beyond that, it’s difficult to do it because the oxide thickness is not there to make it happen.”

However, if NVM could be embedded in the same logic process without having to make tweaks to the process, then the costs are more manageable, and this is exactly what Synopsys was after with its acquisition of Sidense and Kilopass, both of which developed versions of OTP NVM.

“Synopsys saw what was happening, which is why we acquired two companies in the OTP space, because the OTP technology doesn’t require the thicker oxide that is required for the MTP, and there is no floating gate,” Balachandran said. “In this technology, called antifuse, the principle of a cell is quite different from a traditional flash type of cell in the sense that there is no charge that is being trapped or being removed from the so-called floating gate. Instead, we break the oxide to allow conduction in the channel. By applying high voltage to break the oxide, you then have conduction in the channel, and that becomes the 1. That’s how you distinguish between 0 and 1. Once you break the oxide, it’s not reversible, and since it’s not reversible, you cannot change it from a 1 to 0 again. Once you’ve made it a 1, it stays a 1, so that means it’s only one-time programmable.”
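The one-way behavior Balachandran describes can be captured in a toy software model: every bit starts at 0 (oxide intact), programming can move a bit to 1 (oxide broken), and nothing can move it back. The class and exception names here are hypothetical, and the model ignores everything electrical. It is only meant to show the write rule.

```python
class OTPError(Exception):
    """Raised when a write would require un-breaking an oxide."""


class AntifuseOTP:
    """Toy model of an antifuse array: bits start at 0 and can only go to 1."""

    def __init__(self, num_bits: int) -> None:
        # An intact oxide does not conduct, which this model reads as 0.
        self._bits = [0] * num_bits

    def read(self, index: int) -> int:
        return self._bits[index]

    def program(self, index: int, value: int) -> None:
        if value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        if self._bits[index] == 1 and value == 0:
            # The oxide is already broken; there is no way to restore it.
            raise OTPError(f"bit {index} is already 1 and cannot return to 0")
        # Writing 0 over 0 leaves the oxide intact; writing 1 breaks it.
        self._bits[index] |= value


otp = AntifuseOTP(num_bits=8)
otp.program(3, 1)
print(otp.read(3))  # 1

try:
    otp.program(3, 0)
except OTPError as err:
    print("rejected:", err)
```

Re-writing a 1 over an existing 1 is harmless in this model, while any attempt to clear a programmed bit is rejected. That is the property that makes the memory one-time programmable.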

Interestingly, this OTP NVM is scalable, he pointed out, “with no fundamental limit as to where to go because every logic process has an oxide and you can always find the right voltage to break the oxide, so there’s nothing physically limiting us from doing that.”

Potential applications for this technology include unique chip IDs, as well as calibrations for analog circuits, which tend to drift over time. That includes digital-to-analog converters where the trim settings are stored inside, and pixel correction for pixels that do not appear as they should. With pixel correction, the display driver chip has the OTP in it, so that on power-up the chip can read it and correct the pixels based on coefficients stored in that OTP.

“You might want to use this OTP NVM for security,” Balachandran said. “Let’s say you have a mobile chip and you’re trying to use Alipay or Google Pay or Apple Pay, and you want to authenticate. OTP NVM could be used for key storage, typically once. You don’t need to re-program it that many times. You’re not going to keep changing your Apple ID. It’s not a huge number of bits, but you store it and then you can use it to authenticate. Also, for financial transactions, it can be used to authenticate copyrighted information. In a set-top box application, it can make sure that the set-top box user is not hacking it and getting the content for free. Here, the main SoC can implement hardware security instead of implementing it at the software level. This is more fundamental, and it’s much harder to hack, so you store it in an OTP. Then, when the chip powers up, it reads this key from the OTP and only if you bought the access, then you can view the particular content.”
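As a rough illustration of how a stored key might be used to authenticate, the sketch below runs a challenge-response exchange with HMAC-SHA256 from Python’s standard library. The article does not describe any particular scheme, so the choice of HMAC is an assumption, the function names are hypothetical, and the key is generated in software purely as a stand-in for a value that real hardware would read from OTP at power-up.

```python
import hashlib
import hmac
import secrets

# Stand-in for a key read from OTP at power-up (hypothetical; real hardware
# would read it from the OTP macro rather than generate it in software).
DEVICE_KEY = secrets.token_bytes(32)


def sign_challenge(key: bytes, challenge: bytes) -> bytes:
    """Device side: prove possession of the key without revealing it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_response(key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the tag and compare in constant time."""
    expected = sign_challenge(key, challenge)
    return hmac.compare_digest(expected, response)


challenge = secrets.token_bytes(16)  # fresh random value per attempt
response = sign_challenge(DEVICE_KEY, challenge)

print(verify_response(DEVICE_KEY, challenge, response))               # True
print(verify_response(secrets.token_bytes(32), challenge, response))  # a different key fails
```

Because the key never changes after provisioning, a write-once memory is sufficient for it, which matches the point that the key is typically stored once.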

With applications across a wide spectrum demanding specialized memory access, engineering teams have their work cut out for them when it comes to choosing how and where to place NVM. The benefits of smaller geometries need to be balanced against cost, and that balance will vary greatly by application.

In AI, for example, smaller nodes are required just to fit more processors and accelerators, which in turn requires higher voltages. But whether NVM is kept on-chip or moved off-chip isn’t always obvious, particularly in cases where that memory is used infrequently or where adding a different voltage rail is required.
