
Data Integrity For JEDEC DRAM Memories

Multiple approaches are needed to deal with data errors in the latest high-speed memories.


As DRAM fabrication advances from 1x to 1y to 1z and on to the 1a, 1b, and 1c nodes, and as device speeds rise to 8533 MT/s for LPDDR5 and 8800 MT/s for DDR5, data integrity is becoming a critical concern. OEMs and other users must account for it in any system that depends on the correctness of data stored in DRAM to work as designed.

It’s a complicated problem, and no single technique solves it.

Traditionally, one of the main approaches to dealing with data errors is ECC. ECC requires additional memory storage: the ECC codes are calculated and stored when data is written to DRAM, then read back along with the data during reads and checked against it to make sure there are no errors. Typical ECC schemes use Hamming codes that provide single-bit error correction and double-bit error detection per burst. While several previous generations of DRAM required the Host to set aside system memory for ECC storage, the latest DRAMs such as LPDDR5 and DDR5 support on-die ECC as part of normal DRAM operation, enabled through mode registers. DDR5 further requires the Host to run an ECC Error Check and Scrub (ECS) cycle on average once every tECSint (Average Periodic ECS Interval) to prevent data errors.
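To make the single-bit correction and double-bit detection behavior concrete, here is a minimal sketch of an extended Hamming code in Python. It is an illustration only: the function names `encode` and `decode`, the 8-bit data word, and the bit layout are assumptions made for this example and do not reflect the codeword size or layout that any particular DRAM or controller uses.

```python
def encode(data_bits):
    """Encode a list of 0/1 data bits as an extended Hamming codeword.

    Layout (an illustrative choice): index 0 holds an overall parity bit,
    power-of-two indices hold Hamming parity bits, the rest hold data.
    """
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:      # number of Hamming parity bits needed
        r += 1
    n = m + r
    code = [0] * (n + 1)
    it = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(it)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:
                parity ^= code[pos]
        code[p] = parity             # makes parity over group p even
    code[0] = sum(code[1:]) % 2      # overall parity enables double-error detection
    return code


def decode(code):
    """Return (data_bits, status); status is 'ok', 'corrected', or 'double_error'."""
    code = list(code)
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos          # XOR of set positions is 0 for a valid codeword
    overall_odd = sum(code) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd and syndrome <= n:
        code[syndrome] ^= 1          # syndrome 0 means the overall parity bit flipped
        status = "corrected"
    else:
        status = "double_error"      # detected but not correctable
    data = [code[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return data, status


word = [1, 0, 1, 1, 0, 0, 1, 0]
stored = encode(word)

one_flip = list(stored)
one_flip[5] ^= 1                     # simulate a single-bit error
two_flips = list(stored)
two_flips[3] ^= 1                    # simulate a double-bit error
two_flips[9] ^= 1

print(decode(stored)[1], decode(one_flip)[1], decode(two_flips)[1])
```

In the double-error case the returned data bits cannot be trusted; only the status is meaningful, which is why a system would typically report the error rather than consume the data.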

Failing to meet the DRAM refresh requirement is a major cause of data loss. Meeting it can be challenging because PVT (process, voltage, temperature) variation can change the refresh requirement over time. Putting the DRAM in Self Refresh mode offloads refresh tracking to the DRAM, but it may prevent the Host from making other scheduling optimizations, so the trade-off should be considered carefully.

Other factors that can affect DRAM data include:

  1. Row hammer, where the same or adjacent rows are activated repeatedly, causing data in rows that were not being addressed to be lost or altered. The latest DRAMs such as LPDDR5 and DDR5 support Refresh Management (including DRFM and ARFM), which allows the Host to compensate by issuing dedicated RFM commands, helping the DRAM deal with potential data loss from row hammer attacks.
  2. Device temperature is another important factor that the Host needs to be aware of. If the application requires the DRAM to operate at elevated temperature, the user needs to check with the DRAM vendor on the temperature range over which the DRAM can operate. Above a certain temperature, data integrity is not assured regardless of refresh rate unless the DRAM is manufactured to withstand it.
  3. Loss of power will cause the DRAM to lose all its contents. If this is a real concern for the system designer, they should consider NVDIMM-N devices, which have an on-chip controller and a power source that is just enough to allow the DRAM contents to be copied into a backup non-volatile memory before power is lost. When power is restored, the contents saved in the non-volatile memory are written back to the DRAM, and the system can continue to operate as it did before the power loss.
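The Refresh Management idea in item 1 can be sketched as a per-bank activation counter on the Host side that triggers an RFM command once a threshold is crossed. This is a simplified illustration, not the JEDEC-defined behavior: the class `ActivationTracker`, its method names, and the threshold value are all hypothetical, and real thresholds and decrement rules come from the device's mode registers and the applicable specification.

```python
from collections import defaultdict


class ActivationTracker:
    """Toy host-side model: count ACT commands per bank, issue RFM at a threshold."""

    def __init__(self, rfm_threshold=32):
        # Hypothetical threshold; a real controller reads this from the DRAM.
        self.rfm_threshold = rfm_threshold
        self.counts = defaultdict(int)   # rolling activation count per bank
        self.rfm_issued = []             # log of banks that received an RFM

    def activate(self, bank):
        """Record one row activation on the given bank."""
        self.counts[bank] += 1
        if self.counts[bank] >= self.rfm_threshold:
            self.issue_rfm(bank)

    def issue_rfm(self, bank):
        """Model an RFM command: log it and give the bank credit for it."""
        self.rfm_issued.append(bank)
        self.counts[bank] = max(0, self.counts[bank] - self.rfm_threshold)

    def refresh(self, bank, credit):
        """Model a normal refresh, which also reduces the rolling count."""
        self.counts[bank] = max(0, self.counts[bank] - credit)


tracker = ActivationTracker(rfm_threshold=4)
for _ in range(10):
    tracker.activate(bank=0)             # hammer bank 0
tracker.activate(bank=1)                 # light traffic on bank 1

print(tracker.rfm_issued, dict(tracker.counts))
```

The point of the sketch is the bookkeeping: activations accumulate, and both RFM commands and ordinary refreshes pay the count back down, so a bank under sustained hammering receives extra refresh management while lightly used banks do not.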

For transmission and manufacturing errors, DRAMs support additional features such as CRC (cyclic redundancy check), DFE (decision feedback equalization), pre-emphasis, and PPR (post package repair).
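As a rough illustration of how a CRC catches transmission errors, the sketch below computes a bitwise CRC-8 over a data burst and shows that a single flipped bit changes the checksum. The function name `crc8` and the sample payload are made up for this example, and the polynomial 0x07 (x^8 + x^2 + x + 1) is assumed here as a common CRC-8 choice; the exact polynomial and the way data bits are mapped into the CRC for a given DRAM should be taken from the relevant JEDEC specification.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


burst = bytes([0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x23, 0x45, 0x67])
sent_crc = crc8(burst)                   # sender appends this to the burst

corrupted = bytearray(burst)
corrupted[2] ^= 0x10                     # one bit flips on the link
received_crc = crc8(bytes(corrupted))    # receiver recomputes over what arrived

error_detected = received_crc != sent_crc
print(hex(sent_crc), hex(received_crc), error_detected)
```

Unlike ECC, a CRC of this kind only detects the error; recovery relies on the receiver flagging it so the transfer can be retried.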

Cadence MMAV VIPs for DDR5/DDR5 DIMM and LPDDR5 are comprehensive VIP solutions that support all of the data integrity features listed above, including ECC error injection and SBE correction/DBE detection, to help with the verification challenges that data integrity issues pose.

More information on Cadence DDR5/LPDDR5 VIP is available at the Cadence VIP Memory Models website.


