silent data corruption (SDC)

Silent data corruption (SDC) refers to undetected errors in the data output of an integrated circuit. These errors are caused by random manufacturing defects in a chip, and whether a given defect produces an error is highly dependent on the specific instruction sequence being executed.

As defined in the publicly released v0.3 Resilience Workstream specification from the Open Compute Project (OCP): a faulty chip is categorized as SDC-causing if an identified bug-free sequence of instructions (e.g., a test created by screening tools or an actual application workload) delivers incorrect results without any indication from the built-in error detection mechanisms.

This definition was informed by an earlier theoretical attempt at a definition found in C. Constantinescu, I. Parulkar, R. Harper and S. Michalak, “Silent Data Corruption — Myth or reality?,” 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

Earlier papers and presentations on these errors have used various terms and definitions for the physical manifestations and associated behaviors attributed to SDCs, including “corrupt execution errors” and “silent data errors”. Note that SDCs should not be confused with soft errors caused by cosmic rays, which are transient events rather than the result of manufacturing defects.
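
The OCP definition turns on one observable condition: a bug-free instruction sequence yields a wrong result and nothing reports an error. The Python sketch below illustrates that condition only. It is not taken from the OCP specification or from any real screening tool. The names `workload` and `screen` are hypothetical, and the arithmetic is an arbitrary deterministic computation chosen for illustration. It assumes a reference result has already been obtained on hardware known to be good.

```python
import hashlib
from typing import List


def workload(seed: int, rounds: int = 1000) -> str:
    """Hypothetical deterministic workload: fixed integer arithmetic, hashed.

    On fault-free hardware the same seed always yields the same digest.
    """
    x = seed
    for _ in range(rounds):
        # Arbitrary 64-bit multiply/add/xor-shift mix (illustrative only).
        x = (x * 6364136223846793005 + 1442695040888963407) % 2**64
        x ^= x >> 33
    return hashlib.sha256(x.to_bytes(8, "little")).hexdigest()


def screen(reference: str, seed: int, repeats: int = 5) -> List[int]:
    """Re-run the workload and return the indices of runs that mismatch.

    No exception or hardware error is involved: a mismatch here is the
    only sign that something went wrong, which is what makes it "silent".
    """
    return [i for i in range(repeats) if workload(seed) != reference]


if __name__ == "__main__":
    # Assumption: in practice the reference comes from known-good hardware.
    reference = workload(42)
    mismatches = screen(reference, 42)
    print("mismatching runs:", mismatches)
```

A non-empty result from `screen` would mean the computation completed normally yet disagreed with the reference. Because the interpreter, not the programmer, decides which machine instructions run, a high-level sketch like this cannot target a specific instruction sequence; it only shows the compare-against-reference idea.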