Detecting Defect-Induced Silent Data Corruptions in CPUs (Stanford, Google)


Researchers from Stanford University and Google have published “ITHICA: Intra-Thread Instruction Checking Approach for Defect-Induced Silent Data Corruptions”. Abstract “Hyperscaler reports of silent data corruptions (SDCs)—presumed to be caused by silicon manufacturing defects—have motivated the development of functional tests for detecting defective CPUs and their use in h...

The Severity Of Test Escapes And SDCs Caused By Them (Google)


A new technical paper titled "Silent Data Corruption by 10x Test Escapes Threatens Reliable Computing" was published by Google. Abstract "Too many defective compute chips are escaping existing manufacturing tests -- at least an order of magnitude more than industrial targets across all compute chip types in data centers. Silent data corruptions (SDCs) caused by test escapes, when left unadd...

Mitigating Silent Data Corruptions in High Performance Computing


A new technical paper titled "Mitigating silent data corruptions in HPC applications across multiple program inputs" was published by researchers at the University of Iowa, Baidu Security, and Argonne National Lab. The paper was a Best Paper finalist at SC22. The researchers "propose MinpSID, an automated SID framework that automatically identifies and re-prioritizes incubative instructions in a...