Mitigating Silent Data Corruptions in High Performance Computing


A new technical paper titled “Mitigating silent data corruptions in HPC applications across multiple program inputs” was published by researchers at the University of Iowa, Baidu Security, and Argonne National Laboratory. The paper was a Best Paper finalist at SC22.

The researchers “propose MinpSID, an automated SID [selective instruction duplication] framework that automatically identifies and re-prioritizes incubative instructions in a given program to enhance SDC [silent data corruption] coverage. Evaluation shows MinpSID can effectively mitigate the loss of SDC coverage across multiple inputs,” states the paper.

The technical paper was published in November 2022. Presentation slides are also available.

Huang, Yafan, et al. “Mitigating silent data corruptions in HPC applications across multiple program inputs.” Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 2022.

