A Novel Approach for Handling Soft Error in Conjugate Gradients

Authors
Ozturk, Muhammed Emin [1]
Renardy, Marissa [2]
Li, Yukun [2]
Agrawal, Gagan [1]
Chou, Ching-Shan [2]
Affiliations
[1] Ohio State Univ, Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Math, Columbus, OH 43210 USA
Keywords
Fault tolerance; Soft errors; Iterative solvers; Conjugate gradients
DOI
10.1109/HiPC.2018.00030
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Soft errors, or bit flips, have recently become an important challenge in high performance computing. In this paper, we focus on soft errors in a particular algorithm: conjugate gradients (CG). We present a series of techniques to detect soft errors in CG. We first derive a mathematical quantity that is monotonically decreasing. Next, we add a set of heuristics and combine our approach with previously established methods. We have extensively evaluated our method along three distinct dimensions. First, we show that the F-score of our detection is significantly better than that of two other methods. Second, we show that for soft errors that are not detected by our method, the resulting inaccuracy in the final results is small, and smaller than with the other methods. Finally, we show that the runtime overheads of our method are lower than those of the other methods.
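The idea of detecting a soft error through a quantity that should only decrease can be illustrated with a minimal sketch. The abstract does not say which quantity the authors derive, so this example assumes a textbook stand-in: the energy functional f(x) = 1/2 x^T A x - b^T x, which CG decreases at every step in exact arithmetic when A is symmetric positive definite. The function names, the `flip_at` parameter, and the size of the injected corruption are all hypothetical and are not taken from the paper.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def energy(A, b, x):
    # Assumed stand-in quantity (not the paper's): f(x) = 1/2 x^T A x - b^T x.
    # For symmetric positive definite A, CG decreases f at every iteration.
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)


def cg_with_monotonicity_check(A, b, iters, flip_at=None, tol=1e-12):
    """Run CG from x = 0; return (x, k) where k is the iteration at which
    the energy increased, or None if it never increased."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for x = 0
    p = list(r)          # first search direction
    rs = dot(r, r)
    prev = energy(A, b, x)
    for k in range(iters):
        if rs < tol:     # converged: stop before dividing by a tiny value
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        if k == flip_at:
            x[0] += 1e3  # hypothetical corruption standing in for a bit flip
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        cur = energy(A, b, x)
        # Small relative slack so rounding noise is not flagged as an error.
        if cur > prev + 1e-9 * max(1.0, abs(prev)):
            return x, k
        prev = cur
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, None


# Small symmetric positive definite test system (tridiagonal 2, -1).
n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n

x_clean, detected_clean = cg_with_monotonicity_check(A, b, iters=10)
x_bad, detected_bad = cg_with_monotonicity_check(A, b, iters=10, flip_at=1)
```

In this sketch the clean run never trips the check, while the run with a large injected perturbation is flagged at the iteration where it occurs. A check of this kind cannot catch corruptions small enough to leave the energy decreasing, which is presumably one reason the abstract mentions additional heuristics and an F-score evaluation.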
Pages: 193-202
Page count: 10