BOUNDS ON ALGORITHM-BASED FAULT TOLERANCE IN MULTIPLE PROCESSOR SYSTEMS.

Cited by: 38
Authors
Banerjee, Prithviraj [1 ]
Abraham, Jacob A. [1 ]
Affiliation
[1] Univ of Illinois, Urbana, IL, USA
Keywords
COMPUTER PROGRAMMING - Algorithms; MATHEMATICAL PROGRAMMING, LINEAR; MATHEMATICAL TECHNIQUES - Graph Theory
DOI
10.1109/TC.1986.1676762
Abstract
The authors present a graph-theoretic model for determining upper and lower bounds on the number of checks needed for achieving concurrent fault detection and location. The objective is to estimate the overhead in time and the number of processors required for such a scheme. Faults in processors, errors in the data, and checks on the data to detect and locate errors are represented as a tripartite graph. Bounds on the time and processor overhead are obtained by considering a series of subproblems. First, using some crude concepts for t-fault detection and t-fault location, bounds on the maximum size of the error patterns that can arise from such fault patterns are obtained. Using these results, bounds are derived on the number of checks required for error detection and location. Some numerical results are derived from a linear programming formulation.
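The tripartite structure described in the abstract (processors, data, checks) can be illustrated with a small brute-force sketch. This is not the authors' method: the names `proc_data`, `check_data`, `detects`, and `is_t_fault_detecting` are hypothetical, and the detection rule used here (a check reliably flags between 1 and `h` erroneous data elements among those it covers) is an assumption made for illustration, not something stated in the abstract.

```python
from itertools import combinations


def nonempty_subsets(items, max_size=None):
    """Yield every nonempty subset of items, optionally capped at max_size."""
    items = sorted(items)
    limit = len(items) if max_size is None else min(max_size, len(items))
    for k in range(1, limit + 1):
        for combo in combinations(items, k):
            yield set(combo)


def detects(check_cover, h, errors):
    """Assumed rule: a check fires reliably if it sees 1..h erroneous data."""
    hit = len(check_cover & errors)
    return 1 <= hit <= h


def is_t_fault_detecting(proc_data, check_data, h, t):
    """Brute-force test: is every error pattern from <= t faults caught?

    proc_data:  processor -> set of data elements it can corrupt (assumed)
    check_data: check -> set of data elements it covers (assumed)
    """
    for faulty in nonempty_subsets(proc_data.keys(), t):
        # Data elements that may be erroneous under this fault pattern.
        reachable = set().union(*(proc_data[p] for p in faulty))
        for errors in nonempty_subsets(reachable):
            if not any(detects(cover, h, errors)
                       for cover in check_data.values()):
                return False
    return True


# Toy system: two processors, four data elements, two checks.
proc_data = {"p1": {"d1", "d2"}, "p2": {"d3", "d4"}}
check_data = {"c1": {"d1", "d3"}, "c2": {"d2", "d4"}}

print(is_t_fault_detecting(proc_data, check_data, h=1, t=1))
print(is_t_fault_detecting(proc_data, check_data, h=1, t=2))
```

In this toy system, two simultaneous faults can place two errors under one check and none under the other, so checks limited to `h=1` no longer suffice. The enumeration is exponential and only meant for tiny examples; the paper derives bounds rather than enumerating patterns.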
Pages: 296 - 306
Related papers
50 in total
  • [21] A High-dimensional Algorithm-Based Fault Tolerance Scheme
    Fu, Xiang
    Tang, Hao
    Liao, Huimin
    Huang, Xin
    Xu, Wubiao
    Meng, Shiman
    Zhang, Weiping
    Guo, Luanzheng
    Sato, Kento
    2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW, 2023, : 326 - 330
  • [22] Parallel Reduction to Hessenberg Form with Algorithm-Based Fault Tolerance
    Jia, Yulu
    Bosilca, George
    Dongarra, Jack J.
    2013 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2013,
  • [23] Algorithm-based fault tolerance for spaceborne computing: Basis and implementations
    Turmon, M
    Granat, R
    2000 IEEE AEROSPACE CONFERENCE PROCEEDINGS, VOL 4, 2000, : 411 - 420
  • [24] Algorithm-based fault tolerance applied to high performance computing
    Bosilca, George
    Delmas, Remi
    Dongarra, Jack
    Langou, Julien
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2009, 69 (04) : 410 - 416
  • [25] Efficacy and Efficiency of Algorithm-Based Fault-Tolerance on GPUs
    Wunderlich, Hans-Joachim
    Braun, Claus
    Raider, Sebastian
    PROCEEDINGS OF THE 2013 IEEE 19TH INTERNATIONAL ON-LINE TESTING SYMPOSIUM (IOLTS), 2013, : 240 - 243
  • [26] CONSTRUCTION OF CHECK SETS FOR ALGORITHM-BASED FAULT-TOLERANCE
    GU, DC
    ROSENKRANTZ, DJ
    RAVI, SS
    IEEE TRANSACTIONS ON COMPUTERS, 1994, 43 (06) : 641 - 650
  • [27] Algorithm-Based Fault Tolerance for Fail-Stop Failures
    Chen, Zizhong
    Dongarra, Jack
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2008, 19 (12) : 1628 - 1641
  • [28] ON FAULT TOLERANCE IN MANUFACTURING SYSTEMS.
    Chintamaneni, Prasad R.
    Jalote, Pankaj
    Shieh, Yuan-Bao
    Tripathi, Satish K.
    IEEE Network, 1988, 2 (03) : 32 - 39
  • [29] FAULT-TOLERANCE CONSIDERATIONS IN LARGE, MULTIPLE-PROCESSOR SYSTEMS
    KUHL, JG
    REDDY, SM
    COMPUTER, 1986, 19 (03) : 56 - 67
  • [30] ALMOST CERTAIN FAULT-DIAGNOSIS THROUGH ALGORITHM-BASED FAULT-TOLERANCE
    BLOUGH, DM
    PELC, A
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1994, 5 (05) : 532 - 539