Application-level fault tolerance as a complement to system-level fault tolerance

Cited by: 14
Authors
Haines, J [1 ]
Lakamraju, V [1 ]
Koren, I [1 ]
Krishna, CM [1 ]
Affiliations
[1] Univ Massachusetts, Dept Elect & Comp Engn, Amherst, MA 01003 USA
Source
JOURNAL OF SUPERCOMPUTING | 2000 / Vol. 16 / No. 01
Keywords
distributed real-time systems; fault tolerance; checkpointing; imprecise computation; target tracking; beam forming;
DOI
10.1023/A:1008181429693
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
As multiprocessor systems become more complex, their reliability will need to increase as well. In this paper we propose a novel technique which is applicable to a wide variety of distributed real-time systems, especially those exhibiting data parallelism. System-level fault tolerance involves reliability techniques incorporated within the system hardware and software whereas application-level fault tolerance involves reliability techniques incorporated within the application software. We assert that, for high reliability, a combination of system-level fault tolerance and application-level fault tolerance works best. In many systems, application-level fault tolerance can be used to bridge the gap when system-level fault tolerance alone does not provide the required reliability. We exemplify this with the RTHT target tracking benchmark and the ABF beamforming benchmark.
Pages: 53 - 68
Number of pages: 16
Related papers
50 items in total
  • [31] Instruction-Level Fault Tolerance Configurability
    Demid Borodin
    B. H. H. (Ben) Juurlink
    Said Hamdioui
    Stamatis Vassiliadis
    Journal of Signal Processing Systems, 2009, 57 : 89 - 105
  • [32] Multi Level Fault Tolerance in Cloud Environment
    Devi, K.
    Paulraj, D.
    2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2017, : 824 - 828
  • [33] Instruction-level fault tolerance configurability
    Borodin, Demid
    Juurlink, B. H. H.
    Vassiliadis, Stamatis
    IC-SAMOS: 2007 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION, PROCEEDINGS, 2007, : 110 - +
  • [34] Instruction-Level Fault Tolerance Configurability
    Borodin, Demid
    Juurlink, B. H. H.
    Hamdioui, Said
    Vassiliadis, Stamatis
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2009, 57 (01): : 89 - 105
  • [35] Node grouping in system-level fault diagnosis
    Dafang Zhang
    Gaogang Xie
    Yinghua Min
    Journal of Computer Science and Technology, 2001, 16 : 474 - 479
  • [36] An Evolutionary Approach to System-Level Fault Diagnosis
    Yang, Hui
    Elhadef, Mourad
    Nayak, Amiya
    Yang, Xiaofan
    2009 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-5, 2009, : 1406 - +
  • [37] Node Grouping in System-Level Fault Diagnosis
    张大方
    谢高岗
    闵应骅
    Journal of Computer Science and Technology, 2001, (05) : 474 - 479
  • [38] SYSTEM-LEVEL FAULT-DIAGNOSIS - A SURVEY
    KREUTZER, SE
    HAKIMI, SL
    MICROPROCESSING AND MICROPROGRAMMING, 1987, 20 (4-5): : 323 - 330
  • [39] Node grouping in system-level fault diagnosis
    Zhang, DF
    Xie, GG
    Min, YH
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2001, 16 (05) : 474 - 479
  • [40] Towards runtime system level fault tolerance for a distributed functional language
    Trinder, P
    Pointon, R
    Loidl, HW
    TRENDS IN FUNCTIONAL PROGRAMMING, VOL 2, 2000, : 103 - 114