FAIL*: An Open and Versatile Fault-Injection Framework for the Assessment of Software-Implemented Hardware Fault Tolerance

Cited by: 41
Authors
Schirmeier, Horst [1]
Hoffmann, Martin [2]
Dietrich, Christian [2]
Lenz, Michael [1]
Lohmann, Daniel [2]
Spinczyk, Olaf [1]
Affiliations
[1] Tech Univ Dortmund, Dept Comp Sci 12, Dortmund, Germany
[2] Univ Erlangen Nurnberg, Chair Distributed Syst & Operating Syst, Erlangen, Germany
Keywords
DEPENDABILITY; ERRORS;
DOI
10.1109/EDCC.2015.28
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Due to voltage and structure shrinking, the influence of radiation on a circuit's operation increases, resulting in future hardware designs exhibiting much higher rates of soft errors. Software developers have to cope with these effects to ensure functional safety. However, software-based hardware fault tolerance is a holistic property that is tricky to achieve in practice, potentially impaired by every single design decision. We present FAIL*, an open and versatile architecture-level fault-injection (FI) framework for the continuous assessment and quantification of fault tolerance in an iterative software development process. FAIL* supplies the developer with reusable and composable FI campaigns, advanced pre- and post-processing analyses to easily identify sensitive spots in the software, well-abstracted back-end implementations for several hardware and simulator platforms, and scalability of FI campaigns by providing massive parallelization. We describe FAIL*, its application to the development process of safety-critical software, and the lessons learned from a real-world example.
Pages: 245-255
Page count: 11
Related papers
50 items in total
  • [31] Efficient Software-Implemented HW Fault Tolerance for TinyML Inference in Safety-critical Applications
    Sharif, Uzair
    Mueller-Gritschneder, Daniel
    Stahl, Rafael
    Schlichtmann, Ulf
    2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023,
  • [32] A-SOFT-AES: Self-Adaptive Software-Implemented Fault-Tolerance for AES
    Oboril, Fabian
    Sagar, Ilias
    Tahoori, Mehdi B.
    PROCEEDINGS OF THE 2013 IEEE 19TH INTERNATIONAL ON-LINE TESTING SYMPOSIUM (IOLTS), 2013, : 104 - 109
  • [33] A STUDY ABOUT SOFTWARE-IMPLEMENTED FAULT INJECTION STRATEGY FOR DIGITAL RPS IN NUCLEAR POWER PLANT
    Xi, Wang
    Gu, Pengfei
    Bai, Tao
    Liu, Wei
    Chen, Weihua
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON NUCLEAR ENGINEERING, 2017, VOL 1, 2017,
  • [34] SWIFT: Software implemented fault tolerance
    Reis, GA
    Chang, J
    Vachharajani, N
    Rangan, R
    August, DI
    CGO 2005: INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2005, : 243 - 254
  • [35] Software implemented fault tolerance in hypercube
    Avresky, DR
    Geoghegan, S
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 515 - 518
  • [36] SOFTWARE IMPLEMENTED FAULT TOLERANCE - A METHODOLOGY
    LOMBARDI, F
    RODA, VO
    MICROELECTRONICS AND RELIABILITY, 1982, 22 (04): : 873 - 886
  • [37] Tests and tolerances for high-performance software-implemented fault detection
    Turmon, M
    Granat, R
    Katz, DS
    Lou, JZ
    IEEE TRANSACTIONS ON COMPUTERS, 2003, 52 (05) : 579 - 591
  • [38] AVR microcontroller simulator for a software implemented hardware fault tolerance algorithms research
    Piotrowski, Adam
    Tarnowski, Szymon
    Napieralski, Andrzej
    PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2007, PTS 1 AND 2, 2007, 6937
  • [39] FAIL-FCI: Versatile fault injection
    Hoarau, William
    Tixeuil, Sebastien
    Vauchelles, Fabien
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS, 2007, 23 (07): : 913 - 919
  • [40] Assessing software implemented fault detection and fault tolerance mechanisms
    Gawkowski, P
    Sosnowski, J
    ATS 2003: 12TH ASIAN TEST SYMPOSIUM, PROCEEDINGS, 2003, : 462 - 467