FAIL*: An Open and Versatile Fault-Injection Framework for the Assessment of Software-Implemented Hardware Fault Tolerance

Cited by: 41
Authors
Schirmeier, Horst [1]
Hoffmann, Martin [2]
Dietrich, Christian [2]
Lenz, Michael [1]
Lohmann, Daniel [2]
Spinczyk, Olaf [1]
Affiliations
[1] Tech Univ Dortmund, Dept Comp Sci 12, Dortmund, Germany
[2] Univ Erlangen Nurnberg, Chair Distributed Syst & Operating Syst, Erlangen, Germany
Keywords
DEPENDABILITY; ERRORS
DOI
10.1109/EDCC.2015.28
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Due to voltage and structure shrinking, the influence of radiation on a circuit's operation increases, resulting in future hardware designs exhibiting much higher rates of soft errors. Software developers have to cope with these effects to ensure functional safety. However, software-based hardware fault tolerance is a holistic property that is tricky to achieve in practice, potentially impaired by every single design decision. We present FAIL*, an open and versatile architecture-level fault-injection (FI) framework for the continuous assessment and quantification of fault tolerance in an iterative software development process. FAIL* supplies the developer with reusable and composable FI campaigns, advanced pre- and post-processing analyses to easily identify sensitive spots in the software, well-abstracted back-end implementations for several hardware and simulator platforms, and scalability of FI campaigns by providing massive parallelization. We describe FAIL*, its application to the development process of safety-critical software, and the lessons learned from a real-world example.
Pages: 245-255
Page count: 11