Reliability Evaluation and Fault Tolerance Design for FPGA Implemented Reed Solomon (RS) Erasure Decoders

Cited by: 1
Authors
Gao, Zhen [1]
Shi, Jinchang [1]
Liu, Qiang [1]
Ullah, Anees [2]
Reviriego, Pedro [3]
Affiliations
[1] Tianjin Univ, Tianjin Int Engn Inst, Sch Elect & Informat Engn, Sch Microelect, Tianjin 300072, Peoples R China
[2] Univ Engn & Technol, Dept Elect Engn, Peshawar 220101, Abbottabad, Pakistan
[3] Univ Politecn Madrid, Dept Ingn Sistemas Telemat, Madrid 28040, Spain
Keywords
Duplication with comparison (DWC); fault-tolerant; field-programmable gate array (FPGA); Reed Solomon erasure codes (RS-ECs); single event upsets (SEUs);
DOI
10.1109/TVLSI.2022.3224137
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Reed-Solomon erasure codes (RS-ECs) are widely applied in storage and packet communication systems to recover erasures. When implemented on a field-programmable gate array (FPGA) in a space platform, the RS-EC decoder will suffer single event upsets (SEUs) that can cause failures. In this brief, the reliability of an FPGA-implemented RS-EC decoder against errors in the configuration memory is first studied through hardware SEU injection experiments. We found that the reliability is lower for a larger number of erased symbols, but about 85% of SEUs can still be tolerated by the decoder itself, even for the maximum number of erased symbols within the recovery capability. In addition, around 10%-25% of the SEUs on critical bits can cause system exceptions. Based on these results, a duplication with comparison (DWC) scheme is proposed to protect the RS-EC decoder. In particular, a checksum parity-based approach is proposed to detect the faulty decoder and reduce the computation overhead. Experimental results show that the reliability of the DWC-protected RS-EC decoder against SEUs in the configuration memory is almost the same as that of traditional triple modular redundancy (TMR) protection, while the resource usage is only about 2.15x that of the unprotected decoder.
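The abstract does not spell out how the checksum parity identifies the faulty copy, so the following is only a minimal sketch of one plausible reading of duplication with comparison, not the authors' design. It assumes that a trusted XOR checksum of the original data symbols is available alongside the codeword, and it uses a toy single-erasure XOR code as a stand-in for a real RS-EC decoder. All names (`xor_checksum`, `toy_erasure_decoder`, `dwc_decode`, `faulty_decoder`) are hypothetical.

```python
from functools import reduce
from typing import Callable, Iterable, List, Optional, Sequence

Decoder = Callable[[Sequence[Optional[int]]], List[int]]


def xor_checksum(symbols: Iterable[int]) -> int:
    """XOR of all symbols; stands in for the checksum parity (assumption)."""
    return reduce(lambda a, b: a ^ b, symbols, 0)


def toy_erasure_decoder(received: Sequence[Optional[int]]) -> List[int]:
    """Toy stand-in for an RS-EC decoder (assumption, not a real RS code).

    The codeword is the data symbols followed by one XOR parity symbol,
    so at most one erased symbol (marked None) can be recovered.
    """
    k = len(received) - 1
    missing = [i for i, s in enumerate(received) if s is None]
    if len(missing) > 1:
        raise ValueError("toy code recovers at most one erasure")
    symbols = list(received)
    if missing:
        # The XOR of all surviving symbols equals the erased symbol.
        symbols[missing[0]] = xor_checksum(s for s in received if s is not None)
    return symbols[:k]


def dwc_decode(
    received: Sequence[Optional[int]],
    decoder_a: Decoder,
    decoder_b: Decoder,
    expected_checksum: int,
) -> List[int]:
    """Run two decoder copies; use the checksum only when they disagree."""
    out_a = decoder_a(received)
    out_b = decoder_b(received)
    if out_a == out_b:
        return out_a  # copies agree, no checksum computation needed
    # Copies disagree: the copy whose output matches the checksum is kept.
    if xor_checksum(out_a) == expected_checksum:
        return out_a
    if xor_checksum(out_b) == expected_checksum:
        return out_b
    raise RuntimeError("both decoder copies failed the checksum")


def faulty_decoder(received: Sequence[Optional[int]]) -> List[int]:
    """Models an SEU-corrupted copy by flipping one bit of the first symbol."""
    out = toy_erasure_decoder(received)
    out[0] ^= 0x01
    return out


data = [0x12, 0x34, 0x56, 0x78]
checksum = xor_checksum(data)
received = [0x12, 0x34, None, 0x78, checksum]  # third data symbol erased
recovered = dwc_decode(received, faulty_decoder, toy_erasure_decoder, checksum)
```

In this sketch the checksum is evaluated only on a mismatch, which is one way a comparison-based scheme could avoid a third decoder copy. A single XOR checksum cannot catch every corruption pattern (two flips in the same bit position cancel out), so a real design would need a stronger check.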
Pages: 142-146
Page count: 5