Reliability Evaluation and Fault Tolerance Design for FPGA Implemented Reed Solomon (RS) Erasure Decoders

Cited by: 1
Authors
Gao, Zhen [1]
Shi, Jinchang [1]
Liu, Qiang [1]
Ullah, Anees [2]
Reviriego, Pedro [3]
Affiliations
[1] Tianjin Univ, Tianjin Int Engn Inst, Sch Elect & Informat Engn, Sch Microelect, Tianjin 300072, Peoples R China
[2] Univ Engn & Technol, Dept Elect Engn, Peshawar 220101, Abbottabad, Pakistan
[3] Univ Politecn Madrid, Dept Ingn Sistemas Telemat, Madrid 28040, Spain
Keywords
Duplication with comparison (DWC); fault-tolerant; field-programmable gate array (FPGA); Reed Solomon erasure codes (RS-ECs); single event upsets (SEUs);
DOI
10.1109/TVLSI.2022.3224137
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Reed-Solomon erasure codes (RS-ECs) are widely applied in storage and packet communication systems to recover erasures. When implemented on a field-programmable gate array (FPGA) in a space platform, the RS-EC decoder will suffer single event upsets (SEUs) that can cause failures. In this brief, the reliability of an RS-EC decoder implemented on an FPGA against errors in the configuration memory is first studied through hardware SEU injection experiments. We found that reliability is lower for a larger number of erased symbols, but about 85% of SEUs can still be tolerated by the decoder itself, even for the maximum number of erased symbols within the recovery capability. In addition, around 10%-25% of SEUs on critical bits can cause system exceptions. Based on these results, a duplication with comparison (DWC) scheme is proposed to protect the RS-EC decoder. In particular, a checksum parity-based approach is proposed to detect the faulty decoder while reducing the computation overhead. Experimental results show that the reliability of the DWC-protected RS-EC decoder against SEUs in the configuration memory is almost the same as that of traditional triple modular redundancy (TMR) protection, while the resource usage is only about 2.15x that of the unprotected decoder.
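The DWC idea in the abstract can be illustrated with a small software sketch. The abstract does not give the details of the checksum parity-based approach, so everything below is an assumption made for illustration: a toy single-parity XOR erasure code stands in for the RS-EC, a hypothetical additive checksum (sum modulo 256) stands in for the checksum parity, and the names decode, faulty_decode, and dwc_decode are invented here. It is not the authors' hardware design. Two decoder copies run on the same input; if their outputs differ, the checksum is used to pick the copy that is still correct.

```python
from functools import reduce
from typing import Callable, List, Optional, Tuple

Symbols = List[int]
Decoder = Callable[[List[Optional[int]], int], Symbols]


def xor_all(symbols: Symbols) -> int:
    """XOR of all symbols (the toy code's parity symbol)."""
    return reduce(lambda a, b: a ^ b, symbols, 0)


def checksum(data: Symbols) -> int:
    """Hypothetical block checksum: sum of the data symbols modulo 256."""
    return sum(data) % 256


def decode(received: List[Optional[int]], parity: int) -> Symbols:
    """Toy erasure decoder (assumption: stands in for an RS-EC decoder).

    Recovers at most one erased symbol, marked as None, using XOR parity.
    """
    erased = [i for i, s in enumerate(received) if s is None]
    if len(erased) > 1:
        raise ValueError("toy code can recover only one erasure")
    out = [0 if s is None else s for s in received]
    if erased:
        # The erased slot currently holds 0, so xor_all(out) covers known symbols only.
        out[erased[0]] = xor_all(out) ^ parity
    return out


def faulty_decode(received: List[Optional[int]], parity: int) -> Symbols:
    """Decoder copy with an injected fault: flips one bit of the first output symbol."""
    out = decode(received, parity)
    out[0] ^= 0x01
    return out


def dwc_decode(
    received: List[Optional[int]],
    parity: int,
    expected_checksum: int,
    copy_a: Decoder = decode,
    copy_b: Decoder = decode,
) -> Tuple[Symbols, str]:
    """Duplication with comparison: run two copies, compare, arbitrate by checksum."""
    a = copy_a(received, parity)
    b = copy_b(received, parity)
    if a == b:
        return a, "agree"
    # Mismatch: the checksum is computed only now, to identify the faulty copy.
    if checksum(a) == expected_checksum:
        return a, "copy_b_faulty"
    if checksum(b) == expected_checksum:
        return b, "copy_a_faulty"
    raise RuntimeError("both copies fail the checksum")


data = [0x12, 0x34, 0x56, 0x78]
received = [0x12, None, 0x56, 0x78]  # second symbol erased
ok_result = dwc_decode(received, xor_all(data), checksum(data))
fault_result = dwc_decode(received, xor_all(data), checksum(data), copy_a=faulty_decode)
```

In this sketch the checksum is evaluated only when the two copies disagree, which is one plausible reading of how such a check can keep overhead below that of a third full decoder copy as in TMR.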
Pages: 142-146
Number of pages: 5
Related papers
50 in total
  • [31] Comparison of FPGA and Microcontroller Implementations of an Innovative Method for Error Magnitude Evaluation in Reed-Solomon Codes
    Bianchi, Valentina
    Bassoli, Marco
    De Munari, Ilaria
    ELECTRONICS, 2020, 9 (01)
  • [32] Software Implemented Fault Detection And Fault Tolerance Mechanisms - PART II: Experimental evaluation of error
    Gawkowski, Piotr
    Sosnowski, Janusz
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2005, 51 (03) : 495 - 508
  • [33] A fault tolerance protocol for uploads: Design and evaluation
    Cheung, L
    Chou, CF
    Golubchik, L
    Yang, Y
    PARALLEL AND DISTRIBUTED PROCESSING AND APPLICATIONS, PROCEEDINGS, 2004, 3358 : 136 - 145
  • [34] FPGA On-Board Computer design based on Hierarchical Fault tolerance
    Xing, Lei
    Sun, Zhaowei
    Xu, Guodong
    2008 2ND INTERNATIONAL SYMPOSIUM ON SYSTEMS AND CONTROL IN AEROSPACE AND ASTRONAUTICS, VOLS 1 AND 2, 2008, : 1212 - 1216
  • [35] Design time reliability analysis of distributed fault tolerance algorithms
    Latronico, E
    Koopman, P
    2005 INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2005, : 486 - 495
  • [36] DESIGN STUDY OF SOFTWARE-IMPLEMENTED FAULT-TOLERANCE (SIFT) COMPUTER.
    WENSLEY, J.H.
    GOLDBERG, J.
    GREEN, M.W.
    KAUTZ, W.H.
    LEVITT, K.N.
    MILLS, M.E.
    SHOSTAK, R.E.
    WHITING-O'KEEFE, P.M.
    ZEIDLER, H.M.
    1982,
  • [37] Fault Tolerance in FPGA Architecture Using Hardware Controller-Design Approach
    Naseer, M.
    Sharma, Prashant
    Kshirsagar, Ravi
    2009 INTERNATIONAL CONFERENCE ON ADVANCES IN RECENT TECHNOLOGIES IN COMMUNICATION AND COMPUTING (ARTCOM 2009), 2009, : 906 - +
  • [38] Reliability Indicators for Automatic Design and Analysis of Fault-Tolerant FPGA Systems
    Lojda, Jakub
    Podivinsky, Jakub
    Kotasek, Zdenek
    2019 20TH IEEE LATIN AMERICAN TEST SYMPOSIUM (LATS), 2019,
  • [39] Validation of the Proposed Hardness Analysis Technique for FPGA Designs to Improve Reliability and Fault-Tolerance
    Khatri, Abdul Rafay
    Hayek, Ali
    Boeresoek, Josef
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (12) : 1 - 8
  • [40] Reliability Evaluation and Fault Tolerant Design for KLL Sketches
    Gao, Zhen
    Zhu, Jinhua
    Reviriego, Pedro
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2024, 12 (04) : 1002 - 1013