Predicting and mitigating single-event upsets in DRAM using HOTH

Cited by: 2
Authors
Longofono, Stephen [1 ]
Kline, Donald, Jr. [1 ]
Melhem, Rami [2 ]
Jones, Alex K. [1 ]
Affiliations
[1] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
[2] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
Funding
National Science Foundation (USA);
Keywords
Radiation test; Memory reliability; Fault map; DRAM; MEMORY; TECHNOLOGY; ECC;
DOI
10.1016/j.microrel.2020.114024
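Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract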
There is a growing demand for using commodity memory and storage solutions to make commercial aerospace ventures economically feasible. Existing radiation-hardened computer systems cannot meet this need alone. These hardened systems provide sufficient protection against the harsh environment of the upper atmosphere and low-Earth orbit, but they cost dramatically more and rely on commercially outdated architectures and fabrication technologies. If new aerospace systems can take advantage of the latest commodity memories, they can leverage advanced fabrication processes and economies of scale to control costs. Of course, such systems would require new strategies to maintain appropriate tolerance and/or resilience to faults from the harsh environment. In this work, we observe that single-event effects (SEEs) in recent-generation DRAM are not entirely random, and in fact are often highly predictable under neutron radiation bombardment. We demonstrate the existence of a small number of weak cells responsible for the vast majority of single-bit SEEs. Based on this observation, we present a memory fault mapping and tolerance approach called HOTH to mitigate these predictable fault modes in conjunction with more random/unpredictable SEEs in DDR3 memory. In HOTH, both single- and multi-bit effects can be mitigated individually at runtime using a combination of existing error-correcting code techniques in Chipkill ECC and a fault map framework. The HOTH fault map is stored in the same DRAM that is subject to SEEs and leverages a fault-tolerance approach to mitigate SEEs that might appear in that part of the storage. Using data from different memory DIMMs, form factors, and radiation incidence angles, we show that HOTH can improve the uncorrectable fault rate by at least ten orders of magnitude and increase mean-time-to-failure to thousands of years, allowing extended service times in harsh environments.
Pages: 12