Soft Error Resilience at Near-Zero Cost

Cited by: 1
Authors
Zeng, Jianping [1]
Huang, Shao-Yu [1]
Liu, Jiuyang [2]
Jung, Changhee [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Source
PROCEEDINGS OF THE 38TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACM ICS 2024 | 2024
Keywords
soft error resilience; compiler; computer architecture; OPTIMIZATIONS;
DOI
10.1145/3650200.3656605
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Among existing schemes for soft error resilience, acoustic-sensor-based detection stands out owing to its ability to prevent silent data corruption at low hardware cost. However, the state-of-the-art work not only incurs considerable run-time overhead but also complicates the processor pipeline with intrusive micro-architectural modifications, hindering its practical deployment in real silicon. To this end, this paper presents VeriPipe, a near-zero-cost compiler/architecture codesign scheme for soft error resilience. The VeriPipe compiler statically partitions the input program into a series of regions (epochs), while the VeriPipe hardware dynamically verifies that they are error-free. In particular, VeriPipe achieves simple yet efficient region-level verification in which each region goes through a three-stage (Execute, Verify, and Commit) verification pipeline to ensure the absence of soft errors before proceeding to the next region. Moreover, the VeriPipe hardware overlaps the Verify stage of each region with the Execute stage of the next region, thereby effectively hiding the Verify delay. Experiments with 43 applications from SPEC2006/2017, NPB-CPP, SPLASH3, and DoE Mini-Apps highlight the negligible overheads of VeriPipe: an average run-time overhead of 1% and a storage overhead of only 3 registers and 1 countdown timer.
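The benefit of overlapping Verify with the next region's Execute can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the VeriPipe hardware design: the function names, the fixed `verify_delay`, the zero-cost Commit stage, and the rule that at most one region is in Verify at a time are all hypothetical simplifications introduced here.

```python
from typing import Sequence


def serial_time(exec_times: Sequence[int], verify_delay: int) -> int:
    """Total time when each region waits for its own Verify before the next starts.

    Assumption: Commit is free, so each region costs Execute + Verify.
    """
    return sum(t + verify_delay for t in exec_times)


def overlapped_time(exec_times: Sequence[int], verify_delay: int) -> int:
    """Total time when region i's Verify overlaps region i+1's Execute.

    Assumption (hypothetical): only one region may be in Verify at a time,
    so a region cannot finish executing before the previous region has
    been verified. Each later region therefore occupies
    max(its Execute time, verify_delay) on the timeline.
    """
    if not exec_times:
        return 0
    exec_end = exec_times[0]  # the first region has nothing to overlap with
    for t in exec_times[1:]:
        exec_end += max(t, verify_delay)
    # The last region's Verify cannot be hidden behind a following region.
    return exec_end + verify_delay


if __name__ == "__main__":
    regions = [10, 10, 10]  # hypothetical Execute times in cycles
    delay = 4               # hypothetical Verify delay in cycles
    print("serial:    ", serial_time(regions, delay))
    print("overlapped:", overlapped_time(regions, delay))
```

In this model the Verify delay is hidden whenever the next region runs at least as long as `verify_delay`; only the final region's Verify remains exposed.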
Pages: 176-187
Page count: 12