A Thermal-Aware On-Line Fault Tolerance Method for TSV Lifetime Reliability in 3D-NoC Systems

Cited by: 6
Authors
Dang, Khanh N. [1]
Ahmed, Akram Ben [2]
Abdallah, Abderazek Ben [3]
Tran, Xuan-Tu [1]
Affiliations
[1] Vietnam Natl Univ, VNU Univ Engn & Technol VNU UET, Hanoi VNU, VNU Key Lab Smart Integrated Syst SISLAB, Hanoi 123106, Vietnam
[2] Natl Inst Adv Ind Sci & Technol, Tsukuba, Ibaraki 3058568, Japan
[3] Univ Aizu, Adapt Syst Lab, Aizu Wakamatsu 9658580, Japan
Keywords
Through-silicon vias; Redundancy; Circuit faults; Testing; Fault tolerant systems; Fault-tolerance; fault detection; parity check; through silicon via; real-time; thermal aware; THROUGH-SILICON; 3D; ARCHITECTURE; DEFECTS; REPAIR; CODES;
DOI
10.1109/ACCESS.2020.3022904
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Through-Silicon-Via (TSV) based 3D Integrated Circuits (3D-ICs) are among the most advanced architectures, offering low power consumption, shorter wire lengths, and a smaller footprint. However, 3D-ICs face lifetime reliability challenges caused by high operating temperatures and by interconnect reliability, especially that of the TSVs, which can significantly affect the accuracy of applications. In this paper, we present IaSiG, an online method that supports the detection and correction of lifetime TSV failures. By reusing the conventional recovery method and analyzing the output syndromes, IaSiG can locate and correct defective TSVs. Results show that, within a group, R redundant TSVs can fully localize and correct R defects and support the detection of R + 1 defects. Moreover, with G groups, the method can localize up to G × R defects and detect up to G × (R + 1) defects. An implementation of IaSiG for 32-bit data with eight groups and two redundancies per group has a worst-case execution time (WCET) of 5,152 cycles while supporting at most 16 defective TSVs (50% localization). We also integrate IaSiG into a 3D Network-on-Chip and apply a grid-search-based empirical method to insert a suitable number of redundancies into each TSV group. The empirical method uses the operating temperature as the fault acceleration factor, because temperature is one of the major issues of 3D-ICs. The results show that the proposed method reduces the number of redundancies compared with uniform insertion while still maintaining the required Mean Time to Failure.
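The capacity bounds quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration of those bounds, not the authors' implementation; the function name group_capacity and the constant names are hypothetical, and it assumes the two redundancies are counted per group.

```python
def group_capacity(groups: int, redundancy: int) -> tuple[int, int]:
    """Return (localizable, detectable) defect counts from the abstract's bounds.

    Hypothetical helper: localizable = G * R, detectable = G * (R + 1).
    """
    localizable = groups * redundancy
    detectable = groups * (redundancy + 1)
    return localizable, detectable


# Configuration quoted in the abstract: 32-bit data, eight groups,
# two redundant TSVs (assumed to be per group).
DATA_BITS = 32
GROUPS = 8
REDUNDANCY = 2

localizable, detectable = group_capacity(GROUPS, REDUNDANCY)
localization_ratio = localizable / DATA_BITS

print(localizable, detectable, localization_ratio)
```

With these inputs the localizable count is 16 of the 32 data TSVs, which matches the 50% localization figure stated in the abstract.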
Pages: 166642-166657
Page count: 16
Related papers
50 in total
  • [21] On-line Thermal-aware Task Management for Three-dimensional Dynamically Partially Reconfigurable Systems
    Wang, Yen-Wen
    Chen, Ya-Shu
    2013 IEEE 19TH INTERNATIONAL CONFERENCE ON EMBEDDED AND REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS (RTCSA), 2013, : 111 - 120
  • [22] Securet3d: An Adaptive, Secure, and Fault-Tolerant Aware Routing Algorithm for Vertically-Partially Connected 3D-NoC
    da Silva, Alexandre Almeida
    Nogueira, Lucas
    Coelho, Alexandre
    Silveira, Jarbas A. N.
    Marcon, Cesar
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2025, 33 (01) : 275 - 287
  • [23] Broadcast-TDMA: A Cost-Effective Fault-Tolerance Method for TSV Lifetime Reliability Enhancement
    Ni, Tianming
    Bian, Jingchang
    Yang, Zhao
    Nie, Mu
    Yao, Liang
    Huang, Zhengfeng
    Yan, Aibin
    Wen, Xiaoqing
    IEEE DESIGN & TEST, 2022, 39 (05) : 34 - 42
  • [24] LSTM-based Temperature Prediction and Hotspot Tracking for Thermal-aware 3D NoC System
    Cheng, Tong
    Du, Haoyu
    Li, Li
    Fu, Yuxiang
    18TH INTERNATIONAL SOC DESIGN CONFERENCE 2021 (ISOCC 2021), 2021, : 286 - 287
  • [25] High Performance Virtual Channel Based Fully Adaptive Thermal-aware Routing for 3D NoC
    Jiang, Xin
    Lei, Xiangyang
    Zeng, Lian
    Watanabe, Takahiro
    PROCEEDINGS OF THE EIGHTEENTH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED), 2017, : 289 - 295
  • [26] Cluster-Based Thermal-Aware Mapping for 3D-NoC-Based NASH Neuromorphic System
    Maatar, Mohamed
    Dang, Khanh N.
    Ben Abdallah, Abderazek
    2024 IEEE INTERNATIONAL CONFERENCE ON ADVANCED SYSTEMS AND EMERGENT TECHNOLOGIES, ICASET 2024, 2024,
  • [27] Adaptive fault-tolerant architecture and routing algorithm for reliable many-core 3D-NoC systems
    Ben Ahmed, Akram
    Ben Abdallah, Abderazek
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2016, 93-94 : 30 - 43
  • [28] Dynamic Programming-Based Runtime Thermal Management (DPRTM): An Online Thermal Control Strategy for 3D-NoC Systems
    Al-Dujaily, Ra'ed
    Dahir, Nizar
    Mak, Terrence
    Xia, Fei
    Yakovlev, Alex
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2013, 19 (01)
  • [29] Thermal-aware, heterogeneous materials for improved energy and reliability in 3D PCM architectures
    Saadeldeen, Heba
    Deng, Zhaoxia
    Sherwood, Timothy
    Chong, Frederic T.
    MEMSYS 2017: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS, 2017, : 223 - 236
  • [30] TSV-driven 3D ICs: An innovative thermal-aware and stress-reliable design strategy
    2016 IEEE INTERNATIONAL 3D SYSTEMS INTEGRATION CONFERENCE (3DIC), 2016,