harDNNing: a machine-learning-based framework for fault tolerance assessment and protection of DNNs

Cited by: 5
Authors
Traiola, Marcello [1]
Kritikakou, Angeliki [1]
Sentieys, Olivier [1]
Affiliations
[1] Univ Rennes, CNRS, INRIA, IRISA, Rennes, France
Keywords
Reliability Analysis; Fault Tolerance; Machine Learning; Neural Networks
DOI
10.1109/ETS56758.2023.10174178
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep Neural Networks (DNNs) show promising performance in several application domains, such as robotics, aerospace, smart healthcare, and autonomous driving. Nevertheless, DNN results may be incorrect, not only because of the network's intrinsic inaccuracy, but also because of faults affecting the hardware. Indeed, hardware faults may impact the DNN inference process and lead to prediction failures. Therefore, ensuring the fault tolerance of DNNs is crucial. However, common fault tolerance approaches are not cost-effective for DNN protection, because of the prohibitive overheads due to the large size of DNNs and the memory required for parameter storage. In this work, we propose a comprehensive framework to assess the fault tolerance of DNNs and cost-effectively protect them. As a first step, the proposed framework performs datatype-and-layer-based fault injection, driven by the DNN characteristics. As a second step, it uses classification-based machine learning methods to predict the criticality not only of network parameters but also of their bits. Finally, dedicated Error Correction Codes (ECCs) are selectively inserted to protect the critical parameters and bits, thereby protecting the DNNs at low cost. Using the proposed framework, we explored and protected two Convolutional Neural Networks (CNNs), each with four different data encodings. The results show that it is possible to protect the critical network parameters with selective ECCs while saving up to 83% of memory with respect to conventional ECC approaches.
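The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: bit-level fault injection into a stored parameter, and the storage cost of an ECC. It assumes IEEE 754 float32 parameters and a generic Hamming SEC-DED code. The function names `flip_bit` and `secded_check_bits` are hypothetical and are not taken from the harDNNing framework, whose dedicated ECCs and injection strategy may differ.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as IEEE 754 float32, with one bit flipped.

    Bit 0 is the least significant mantissa bit, bits 23-30 are the
    exponent, and bit 31 is the sign. (Hypothetical helper, for illustration.)
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Inject the fault and reinterpret the result as a float32 again.
    (faulty,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return faulty


def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by a Hamming SEC-DED code over `data_bits` bits.

    Assumption: a textbook Hamming code plus one overall parity bit,
    not the dedicated ECCs used in the paper.
    """
    if data_bits < 1:
        raise ValueError("data_bits must be positive")
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1  # extra parity bit adds double-error detection


if __name__ == "__main__":
    weight = 1.0
    for bit in (0, 23, 30, 31):
        print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)!r}")
    for width in (8, 32):
        print(f"{width} data bits need {secded_check_bits(width)} check bits")
```

Flipping a low mantissa bit barely changes the value, while flipping the top exponent bit of 1.0 yields infinity. This is why not every bit is equally critical, and why protecting only a subset of bits can need fewer check bits than protecting every full word.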
Pages: 6