Fault-tolerant deep learning inference on CPU-GPU integrated edge devices with TEEs

Cited by: 0
Authors
Xu, Hongjian [1 ]
Liao, Longlong [2 ,3 ]
Liu, Xinqi [4 ]
Chen, Shuguang [3 ]
Chen, Jianguo [5 ]
Liang, Zhixuan [6 ]
Yu, Yuanlong [1 ]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou 350100, Peoples R China
[2] Fuzhou Univ, Fuzhou 350100, Peoples R China
[3] Univ Hong Kong, Hong Kong 999077, Peoples R China
[4] Univ Hong Kong, Dept Civil Engn, Hong Kong 999077, Peoples R China
[5] Sun Yat Sen Univ, Sch Software Engn, Zhuhai 519082, Peoples R China
[6] Hong Kong Polytech Univ, Comp Sci & Technol, Hong Kong 999077, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Fault-tolerant inference; Fault injection attack; CPU-GPU integrated edge device; Trusted Execution Environment;
DOI
10.1016/j.future.2024.07.027
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
CPU-GPU integrated edge devices and deep learning algorithms have made significant progress in recent years, leading to increasingly widespread application of edge intelligence. However, deep learning inference on these edge devices is vulnerable to Fault Injection Attacks (FIAs) that can modify device memory or cause instructions to execute erroneously. We propose DarkneTF, a Fault-Tolerant (FT) deep learning inference framework for CPU-GPU integrated edge devices that ensures the correctness of model inference results by detecting the threat of FIAs. DarkneTF introduces algorithm-based verification to implement FT deep learning inference. The verification process involves verifying the integrity of model weights and validating the correctness of time-intensive calculations, such as convolutions. We improve the Freivalds algorithm by strengthening its randomization, enhancing its ability to detect tiny perturbations. As the verification process is itself susceptible to FIAs, DarkneTF offloads it into Trusted Execution Environments (TEEs). This scheme secures the verification process and still allows model inference to be accelerated on the integrated GPUs. Experimental results show that GPU-accelerated FT inference on HiKey 960 achieves notable speedups ranging from 3.46x to 5.57x compared to FT inference on a standalone CPU. The extra memory overhead incurred by FT inference remains exceedingly low, ranging from 0.46% to 10.22%. The round-off error of the improved Freivalds algorithm is below 2.50 × 10⁻⁴, and the accuracy of detecting FIAs is above 92.73%.
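The verification primitive named in the abstract, Freivalds' algorithm, checks a claimed matrix product C = A·B in O(n²) time per round by multiplying both sides with a random vector instead of recomputing the full O(n³) product. A minimal sketch of the classic (unimproved) check in NumPy is given below; the function name and the `rounds`/`seed` parameters are illustrative assumptions, not the paper's API:

```python
import numpy as np

def freivalds_check(A, B, C, rounds=10, tol=1e-6, seed=None):
    """Probabilistically verify that A @ B == C.

    Each round multiplies by a random 0/1 vector r and compares
    A(Br) with Cr, costing O(n^2) instead of the O(n^3) full
    recomputation. An incorrect product is caught in a round with
    probability >= 1/2, so the miss probability shrinks as 2**-rounds.
    """
    rng = np.random.default_rng(seed)
    n = C.shape[1]
    for _ in range(rounds):
        r = rng.integers(0, 2, size=(n, 1)).astype(A.dtype)
        # A mismatch between A(Br) and Cr proves A @ B != C.
        if not np.allclose(A @ (B @ r), C @ r, atol=tol):
            return False
    return True
```

With 0/1 vectors, a single corrupted entry of C is missed in one round with probability 1/2; the paper's variant strengthens the randomization precisely to improve detection of such tiny perturbations.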
Pages: 404-414
Page count: 11
Related Papers
50 items total
  • [1] A collaborative CPU-GPU approach for deep learning on mobile devices
    Valery, Olivier
    Liu, Pangfeng
    Wu, Jan-Jan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (17):
  • [2] Demystifying the TensorFlow Eager Execution of Deep Learning Inference on a CPU-GPU Tandem
    Delestrac, Paul
    Torres, Lionel
    Novo, David
    2022 25TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD), 2022, : 446 - 455
  • [3] The Best of Many Worlds: Scheduling Machine Learning Inference on CPU-GPU Integrated Architectures
    Vasiliadis, Giorgos
    Tsirbas, Rafail
    Ioannidis, Sotiris
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022), 2022, : 55 - 64
  • [4] Exploring Query Processing on CPU-GPU Integrated Edge Device
    Liu, Jiesong
    Zhang, Feng
    Li, Hourun
    Wang, Dalin
    Wan, Weitao
    Fang, Xiaokun
    Zhai, Jidong
    Du, Xiaoyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 4057 - 4070
  • [5] DNN Model Architecture Fingerprinting Attack on CPU-GPU Edge Devices
    Patwari, Kartik
    Hafiz, Syed Mahbub
    Wang, Han
    Homayoun, Houman
    Shafiq, Zubair
    Chuah, Chen-Nee
    2022 IEEE 7TH EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2022), 2022, : 337 - 355
  • [6] iMLBench: A Machine Learning Benchmark Suite for CPU-GPU Integrated Architectures
    Zhang, Chenyang
    Zhang, Feng
    Guo, Xiaoguang
    He, Bingsheng
    Zhang, Xiao
    Du, Xiaoyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (07) : 1740 - 1752
  • [7] Deep learning based data prefetching in CPU-GPU unified virtual memory
    Long, Xinjian
    Gong, Xiangyang
    Zhang, Bo
    Zhou, Huiyang
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2023, 174 : 19 - 31
  • [8] Fault-Tolerant Deep Learning Using Regularization
    Joardar, Biresh Kumar
    Arka, Aqeeb Iqbal
    Doppa, Janardhan Rao
    Pande, Partha Pratim
    2022 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2022,
  • [9] Dynamic and adaptive fault-tolerant asynchronous federated learning using volunteer edge devices
    Angel Morell, Jose
    Alba, Enrique
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 133 : 53 - 67
  • [10] CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system
    Zhang, Qi
    Liu, Yi
    Liu, Tao
    Qian, Depei
    JOURNAL OF SUPERCOMPUTING, 2023, 79 (13): 14172 - 14199