Fault-tolerant deep learning inference on CPU-GPU integrated edge devices with TEEs

Cited by: 0
Authors
Xu, Hongjian [1 ]
Liao, Longlong [2 ,3 ]
Liu, Xinqi [4 ]
Chen, Shuguang [3 ]
Chen, Jianguo [5 ]
Liang, Zhixuan [6 ]
Yu, Yuanlong [1 ]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou 350100, Peoples R China
[2] Fuzhou Univ, Fuzhou 350100, Peoples R China
[3] Univ Hong Kong, Hong Kong 999077, Peoples R China
[4] Univ Hong Kong, Dept Civil Engn, Hong Kong 999077, Peoples R China
[5] Sun Yat Sen Univ, Sch Software Engn, Zhuhai 519082, Peoples R China
[6] Hong Kong Polytech Univ, Comp Sci & Technol, Hong Kong 999077, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; Fault-tolerant inference; Fault injection attack; CPU-GPU integrated edge device; Trusted Execution Environment;
DOI
10.1016/j.future.2024.07.027
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
CPU-GPU integrated edge devices and deep learning algorithms have made significant progress in recent years, leading to increasingly widespread application of edge intelligence. However, deep learning inference on these edge devices is vulnerable to Fault Injection Attacks (FIAs) that can modify device memory or cause instructions to execute erroneously. We propose DarkneTF, a Fault-Tolerant (FT) deep learning inference framework for CPU-GPU integrated edge devices that ensures the correctness of model inference results by detecting the threat of FIAs. DarkneTF introduces algorithm-based verification to implement FT deep learning inference. The verification process involves verifying the integrity of model weights and validating the correctness of time-intensive calculations, such as convolutions. We improve the Freivalds algorithm by strengthening its randomization, enhancing its ability to detect tiny perturbations. As the verification process is itself susceptible to FIAs, DarkneTF offloads it into Trusted Execution Environments (TEEs). This scheme secures the verification process and still allows model inference to be accelerated on the integrated GPUs. Experimental results show that GPU-accelerated FT inference on HiKey 960 achieves notable speedups ranging from 3.46x to 5.57x compared to FT inference on a standalone CPU. The extra memory overhead incurred by FT inference remains exceedingly low, ranging from 0.46% to 10.22%. The round-off error of the improved Freivalds algorithm is below 2.50 × 10⁻⁴, and the accuracy of detecting FIAs is above 92.73%.
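The verification primitive named in the abstract, Freivalds' algorithm, checks a claimed matrix product C = A·B in O(n²) time per round by multiplying both sides with a random vector instead of recomputing the full O(n³) product. A minimal sketch of the classic (unimproved) check in NumPy is given below; the function name and the `rounds`/`seed` parameters are illustrative assumptions, not the paper's API:

```python
import numpy as np

def freivalds_check(A, B, C, rounds=10, tol=1e-6, seed=None):
    """Probabilistically verify that A @ B == C.

    Each round multiplies by a random 0/1 vector r and compares
    A(Br) with Cr, costing O(n^2) instead of the O(n^3) full
    recomputation. An incorrect product is caught in a round with
    probability >= 1/2, so the miss probability shrinks as 2**-rounds.
    """
    rng = np.random.default_rng(seed)
    n = C.shape[1]
    for _ in range(rounds):
        r = rng.integers(0, 2, size=(n, 1)).astype(A.dtype)
        # A mismatch between A(Br) and Cr proves A @ B != C.
        if not np.allclose(A @ (B @ r), C @ r, atol=tol):
            return False
    return True
```

With 0/1 vectors, a single corrupted entry of C is missed in one round with probability 1/2; the paper's variant strengthens the randomization precisely to improve detection of such tiny perturbations.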
Pages: 404-414
Page count: 11
Related Papers
50 items total
  • [1] A collaborative CPU-GPU approach for deep learning on mobile devices
    Valery, Olivier
    Liu, Pangfeng
    Wu, Jan-Jan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (17):
  • [2] Demystifying the TensorFlow Eager Execution of Deep Learning Inference on a CPU-GPU Tandem
    Delestrac, Paul
    Torres, Lionel
    Novo, David
    2022 25TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD), 2022, : 446 - 455
  • [3] The Best of Many Worlds: Scheduling Machine Learning Inference on CPU-GPU Integrated Architectures
    Vasiliadis, Giorgos
    Tsirbas, Rafail
    Ioannidis, Sotiris
    2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022), 2022, : 55 - 64
  • [4] Exploring Query Processing on CPU-GPU Integrated Edge Device
    Liu, Jiesong
    Zhang, Feng
    Li, Hourun
    Wang, Dalin
    Wan, Weitao
    Fang, Xiaokun
    Zhai, Jidong
    Du, Xiaoyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 4057 - 4070
  • [5] DNN Model Architecture Fingerprinting Attack on CPU-GPU Edge Devices
    Patwari, Kartik
    Hafiz, Syed Mahbub
    Wang, Han
    Homayoun, Houman
    Shafiq, Zubair
    Chuah, Chen-Nee
    2022 IEEE 7TH EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2022), 2022, : 337 - 355
  • [6] iMLBench: A Machine Learning Benchmark Suite for CPU-GPU Integrated Architectures
    Zhang, Chenyang
    Zhang, Feng
    Guo, Xiaoguang
    He, Bingsheng
    Zhang, Xiao
    Du, Xiaoyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (07) : 1740 - 1752
  • [7] Deep learning based data prefetching in CPU-GPU unified virtual memory
    Long, Xinjian
    Gong, Xiangyang
    Zhang, Bo
    Zhou, Huiyang
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2023, 174 : 19 - 31
  • [8] Fault-Tolerant Deep Learning Using Regularization
    Joardar, Biresh Kumar
    Arka, Aqeeb Iqbal
    Doppa, Janardhan Rao
    Pande, Partha Pratim
    2022 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2022,
  • [9] Dynamic and adaptive fault-tolerant asynchronous federated learning using volunteer edge devices
    Angel Morell, Jose
    Alba, Enrique
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 133 : 53 - 67
  • [10] CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system
    Zhang, Qi
    Liu, Yi
    Liu, Tao
    Qian, Depei
    JOURNAL OF SUPERCOMPUTING, 2023, 79 (13): 14172 - 14199