Deep Neural Networks;
Split Computing;
Mixed-Precision Quantization;
Neural Architecture Search;
DOI:
10.1109/TrustCom60117.2023.00355
Chinese Library Classification (CLC) number:
TP18 [Artificial Intelligence Theory];
Discipline classification codes:
081104;
0812;
0835;
1405;
Abstract:
Deploying large deep neural networks (DNNs) on IoT and mobile devices is challenging because of hardware resource limitations. Split computing (SC), an edge-cloud integration technique, addresses this challenge by splitting a single DNN model into two sub-models, one processed on an edge device and the other on a server, to reduce inference time. Dynamic split computing (DSC) is an emerging SC technique that determines the split point dynamically according to the communication conditions. In this work, we propose a DNN architecture optimization method for DSC. Our contributions are twofold. (1) We develop a DSC-aware mixed-precision quantization method that exploits neural architecture search (NAS). Using NAS, we efficiently explore the optimal bitwidth of each layer within a huge design space to construct potential split points in the target DNN; the more potential split points the DNN has, the more flexibly it can select one to match the communication conditions. (2) To improve the end-to-end inference time, we propose a new bitwidth-wise DSC (BW-DSC) algorithm that dynamically determines the optimal split point among the potential split points in the mixed-precision quantized DNN architecture. Our evaluation demonstrates that our work provides more effective split points than existing works while mitigating the degradation in inference accuracy. Specifically, in terms of end-to-end inference time, our work achieves an average improvement of 16.47% and up to 24.36% compared with a state-of-the-art work.
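The idea of choosing a split point according to the communication conditions can be illustrated with a minimal sketch. This is not the BW-DSC algorithm from the paper, whose details the abstract does not give; it is a generic latency model that assumes the end-to-end time is the sum of edge compute, transfer, and server compute. All names (`SplitPoint`, `end_to_end_latency_ms`, `choose_split`) and the per-layer latency inputs are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SplitPoint:
    """Hypothetical candidate split: layers [0, index) run on the edge device."""
    index: int         # number of layers executed on the edge
    num_elements: int  # elements in the tensor sent to the server
    bitwidth: int      # bits per element after quantization (assumed known)


def end_to_end_latency_ms(
    sp: SplitPoint,
    edge_ms: Sequence[float],
    server_ms: Sequence[float],
    bandwidth_mbps: float,
) -> float:
    """Assumed model: edge compute + transfer + server compute, in milliseconds."""
    edge = sum(edge_ms[: sp.index])
    server = sum(server_ms[sp.index:])
    bits = sp.num_elements * sp.bitwidth
    transfer = bits / (bandwidth_mbps * 1e6) * 1e3  # seconds -> milliseconds
    return edge + transfer + server


def choose_split(
    candidates: Sequence[SplitPoint],
    edge_ms: Sequence[float],
    server_ms: Sequence[float],
    bandwidth_mbps: float,
) -> SplitPoint:
    """Pick the candidate with the lowest modeled latency at the current bandwidth."""
    return min(
        candidates,
        key=lambda sp: end_to_end_latency_ms(sp, edge_ms, server_ms, bandwidth_mbps),
    )
```

Under this model, a lower bitwidth at a candidate layer shrinks the transferred tensor, so that layer becomes a competitive split point over a wider range of bandwidths. Re-running `choose_split` whenever the measured bandwidth changes gives the dynamic behavior the abstract describes.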
Ale, Laha
Affiliation: Texas A&M Univ, Dept Comp Sci, Corpus Christi, TX 78412 USA
Zhang, Ning
Affiliation: Univ Windsor, Dept Elect & Comp Engn, Windsor, ON N9B 3P4, Canada
Fang, Xiaojie
Affiliation: Harbin Inst Technol, Commun Res Ctr, Harbin 150001, Peoples R China
Chen, Xianfu
Affiliation: VTT Tech Res Ctr Finland, Dept Commun Syst, Oulu 90570, Finland
Wu, Shaohua
Affiliation: Harbin Inst Technol, Commun Res Ctr, Harbin 150001, Peoples R China; Network Commun Res Ctr, Peng Cheng Lab, Shenzhen 518052, Peoples R China
Li, Longzhuang
Affiliation: Texas A&M Univ, Dept Comp Sci, Corpus Christi, TX 78412 USA