One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Cited by: 9
Authors
Lu, Bingqian [1]
Yang, Jianyi [1]
Jiang, Weiwen [2]
Shi, Yiyu [3]
Ren, Shaolei [1]
Affiliations
[1] Univ Calif Riverside, 900 Univ Ave, Riverside, CA 92521 USA
[2] George Mason Univ, 4400 Univ Dr, Fairfax, VA 22030 USA
[3] Univ Notre Dame, 257 Fitzpatrick Hall, Notre Dame, IN 46556 USA
Keywords
Neural Architecture Search; Hardware-Aware; Scalability; AutoML
DOI
10.1145/3491046
Chinese Library Classification number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device is common practice in state-of-the-art approaches, it is a very time-consuming process that lacks scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can reuse architectures searched for one proxy device on new target devices without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
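The notion of latency monotonicity in the abstract (latency rankings of the same architectures being correlated across devices) can be illustrated with a rank correlation. The sketch below assumes Spearman's rank correlation as the measure, which is one common choice and not necessarily the exact metric used in the paper. The function names and all latency values are hypothetical and made up for illustration only.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical latencies (ms) of the same five architectures on three devices.
proxy_ms = [12.1, 18.4, 9.7, 25.0, 15.3]
target_ms = [30.2, 41.0, 22.5, 60.8, 35.9]  # same ordering as the proxy
other_ms = [30.2, 22.5, 41.0, 35.9, 60.8]   # ordering partly differs

print(spearman(proxy_ms, target_ms))
print(spearman(proxy_ms, other_ms))
```

A coefficient close to 1 means the two devices rank the architectures almost identically, which is the situation in which the abstract says architectures searched on the proxy can be reused on the target. A low or negative coefficient corresponds to the weak-monotonicity case that the proposed proxy adaptation is meant to address.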
Pages: 34