Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge

Cited by: 0

Authors:

Lu, Yao [1]

Rodriguez, Hiram Rayo Torres [1]

Vogel, Sebastian [2]

van de Waterlaat, Nick [1]

Jancura, Pavol [3]

Affiliations:

[1] NXP Semiconductors, Eindhoven, Netherlands

[2] NXP Semiconductors, Munich, Germany

[3] Eindhoven University of Technology, Eindhoven, Netherlands

Source:

Proceedings of the 2023 IEEE/ACM International Workshop on Compilers, Deployment, and Tooling for Edge AI, CODAI 2023 | 2023

DOI:

10.1145/3615338.3618122

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Neural Architecture Search (NAS) has become the de facto approach for designing accurate and efficient networks for edge devices. Since models are typically quantized for edge deployment, recent work has investigated quantization-aware NAS (QA-NAS) to search for highly accurate and efficient quantized models. However, existing QA-NAS approaches, particularly few-bit mixed-precision (FB-MP) methods, do not scale to larger tasks. Consequently, QA-NAS has mostly been limited to small-scale tasks and tiny networks. In this work, we present an approach to enable QA-NAS (INT8 and FB-MP) on large-scale tasks by leveraging the block-wise formulation introduced by block-wise NAS. We demonstrate strong results for the semantic segmentation task on the Cityscapes dataset, finding FB-MP models 33% smaller and INT8 models 17.6% faster than DeepLabV3 (INT8) without compromising task performance.


