Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge

Cited by: 0
Authors
Lu, Yao [1]
Rodriguez, Hiram Rayo Torres [1]
Vogel, Sebastian [2]
van de Waterlaat, Nick [1]
Jancura, Pavol [3]
Affiliations
[1] NXP Semiconductors, Eindhoven, Netherlands
[2] NXP Semiconductors, Munich, Germany
[3] Eindhoven University of Technology, Eindhoven, Netherlands
DOI
10.1145/3615338.3618122
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural Architecture Search (NAS) has become the de facto approach for designing accurate and efficient networks for edge devices. Since models are typically quantized for edge deployment, recent work has investigated quantization-aware NAS (QA-NAS) to search for highly accurate and efficient quantized models. However, existing QA-NAS approaches, particularly few-bit mixed-precision (FB-MP) methods, do not scale to larger tasks. Consequently, QA-NAS has mostly been limited to small-scale tasks and tiny networks. In this work, we present an approach to enable QA-NAS (INT8 and FB-MP) on large-scale tasks by leveraging the block-wise formulation introduced by block-wise NAS. We demonstrate strong results for the semantic segmentation task on the Cityscapes dataset, finding FB-MP models 33% smaller and INT8 models 17.6% faster than DeepLabV3 (INT8) without compromising task performance.
Pages: 1-5
Page count: 5