Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge

Cited by: 0
Authors
Lu, Yao [1 ]
Rodriguez, Hiram Rayo Torres [1 ]
Vogel, Sebastian [2 ]
van de Waterlaat, Nick [1 ]
Jancura, Pavol [3 ]
Affiliations
[1] NXP Semiconductors, Eindhoven, Netherlands
[2] NXP Semiconductors, Munich, Germany
[3] Eindhoven University of Technology, Eindhoven, Netherlands
DOI
10.1145/3615338.3618122
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural Architecture Search (NAS) has become the de facto approach for designing accurate and efficient networks for edge devices. Since models are typically quantized for edge deployment, recent work has investigated quantization-aware NAS (QA-NAS) to search for highly accurate and efficient quantized models. However, existing QA-NAS approaches, particularly few-bit mixed-precision (FB-MP) methods, do not scale to larger tasks. Consequently, QA-NAS has mostly been limited to small-scale tasks and tiny networks. In this work, we present an approach to enable QA-NAS (INT8 and FB-MP) on large-scale tasks by leveraging the block-wise formulation introduced by block-wise NAS. We demonstrate strong results for the semantic segmentation task on the Cityscapes dataset, finding FB-MP models 33% smaller and INT8 models 17.6% faster than DeepLabV3 (INT8) without compromising task performance.
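To make the quantization terms in the abstract concrete, the following is a minimal sketch (not the paper's method) of uniform symmetric fake quantization, as used in quantization-aware training, together with the storage accounting behind INT8 vs. few-bit mixed-precision (FB-MP) comparisons. The layer shapes and bit-width assignment are hypothetical, chosen only for illustration:

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Fake-quantize a weight tensor with uniform symmetric quantization."""
    qmax = 2 ** (bits - 1) - 1
    w_absmax = np.max(np.abs(w))
    scale = w_absmax / qmax if w_absmax > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale  # dequantized weights, as seen by the forward pass in QAT

def model_size_bits(layer_shapes, bit_widths):
    """Total weight storage (in bits) for a per-layer bit-width assignment."""
    return sum(int(np.prod(s)) * b for s, b in zip(layer_shapes, bit_widths))

# Hypothetical 3-layer model (shapes are illustrative, not from the paper)
shapes = [(64, 3, 3, 3), (128, 64, 3, 3), (10, 128)]
int8_bits = model_size_bits(shapes, [8, 8, 8])  # uniform INT8 baseline
fbmp_bits = model_size_bits(shapes, [8, 4, 4])  # a few-bit mixed-precision assignment
print(f"FB-MP model is {100 * (1 - fbmp_bits / int8_bits):.1f}% smaller than INT8")
```

An FB-MP search, as studied in the paper, explores per-layer (or per-block) bit-width assignments like the one above, trading model size against the accuracy lost at lower precision; the block-wise formulation keeps that search tractable on large tasks.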
Pages: 1-5