Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge

Cited by: 0
Authors
Lu, Yao [1 ]
Rodriguez, Hiram Rayo Torres [1 ]
Vogel, Sebastian [2 ]
van de Waterlaat, Nick [1 ]
Jancura, Pavol [3 ]
Affiliations
[1] NXP Semiconductors, Eindhoven, Netherlands
[2] NXP Semiconductors, Munich, Germany
[3] Eindhoven University of Technology, Eindhoven, Netherlands
DOI
10.1145/3615338.3618122
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural Architecture Search (NAS) has become the de facto approach for designing accurate and efficient networks for edge devices. Since models are typically quantized for edge deployment, recent work has investigated quantization-aware NAS (QA-NAS) to search for highly accurate and efficient quantized models. However, existing QA-NAS approaches, particularly few-bit mixed-precision (FB-MP) methods, do not scale to larger tasks. Consequently, QA-NAS has mostly been limited to small-scale tasks and tiny networks. In this work, we present an approach to enable QA-NAS (INT8 and FB-MP) on large-scale tasks by leveraging the block-wise formulation introduced by block-wise NAS. We demonstrate strong results for the semantic segmentation task on the Cityscapes dataset, finding FB-MP models 33% smaller and INT8 models 17.6% faster than DeepLabV3 (INT8) without compromising task performance.
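To make the quantization terms in the abstract concrete, the following is a minimal sketch (not the paper's method) of uniform symmetric fake quantization, as used in quantization-aware training, together with the storage accounting behind INT8 vs. few-bit mixed-precision (FB-MP) comparisons. The layer shapes and bit-width assignment are hypothetical, chosen only for illustration:

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Fake-quantize a weight tensor with uniform symmetric quantization."""
    qmax = 2 ** (bits - 1) - 1
    w_absmax = np.max(np.abs(w))
    scale = w_absmax / qmax if w_absmax > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale  # dequantized weights, as seen by the forward pass in QAT

def model_size_bits(layer_shapes, bit_widths):
    """Total weight storage (in bits) for a per-layer bit-width assignment."""
    return sum(int(np.prod(s)) * b for s, b in zip(layer_shapes, bit_widths))

# Hypothetical 3-layer model (shapes are illustrative, not from the paper)
shapes = [(64, 3, 3, 3), (128, 64, 3, 3), (10, 128)]
int8_bits = model_size_bits(shapes, [8, 8, 8])  # uniform INT8 baseline
fbmp_bits = model_size_bits(shapes, [8, 4, 4])  # a few-bit mixed-precision assignment
print(f"FB-MP model is {100 * (1 - fbmp_bits / int8_bits):.1f}% smaller than INT8")
```

An FB-MP search, as studied in the paper, explores per-layer (or per-block) bit-width assignments like the one above, trading model size against the accuracy lost at lower precision; the block-wise formulation keeps that search tractable on large tasks.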
Pages: 1-5