TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 0
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
To enable DNNs on edge devices like mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to approximate a pre-trained model directly by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss. As a result, performance usually drops significantly, and substantial fine-tuning is required to recover accuracy. Clearly, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer optimized by stochastic sub-gradient descent is used to further promote low rank in TRP. A TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, eliminating the need for fine-tuning after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
Pages: 977 - 983
Page count: 7
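Below is a minimal PyTorch-style sketch of the alternating scheme described in the abstract: a plain SGD step on the task loss plus a nuclear-norm sub-gradient, with a periodic truncated-SVD substitution of each weight matrix. The function names (low_rank_approx, nuclear_norm_subgradient, trp_step), the energy threshold, learning rate, penalty weight, and approximation period are illustrative assumptions, not taken from the paper; for brevity the sketch handles only 2-D weight matrices and omits the reshaping needed for convolutional kernels.

import torch

def low_rank_approx(weight: torch.Tensor, energy: float = 0.98) -> torch.Tensor:
    """Truncated-SVD reconstruction keeping the smallest rank whose
    singular values retain `energy` of the total spectral mass (threshold is an assumption)."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    cumulative = torch.cumsum(S, dim=0) / S.sum()
    rank = int((cumulative < energy).sum().item()) + 1
    return (U[:, :rank] * S[:rank]) @ Vh[:rank, :]

def nuclear_norm_subgradient(weight: torch.Tensor) -> torch.Tensor:
    """A sub-gradient of the nuclear norm ||W||_* is U V^T for W = U S V^T."""
    U, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U @ Vh

def trp_step(model: torch.nn.Module, loss: torch.Tensor,
             lr: float = 0.01, lam: float = 1e-4,
             step: int = 0, approx_every: int = 20) -> None:
    """One alternating step: SGD on the task loss plus the nuclear-norm
    sub-gradient, then a periodic low-rank substitution of the weights."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() != 2 or p.grad is None:  # sketch: fully connected weights only
                continue
            p -= lr * (p.grad + lam * nuclear_norm_subgradient(p))
            if step % approx_every == 0:
                p.copy_(low_rank_approx(p))

In this sketch the low-rank substitution happens every approx_every iterations while the nuclear-norm penalty is applied at every step, matching the abstract's description of alternating approximation with regularized training; rank selection and scheduling details would follow the paper itself.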