Weight Evolution: Improving Deep Neural Networks Training through Evolving Inferior Weight Values

Cited by: 1
|
Authors
Lin, Zhenquan [1 ]
Guo, Kailing [1 ]
Xing, Xiaofen [2 ]
Xu, Xiangmin [3 ]
Affiliations
[1] South China Univ Technol, Guangzhou, Peoples R China
[2] South China Univ Technol, UBTECH SCUT Union Lab, Guangzhou, Peoples R China
[3] South China Univ Technol, Inst Modern Ind Technol, Zhongshan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
weight evolution; neural networks; training method;
DOI
10.1145/3474085.3475376
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To obtain good performance, convolutional neural networks are usually over-parameterized. This phenomenon has stimulated two interesting research topics: pruning unimportant weights for compression and reactivating unimportant weights to make full use of network capability. However, current weight reactivation methods usually reactivate entire filters, which may not be precise enough. Historically, filter pruning flourished mainly because it is friendly to hardware implementation, but pruning at a finer structural level, i.e., individual weight elements, usually yields better network performance. In this paper, we study the problem of weight element reactivation. Motivated by evolution, we select the unimportant filters and update their unimportant elements by combining them with the important elements of important filters, just as gene crossover produces better offspring; we call the proposed method weight evolution (WE). WE comprises four strategies. We propose a global selection strategy and a local selection strategy and combine them to locate the unimportant filters. A forward matching strategy finds the matched important filters, and a crossover strategy utilizes the important elements of those filters to update the unimportant ones (the crossover idea is sketched below). WE can be plugged into existing network architectures. Comprehensive experiments show that WE outperforms other reactivation methods and plug-in training methods on typical convolutional neural networks, especially lightweight networks. Our code is available at https://github.com/BZQLin/Weight-evolution.
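A minimal, self-contained PyTorch sketch of the crossover idea described in the abstract follows. The selection and matching criteria used here (L1-norm filter importance, pairing the weakest filters with the strongest by rank, and treating a filter's smallest-magnitude entries as its "unimportant elements") are illustrative assumptions, not the paper's actual global/local selection and forward matching strategies; see the repository linked above for the authors' implementation.

```python
import torch

def weight_evolution_step(weight: torch.Tensor,
                          filter_ratio: float = 0.25,
                          element_ratio: float = 0.25) -> torch.Tensor:
    """Crossover-style update for a conv weight of shape (C_out, C_in, kH, kW).

    Assumptions for illustration (not taken from the paper): filter
    importance is the L1 norm, the n weakest filters are paired with the
    n strongest by rank, and a filter's "unimportant elements" are its
    smallest-magnitude entries.
    """
    c_out = weight.size(0)
    flat = weight.reshape(c_out, -1)               # one row per filter
    importance = flat.abs().sum(dim=1)             # L1 filter importance
    order = importance.argsort()                   # ascending importance
    n = max(1, int(filter_ratio * c_out))
    weak, strong = order[:n], order[-n:]           # rank-based pairing

    new_flat = flat.clone()
    k = max(1, int(element_ratio * flat.size(1)))
    for w_idx, s_idx in zip(weak.tolist(), strong.tolist()):
        # Smallest-magnitude entries of the weak filter are treated as
        # its unimportant elements.
        _, elems = flat[w_idx].abs().topk(k, largest=False)
        # Crossover: overwrite them with the matched strong filter's
        # values at the same positions.
        new_flat[w_idx, elems] = flat[s_idx, elems]
    return new_flat.reshape(weight.shape)
```

In use, an update like this would run periodically during training (for example, every few epochs) under torch.no_grad(), writing the result back into each convolution's weight.data so that the reactivated elements continue to be trained.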
Pages: 2176 - 2184
Number of pages: 9
Related Papers
50 records in total
  • [41] Is normalization indispensable for training deep neural networks?
    Shao, Jie
    Hu, Kai
    Wang, Changhu
    Xue, Xiangyang
    Raj, Bhiksha
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [42] Context-aware Route Recommendation with Weight Learning through Deep Neural Networks
    Jia, Huiwen
    Fang, Jun
    Tan, Naiqiang
    Liu, Xinyue
    Huo, Zengwei
    Ma, Nan
    Wu, Guobin
    Chai, Hua
    Qie, Xiaohu
    Zhang, Bo
    Yin, Yafeng
    Shen, Siqian
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 4040 - 4045
  • [43] On Calibration of Mixup Training for Deep Neural Networks
    Maronas, Juan
    Ramos, Daniel
    Paredes, Roberto
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 67 - 76
  • [44] Exploiting Invariance in Training Deep Neural Networks
    Ye, Chengxi
    Zhou, Xiong
    McKinney, Tristan
    Liu, Yanfeng
    Zhou, Qinggang
    Zhdanov, Fedor
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8849 - 8856
  • [45] Exploring strategies for training deep neural networks
    Larochelle, Hugo
    Bengio, Yoshua
    Louradour, Jérôme
    Lamblin, Pascal
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1 - 40
  • [46] Training data enhancements for improving colonic polyp detection using deep convolutional neural networks
    Thomaz, Victor de Almeida
    Sierra-Franco, Cesar A.
    Raposo, Alberto B.
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2021, 111
  • [47] Training Deep Neural Networks with Gradual Deconvexification
    Lo, James Ting-Ho
    Gui, Yichuan
    Peng, Yun
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1000 - 1007
  • [48] Training Deep Neural Networks for Visual Servoing
    Bateux, Quentin
    Marchand, Eric
    Leitner, Jurgen
    Chaumette, Francois
    Corke, Peter
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 3307 - 3314
  • [49] Local Critic Training of Deep Neural Networks
    Lee, Hojung
    Lee, Jong-Seok
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [50] An Optimization Strategy for Deep Neural Networks Training
    Wu, Tingting
    Zeng, Peng
    Song, Chunhe
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 596 - 603