Weight Evolution: Improving Deep Neural Networks Training through Evolving Inferior Weight Values

Cited by: 1
|
Authors
Lin, Zhenquan [1 ]
Guo, Kailing [1 ]
Xing, Xiaofen [2 ]
Xu, Xiangmin [3 ]
Affiliations
[1] South China Univ Technol, Guangzhou, Peoples R China
[2] South China Univ Technol, UBTECH SCUT Union Lab, Guangzhou, Peoples R China
[3] South China Univ Technol, Inst Modern Ind Technol, Zhongshan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
weight evolution; neural networks; training method;
DOI
10.1145/3474085.3475376
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To obtain good performance, convolutional neural networks are usually over-parameterized. This phenomenon has stimulated two interesting research topics: pruning unimportant weights for compression and reactivating unimportant weights to make full use of network capability. However, current weight reactivation methods usually reactivate entire filters, which may not be precise enough. Historically, filter pruning flourished mainly because it is friendly to hardware implementation, but pruning at a finer structural level, i.e., individual weight elements, usually yields better network performance. In this paper, we study the problem of weight element reactivation. Motivated by evolution, we select the unimportant filters and update their unimportant elements by combining them with the important elements of important filters, just as gene crossover produces better offspring; we call the proposed method weight evolution (WE). WE comprises four strategies. We propose a global selection strategy and a local selection strategy and combine them to locate the unimportant filters. A forward matching strategy finds the matched important filters, and a crossover strategy utilizes the important elements of those filters to update the unimportant ones (the crossover idea is sketched below). WE can be plugged into existing network architectures. Comprehensive experiments show that WE outperforms other reactivation methods and plug-in training methods on typical convolutional neural networks, especially lightweight networks. Our code is available at https://github.com/BZQLin/Weight-evolution.
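A minimal, self-contained PyTorch sketch of the crossover idea described in the abstract follows. The selection and matching criteria used here (L1-norm filter importance, pairing the weakest filters with the strongest by rank, and treating a filter's smallest-magnitude entries as its "unimportant elements") are illustrative assumptions, not the paper's actual global/local selection and forward matching strategies; see the repository linked above for the authors' implementation.

```python
import torch

def weight_evolution_step(weight: torch.Tensor,
                          filter_ratio: float = 0.25,
                          element_ratio: float = 0.25) -> torch.Tensor:
    """Crossover-style update for a conv weight of shape (C_out, C_in, kH, kW).

    Assumptions for illustration (not taken from the paper): filter
    importance is the L1 norm, the n weakest filters are paired with the
    n strongest by rank, and a filter's "unimportant elements" are its
    smallest-magnitude entries.
    """
    c_out = weight.size(0)
    flat = weight.reshape(c_out, -1)               # one row per filter
    importance = flat.abs().sum(dim=1)             # L1 filter importance
    order = importance.argsort()                   # ascending importance
    n = max(1, int(filter_ratio * c_out))
    weak, strong = order[:n], order[-n:]           # rank-based pairing

    new_flat = flat.clone()
    k = max(1, int(element_ratio * flat.size(1)))
    for w_idx, s_idx in zip(weak.tolist(), strong.tolist()):
        # Smallest-magnitude entries of the weak filter are treated as
        # its unimportant elements.
        _, elems = flat[w_idx].abs().topk(k, largest=False)
        # Crossover: overwrite them with the matched strong filter's
        # values at the same positions.
        new_flat[w_idx, elems] = flat[s_idx, elems]
    return new_flat.reshape(weight.shape)
```

In use, an update like this would run periodically during training (for example, every few epochs) under torch.no_grad(), writing the result back into each convolution's weight.data so that the reactivated elements continue to be trained.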
Pages: 2176 - 2184
Number of pages: 9
Related Papers
50 records in total
  • [41] Is normalization indispensable for training deep neural networks?
    Shao, Jie
    Hu, Kai
    Wang, Changhu
    Xue, Xiangyang
    Raj, Bhiksha
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [42] Context-aware Route Recommendation with Weight Learning through Deep Neural Networks
    Jia, Huiwen
    Fang, Jun
    Tan, Naiqiang
    Liu, Xinyue
    Huo, Zengwei
    Ma, Nan
    Wu, Guobin
    Chai, Hua
    Qie, Xiaohu
    Zhang, Bo
    Yin, Yafeng
    Shen, Siqian
    2020 AMERICAN CONTROL CONFERENCE (ACC), 2020, : 4040 - 4045
  • [43] On Calibration of Mixup Training for Deep Neural Networks
    Maronas, Juan
    Ramos, Daniel
    Paredes, Roberto
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2020, 2021, 12644 : 67 - 76
  • [44] Exploiting Invariance in Training Deep Neural Networks
    Ye, Chengxi
    Zhou, Xiong
    McKinney, Tristan
    Liu, Yanfeng
    Zhou, Qinggang
    Zhdanov, Fedor
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8849 - 8856
  • [45] Exploring strategies for training deep neural networks
    Larochelle, Hugo
    Bengio, Yoshua
    Louradour, Jérôme
    Lamblin, Pascal
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1 - 40
  • [46] Training data enhancements for improving colonic polyp detection using deep convolutional neural networks
    Thomaz, Victor de Almeida
    Sierra-Franco, Cesar A.
    Raposo, Alberto B.
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2021, 111
  • [47] Training Deep Neural Networks with Gradual Deconvexification
    Lo, James Ting-Ho
    Gui, Yichuan
    Peng, Yun
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1000 - 1007
  • [48] Training Deep Neural Networks for Visual Servoing
    Bateux, Quentin
    Marchand, Eric
    Leitner, Jurgen
    Chaumette, Francois
    Corke, Peter
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 3307 - 3314
  • [49] Local Critic Training of Deep Neural Networks
    Lee, Hojung
    Lee, Jong-Seok
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [50] An Optimization Strategy for Deep Neural Networks Training
    Wu, Tingting
    Zeng, Peng
    Song, Chunhe
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 596 - 603