A mathematical framework for improved weight initialization of neural networks using Lagrange multipliers

Citations: 7
Authors
de Pater, Ingeborg [1 ]
Mitici, Mihaela [2 ]
Affiliations
[1] Delft Univ Technol, Fac Aerosp Engn, NL-2629 HS Delft, Netherlands
[2] Univ Utrecht, Fac Sci, Heidelberglaan 8, NL-3584 CS Utrecht, Netherlands
Keywords
Weight initialization; Neural network training; Linear regression; Lagrange function; Remaining useful life
DOI
10.1016/j.neunet.2023.07.035
CLC classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A good weight initialization is crucial to accelerate the convergence of the weights in a neural network. However, training a neural network is still time-consuming, despite recent advances in weight initialization approaches. In this paper, we propose a mathematical framework for the weight initialization in the last layer of a neural network. We first derive analytically a tight constraint on the weights that accelerates the convergence of the weights during the back-propagation algorithm. We then use linear regression and Lagrange multipliers to analytically derive the optimal initial weights and initial bias of the last layer that minimize the initial training loss given the derived tight constraint. We also show that the restrictive assumption of traditional weight initialization algorithms, namely that the expected value of the weights is zero, is redundant for our approach. We first apply our proposed weight initialization approach to a Convolutional Neural Network that predicts the Remaining Useful Life of aircraft engines. The initial training and validation loss are relatively small, the weights do not get stuck in a local optimum, and the convergence of the weights is accelerated. We compare our approach with several benchmark strategies. Compared to the best-performing state-of-the-art initialization strategy (Kaiming initialization), our approach needs 34% fewer epochs to reach the same validation loss. We also apply our approach to ResNets for the CIFAR-100 dataset, combined with transfer learning. Here, the initial accuracy is already at least 53%. This yields faster weight convergence and higher test accuracy than the benchmark strategies. © 2023 Published by Elsevier Ltd.
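The core of the abstract's construction can be illustrated as a least-squares problem for the last layer: with the penultimate-layer activations fixed, the initial weights and bias that minimize the initial squared training loss follow from the normal equations, which is what the stationarity conditions of the corresponding Lagrangian reduce to. The Python sketch below shows this unconstrained core only; it is not the authors' full method (the paper's tight constraint on the weights is not reproduced here), and the names init_last_layer, H, and y are illustrative assumptions.

```python
# Minimal sketch: initialize a linear last layer by linear regression on the
# penultimate-layer activations. Illustrative only; the constraint derived in
# the paper is omitted, and all names here are assumptions, not the paper's.
import numpy as np

def init_last_layer(H: np.ndarray, y: np.ndarray):
    """Initial weights w and bias b minimizing ||H @ w + b - y||**2.

    Appending a column of ones folds the bias into the design matrix, so the
    problem becomes ordinary least squares; setting the gradient of this
    quadratic loss to zero gives the normal equations, solved here with
    lstsq for numerical stability.
    """
    H_aug = np.hstack([H, np.ones((H.shape[0], 1))])   # fold bias into H
    theta, *_ = np.linalg.lstsq(H_aug, y, rcond=None)  # normal-equation solve
    return theta[:-1], theta[-1]                       # (weights, bias)

# Usage sketch: forward one training batch through the randomly initialized
# hidden layers, collect the penultimate activations H and targets y (e.g.,
# Remaining Useful Life labels), then set the last layer to w0, b0:
# w0, b0 = init_last_layer(H, y)
```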
Pages: 579-594 (16 pages)
Related papers (50 records)
  • [1] Incorrect Application of Yilmaz-Poli (2022) Initialisation Method in de Pater-Mitici (2023) Paper Entitled "A mathematical framework for improved weight initialization of neural networks using Lagrange multipliers"
    Poli, Riccardo
    Yilmaz, Ahmet
    NEURAL NETWORKS, 2023, 168 : 57 - 58
  • [2] A review on weight initialization strategies for neural networks
    Narkhede, Meenal V.
    Bartakke, Prashant P.
    Sutaone, Mukul S.
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (01) : 291 - 322
  • [3] An overview on weight initialization methods for feedforward neural networks
    de Sousa, Celso A. R.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 52 - 59
  • [4] Using autoencoders as a weight initialization method on deep neural networks for disease detection
    Ferreira, Mafalda Falcao
    Camacho, Rui
    Teixeira, Luis F.
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (Suppl 5)
  • [5] Topology Optimization Using Neural Networks With Conditioning Field Initialization for Improved Efficiency
    Chen, Hongrui
    Joglekar, Aditya
    Kara, Levent Burak
    JOURNAL OF MECHANICAL DESIGN, 2024, 146 (06)
  • [6] Topology Optimization Using Neural Networks With Conditioning Field Initialization for Improved Efficiency
    Chen, Hongrui
    Joglekar, Aditya
    Kara, Levent Burak
    PROCEEDINGS OF ASME 2023 INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, IDETC-CIE2023, VOL 3A, 2023
  • [7] Analyzing weight distribution of feedforward neural networks and efficient weight initialization
    Go, J
    Baek, B
    Lee, C
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2004, 3138 : 840 - 849
  • [8] Improved weight initialization for deep and narrow feedforward neural network
    Lee, Hyunwoo
    Kim, Yunho
    Yang, Seung Yeop
    Choi, Hayoung
    NEURAL NETWORKS, 2024, 176