A mathematical framework for improved weight initialization of neural networks using Lagrange multipliers

Cited by: 7
Authors:
de Pater, Ingeborg [1 ]
Mitici, Mihaela [2 ]
Affiliations:
[1] Delft Univ Technol, Fac Aerosp Engn, NL-2926 HS Delft, Netherlands
[2] Univ Utrecht, Fac Sci, Heidelberglaan 8, NL-3584 CS Utrecht, Netherlands
Keywords:
Weight initialization; Neural network training; Linear regression; Lagrange function; Remaining useful life
DOI:
10.1016/j.neunet.2023.07.035
CLC Classification:
TP18 [Artificial Intelligence Theory]
Discipline Codes:
081104; 0812; 0835; 1405
Abstract:
A good weight initialization is crucial to accelerating the convergence of the weights in a neural network. However, training a neural network is still time-consuming, despite recent advances in weight initialization approaches. In this paper, we propose a mathematical framework for weight initialization in the last layer of a neural network. We first analytically derive a tight constraint on the weights that accelerates their convergence during the back-propagation algorithm. We then use linear regression and Lagrange multipliers to analytically derive the optimal initial weights and initial bias of the last layer, which minimize the initial training loss subject to the derived tight constraint. We also show that the restrictive assumption of traditional weight initialization algorithms, namely that the expected value of the weights is zero, is redundant for our approach. We first apply our proposed weight initialization approach to a Convolutional Neural Network that predicts the Remaining Useful Life of aircraft engines. The initial training and validation losses are relatively small, the weights do not get stuck in a local optimum, and the convergence of the weights is accelerated. We compare our approach with several benchmark strategies. Compared to the best-performing state-of-the-art initialization strategy (Kaiming initialization), our approach needs 34% fewer epochs to reach the same validation loss. We also apply our approach to ResNets on the CIFAR-100 dataset, combined with transfer learning. Here, the initial accuracy is already at least 53%, which gives faster weight convergence and a higher test accuracy than the benchmark strategies. © 2023 Published by Elsevier Ltd.
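To make the core idea of the abstract concrete, below is a minimal sketch (not the authors' exact derivation) of initializing a last layer by equality-constrained least squares with Lagrange multipliers. The penultimate-layer activations H, targets y, and the generic constraint A w = c are all assumptions for illustration; the paper derives its own specific tight constraint, which is not reproduced here.

```python
import numpy as np

def init_last_layer(H, y, A=None, c=None):
    """Least-squares initialization of last-layer weights and bias.

    H : (n, d) penultimate-layer activations on a training batch
    y : (n,)   regression targets
    A, c : optional linear equality constraint A @ w_aug = c on the
           augmented weight vector [w; b] (hypothetical stand-in for
           the paper's derived constraint).
    Returns (w, b): weight vector (d,) and bias (scalar).
    """
    n, d = H.shape
    Haug = np.hstack([H, np.ones((n, 1))])  # absorb bias as an extra column
    if A is None:
        # Unconstrained case: ordinary least squares on the activations.
        w_aug, *_ = np.linalg.lstsq(Haug, y, rcond=None)
    else:
        # Constrained case: stationarity of the Lagrange function
        # L(w, lam) = ||Haug @ w - y||^2 + lam @ (A @ w - c)
        # yields the linear KKT system solved below.
        m = A.shape[0]
        K = np.block([[2 * Haug.T @ Haug, A.T],
                      [A, np.zeros((m, m))]])
        rhs = np.concatenate([2 * Haug.T @ y, c])
        w_aug = np.linalg.solve(K, rhs)[:d + 1]
    return w_aug[:-1], w_aug[-1]
```

In the unconstrained case this reduces to plain linear regression on the activations, so the last layer starts at the minimum of the initial training loss rather than at a random point, which is consistent with the small initial losses reported in the abstract.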
Pages: 579-594
Page count: 16
Related Papers (items 21-30 of 50):
  • [21] Solving the linear interval tolerance problem for weight initialization of neural networks
    Adam, S. P.
    Karras, D. A.
    Magoulas, G. D.
    Vrahatis, M. N.
    NEURAL NETWORKS, 2014, 54 : 17 - 37
  • [22] Variance-Aware Weight Initialization for Point Convolutional Neural Networks
    Hermosilla, Pedro
    Schelling, Michael
    Ritschel, Tobias
    Ropinski, Timo
    COMPUTER VISION - ECCV 2022, PT XXVIII, 2022, 13688 : 74 - 89
  • [23] A New Weight Initialization Method for Sigmoidal Feedforward Artificial Neural Networks
    Sodhi, Sartaj Singh
    Chandra, Pravin
    Tanwar, Sharad
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 291 - 298
  • [24] Improved content-based brain tumor retrieval for magnetic resonance images using weight initialization framework with densely connected deep neural network
    Singh, Vibhav Prakash
    Verma, Aman
    Singh, Dushyant Kumar
    Maurya, Ritesh
NEURAL COMPUTING & APPLICATIONS, 2023
  • [25] Improving convergence and solution quality of Hopfield-type neural networks with augmented Lagrange multipliers
    Li, SZ
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1996, 7 (06): 1507 - 1516
  • [26] ARTIFICIAL NEURAL NETWORKS USING MOS ANALOG MULTIPLIERS
    HOLLIS, PW
    PAULOS, JJ
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 1990, 25 (03) : 849 - 855
  • [27] Training Binarized Neural Networks Using Ternary Multipliers
    Ardakani, Amir
    Ardakani, Arash
    Gross, Warren J.
    IEEE DESIGN & TEST, 2021, 38 (06) : 44 - 52
  • [28] Mutual information based weight initialization method for sigmoidal feedforward neural networks
    Qiao, Junfei
    Li, Sanyi
    Li, Wenjing
    NEUROCOMPUTING, 2016, 207 : 676 - 683
  • [29] Domain adaptation and weight initialization of neural networks for diagnosing interstitial lung diseases
    Thorat, Onkar
    Salvi, Siddharth
    Dedhia, Shrey
    Bhadane, Chetashri
    Dongre, Deepika
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2022, 32 (05) : 1535 - 1547
  • [30] An inhibitory weight initialization improves the speed and quality of recurrent neural networks learning
    Draye, JP
    Pavisic, D
    Cheron, G
    Libert, G
    NEUROCOMPUTING, 1997, 16 (03) : 207 - 224