How regularization affects the critical points in linear networks

Cited by: 0
Authors
Taghvaei, Amirhossein [1]
Kim, Jin W. [1]
Mehta, Prashant G. [1]
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Keywords
NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper is concerned with the problem of representing and learning a linear transformation using a linear neural network. In recent years, there has been growing interest in the study of such networks, in part due to the successes of deep learning. The main question of this body of research (and also of our paper) concerns the existence and optimality properties of the critical points of the mean-squared loss function. An additional primary concern of our paper is the robustness of these critical points in the face of (a small amount of) regularization. An optimal control model is introduced for this purpose, and a learning algorithm (backprop with weight decay) is derived for it using Hamilton's formulation of optimal control. The formulation is used to provide a complete characterization of the critical points in terms of the solutions of a nonlinear matrix-valued equation, referred to as the characteristic equation. Analytical and numerical tools from bifurcation theory are used to compute the critical points via the solutions of the characteristic equation.
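As a rough illustration of the setting the abstract describes, the sketch below runs plain gradient descent with weight decay on a finite-depth linear network. It is not the authors' optimal control formulation or their characteristic equation; it is a discrete stand-in under stated assumptions: whitened inputs (so the mean-squared loss reduces to a Frobenius-norm distance between the layer product and the target map `R`), square layers, and hypothetical names and hyperparameters (`train`, `lam`, `lr`, `steps`) chosen only for this example.

```python
import numpy as np


def chain_product(mats, d):
    """Return mats[-1] @ ... @ mats[0] (the identity for an empty list)."""
    P = np.eye(d)
    for W in mats:
        P = W @ P
    return P


def regularized_loss(weights, R, lam):
    """0.5*||W_n ... W_1 - R||_F^2 + 0.5*lam*sum_i ||W_i||_F^2.

    Assumes whitened inputs, so the mean-squared loss depends only on
    the end-to-end product of the layers.
    """
    d = R.shape[0]
    err = chain_product(weights, d) - R
    penalty = sum(np.sum(W ** 2) for W in weights)
    return 0.5 * np.sum(err ** 2) + 0.5 * lam * penalty


def gradients(weights, R, lam):
    """Backprop through the layer product, plus the weight-decay term."""
    d = R.shape[0]
    err = chain_product(weights, d) - R
    grads = []
    for i, W in enumerate(weights):
        below = chain_product(weights[:i], d)      # layers applied before W
        above = chain_product(weights[i + 1:], d)  # layers applied after W
        grads.append(above.T @ err @ below.T + lam * W)
    return grads


def train(R, n_layers=3, lam=1e-3, lr=0.05, steps=2000, seed=0):
    """Gradient descent from a small random perturbation of the identity."""
    rng = np.random.default_rng(seed)
    d = R.shape[0]
    weights = [np.eye(d) + 0.1 * rng.standard_normal((d, d))
               for _ in range(n_layers)]
    for _ in range(steps):
        grads = gradients(weights, R, lam)
        weights = [W - lr * g for W, g in zip(weights, grads)]
    return weights


if __name__ == "__main__":
    R = np.diag([2.0, 0.5])  # hypothetical target linear transformation
    weights = train(R)
    residual = np.linalg.norm(chain_product(weights, R.shape[0]) - R)
    grad_norm = max(np.linalg.norm(g) for g in gradients(weights, R, 1e-3))
    print("residual:", residual, "largest layer gradient norm:", grad_norm)
```

A point where every layer gradient vanishes is a critical point of the regularized loss. Rerunning with different values of `lam` gives an informal view of how the point reached by gradient descent shifts as the regularization strength changes.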
Pages: 11
Related papers
50 in total
  • [1] Function Space and Critical Points of Linear Convolutional Networks
    Kohn, Kathlen
    Montufar, Guido
    Shahverdi, Vahid
    Trager, Matthew
    SIAM JOURNAL ON APPLIED ALGEBRA AND GEOMETRY, 2024, 8 (02): 333-362
  • [2] Linear response despite critical points
    Baladi, Viviane
    NONLINEARITY, 2008, 21 (06): T81-T90
  • [3] Critical points in the linear σ model with quarks
    Bowman, E. S.
    Kapusta, J. I.
    PHYSICAL REVIEW C, 2009, 79 (01)
  • [4] Regularization learning and early stopping in linear networks
    Hagiwara, K
    Kuno, K
    IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL IV, 2000: 511-516
  • [5] Critical points for random Boolean networks
    Lynch, JF
    PHYSICA D-NONLINEAR PHENOMENA, 2002, 172 (1-4): 49-64
  • [8] Detection and classification of critical points for linear metamorphosis
    Nieda, T
    Pasko, A
    Kunii, TL
    2004 INTERNATIONAL CONFERENCE ON CYBERWORLDS, PROCEEDINGS, 2004: 384-391
  • [9] Neural Networks with Comparatively Few Critical Points
    Nitta, Tohru
    17TH INTERNATIONAL CONFERENCE IN KNOWLEDGE BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS - KES2013, 2013, 22: 269-275
  • [10] Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
    Brechet, Pierre
    Papagiannouli, Katerina
    An, Jing
    Montufar, Guido
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202