Training Linear Neural Networks: Non-Local Convergence and Complexity Results

Cited by: 0
Authors: Eftekhari, Armin [1]
Affiliations: [1] Umea Univ, Dept Math & Math Stat, Umea, Sweden
Keywords: PRINCIPAL COMPONENTS; MATRIX; APPROXIMATION
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract:
Linear networks provide valuable insights into the workings of neural networks in general. This paper identifies conditions under which gradient flow provably trains a linear network, despite the non-strict saddle points present in the optimization landscape. It also establishes the computational complexity of training linear networks with gradient flow. To achieve these results, this work develops machinery to provably identify the stable set of gradient flow, which in turn enables improvements over the state of the art in the literature on linear networks (Bah et al., 2019; Arora et al., 2018a). Crucially, these results appear to be the first to break away from the lazy training regime that has dominated the neural network literature. This work requires the network to have a layer with a single neuron, which subsumes networks with a scalar output; extending these theoretical results to all linear networks remains a challenging open problem.
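The abstract concerns gradient flow on deep linear networks with a scalar-output layer, whose landscape is non-convex with (non-strict) saddle points even though the end-to-end map is linear. As a minimal illustrative sketch only (not the paper's construction, stable-set machinery, or proof technique; all names and hyperparameters below are assumptions chosen for illustration), one can simulate gradient flow via its Euler discretization on a two-layer linear network with scalar output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target linear map with scalar output: y = a^T x (realizable by the network).
d = 5
a = rng.normal(size=d)
X = rng.normal(size=(200, d))
y = X @ a

# Two-layer linear network f(x) = w2^T (W1 x). The end-to-end map W1^T w2 is
# linear in x, but the loss landscape in (W1, w2) is non-convex, with a
# non-strict saddle at the origin; a small non-zero initialization avoids it.
h = 8
W1 = 0.1 * rng.normal(size=(h, d))
w2 = 0.1 * rng.normal(size=h)

def loss(W1, w2):
    pred = X @ W1.T @ w2
    return 0.5 * np.mean((pred - y) ** 2)

# Forward-Euler discretization of the gradient flow dW/dt = -grad L(W),
# with a small step size standing in for the continuous-time dynamics.
eta = 1e-2
for _ in range(5000):
    pred = X @ W1.T @ w2              # network outputs, shape (200,)
    r = (pred - y) / len(y)           # residual scaled by 1/n
    grad_w2 = W1 @ (X.T @ r)          # dL/dw2 = sum_i r_i * (W1 x_i)
    grad_W1 = np.outer(w2, X.T @ r)   # dL/dW1 = sum_i r_i * (w2 x_i^T)
    w2 -= eta * grad_w2
    W1 -= eta * grad_W1

print(loss(W1, w2))  # final loss; small if the flow escaped the saddle
```

In this realizable toy setting the discretized flow typically drives the loss to (numerically) zero, even though the objective is non-convex in the factored parameters.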
Pages: 12