The Loss Surface of Deep and Wide Neural Networks

Cited: 0
Authors
Nguyen, Quynh [1]
Hein, Matthias [1]
Affiliations
[1] Saarland Univ, Dept Math & Comp Sci, Saarbrucken, Germany
Funding
European Research Council
Keywords
LOCAL MINIMA
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is because all local minima are close to being globally optimal. We show that this is (almost) true: in fact, almost all local minima are globally optimal for a fully connected network with squared loss and an analytic activation function, provided that the number of hidden units of one layer of the network is larger than the number of training points and the network structure from this layer on is pyramidal.
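As a rough formalization of the abstract's condition (a sketch only; the symbols N for the number of training points, L for the depth, and n_k for the width of layer k are assumed notation here, not quoted from the paper), the guarantee applies whenever

    \exists\, k \in \{1, \dots, L\} : \quad n_k \ge N \quad \text{and} \quad n_k \ge n_{k+1} \ge \cdots \ge n_L,

i.e., some hidden layer is at least as wide as the training set, and the widths are non-increasing from that layer to the output.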
Pages: 10
Related Papers
50 records in total
  • [21] Comparison of Loss Functions for Training of Deep Neural Networks in Shogi
    Zhu, Hanhua
    Kaneko, Tomoyuki
    2018 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2018 : 18 - 23
  • [22] Memorization in Deep Neural Networks: Does the Loss Function Matter?
    Patel, Deep
    Sastry, P. S.
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 131 - 142
  • [23] Progressive loss functions for speech enhancement with deep neural networks
    Llombart, Jorge
    Ribas, Dayana
    Miguel, Antonio
    Vicente, Luis
    Ortega, Alfonso
    Lleida, Eduardo
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [24] Progressive loss functions for speech enhancement with deep neural networks
    Llombart, Jorge
    Ribas, Dayana
    Miguel, Antonio
    Vicente, Luis
    Ortega, Alfonso
    Lleida, Eduardo
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [25] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Multitask Deep Neural Networks for Tele-Wide Stereo Matching
    El-Khamy, Mostafa
    Ren, Haoyu
    Du, Xianzhi
    Lee, Jungwon
    IEEE ACCESS, 2020, 8 : 184383 - 184398
  • [27] Deep or Wide? Learning Policy and Value Neural Networks for Combinatorial Games
    Edelkamp, Stefan
    COMPUTER GAMES: 5TH WORKSHOP ON COMPUTER GAMES, CGW 2016, AND 5TH WORKSHOP ON GENERAL INTELLIGENCE IN GAME-PLAYING AGENTS, GIGA 2016, HELD IN CONJUNCTION WITH THE 25TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2016, NEW YORK, USA, JULY 9-10, 2016, 2017, 705 : 19 - 33
  • [28] Wide deep residual networks in networks
    Alaeddine, Hmidi
    Jihene, Malek
    Multimedia Tools and Applications, 2023, 82 : 7889 - 7899
  • [29] Wide deep residual networks in networks
    Alaeddine, Hmidi
    Jihene, Malek
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (05) : 7889 - 7899
  • [30] Better Loss Landscape Visualization for Deep Neural Networks with Trajectory Information
    Ding, Ruiqi
    Li, Tao
    Huang, Xiaolin
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222