The Loss Surface of Deep and Wide Neural Networks

Cited: 0
Authors
Nguyen, Quynh [1]
Hein, Matthias [1]
Affiliations
[1] Saarland Univ, Dept Math & Comp Sci, Saarbrucken, Germany
Funding
European Research Council
Keywords
LOCAL MINIMA
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is because all local minima are close to being globally optimal. We show that this is (almost) true: in fact, almost all local minima are globally optimal for a fully connected network with squared loss and an analytic activation function, provided that the number of hidden units of one layer of the network is larger than the number of training points and the network structure from this layer on is pyramidal.
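As a rough formalization of the abstract's condition (a sketch only; the symbols N for the number of training points, L for the depth, and n_k for the width of layer k are assumed notation here, not quoted from the paper), the guarantee applies whenever

    \exists\, k \in \{1, \dots, L\} : \quad n_k \ge N \quad \text{and} \quad n_k \ge n_{k+1} \ge \cdots \ge n_L,

i.e., some hidden layer is at least as wide as the training set, and the widths are non-increasing from that layer to the output.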
Pages: 10
Related Papers
50 records in total
  • [21] Comparison of Loss Functions for Training of Deep Neural Networks in Shogi
    Zhu, Hanhua
    Kaneko, Tomoyuki
    2018 CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI), 2018 : 18 - 23
  • [22] Memorization in Deep Neural Networks: Does the Loss Function Matter?
    Patel, Deep
    Sastry, P. S.
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II, 2021, 12713 : 131 - 142
  • [23] Progressive loss functions for speech enhancement with deep neural networks
    Llombart, Jorge
    Ribas, Dayana
    Miguel, Antonio
    Vicente, Luis
    Ortega, Alfonso
    Lleida, Eduardo
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [24] Progressive loss functions for speech enhancement with deep neural networks
    Llombart, Jorge
    Ribas, Dayana
    Miguel, Antonio
    Vicente, Luis
    Ortega, Alfonso
    Lleida, Eduardo
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [25] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [26] Multitask Deep Neural Networks for Tele-Wide Stereo Matching
    El-Khamy, Mostafa
    Ren, Haoyu
    Du, Xianzhi
    Lee, Jungwon
    IEEE ACCESS, 2020, 8 : 184383 - 184398
  • [27] Deep or Wide? Learning Policy and Value Neural Networks for Combinatorial Games
    Edelkamp, Stefan
    COMPUTER GAMES: 5TH WORKSHOP ON COMPUTER GAMES, CGW 2016, AND 5TH WORKSHOP ON GENERAL INTELLIGENCE IN GAME-PLAYING AGENTS, GIGA 2016, HELD IN CONJUNCTION WITH THE 25TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2016, NEW YORK, USA, JULY 9-10, 2016, 2017, 705 : 19 - 33
  • [28] Wide deep residual networks in networks
    Alaeddine, Hmidi
    Jihene, Malek
    Multimedia Tools and Applications, 2023, 82 : 7889 - 7899
  • [29] Wide deep residual networks in networks
    Alaeddine, Hmidi
    Jihene, Malek
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (05) : 7889 - 7899
  • [30] Better Loss Landscape Visualization for Deep Neural Networks with Trajectory Information
    Ding, Ruiqi
    Li, Tao
    Huang, Xiaolin
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222