Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

Times Cited: 0
Authors
Bai, Zhiwei [1 ]
Luo, Tao [1 ,2 ]
Xu, Zhi-Qin John [1 ]
Zhang, Yaoyu [1 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Math Sci, Inst Nat Sci, Shanghai 200240, Peoples R China
[2] CMA Shanghai, Shanghai Artificial Intelligence Lab, Shanghai 200240, Peoples R China
[3] Shanghai Ctr Brain Sci & Brain Inspired Technol, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Deep learning; loss landscape; embedding principle;
DOI
10.4208/csiam-am.SO-2023-0020
CLC Number
O29 [Applied Mathematics]
Discipline Code
070104
Abstract
In this work, we delve into the relationship between deep and shallow neural networks (NNs), focusing on the critical points of their loss landscapes. We discover an embedding principle in depth: the loss landscape of an NN "contains" all critical points of the loss landscapes of shallower NNs. The key tool for our discovery is the critical lifting, which maps any critical point of a shallower network to critical manifolds of any deeper network while preserving the network output. To investigate the practical implications of this principle, we conduct a series of numerical experiments. The results confirm that deep networks do encounter these lifted critical points during training, leading to similar training dynamics across varying network depths. We provide theoretical and empirical evidence that, through the lifting operation, the lifted critical points exhibit increased degeneracy. This principle also offers insights into the optimization benefits of batch normalization and larger datasets, and enables practical applications such as network layer pruning. Overall, our discovery of the embedding principle in depth uncovers the depth-wise hierarchical structure of the deep learning loss landscape, which serves as a solid foundation for further study of the role of depth in DNNs.
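The abstract's lifting maps a shallower network's parameters into a deeper network without changing the output. The following sketch is not taken from the paper; it is a minimal illustration of the output-preserving part of such a construction under an assumed ReLU architecture: inserting an extra layer with identity weight and zero bias after a ReLU layer leaves the network function unchanged, because ReLU activations are nonnegative and relu(h) = h for h >= 0. The paper's critical lifting additionally preserves criticality and maps a point to a critical manifold; this sketch only verifies function preservation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Shallow net: input -> hidden ReLU layer of width m -> linear output.
d_in, m, d_out = 3, 4, 2
W1, b1 = rng.normal(size=(m, d_in)), rng.normal(size=m)
W2, b2 = rng.normal(size=(d_out, m)), rng.normal(size=d_out)

def shallow(x):
    return W2 @ relu(W1 @ x + b1) + b2

# Lifted deeper net: insert an identity block (weight I, bias 0)
# after the ReLU layer.  Since h = relu(W1 @ x + b1) >= 0 entrywise,
# relu(I @ h + 0) == h, so the network function is unchanged.
W_mid, b_mid = np.eye(m), np.zeros(m)

def lifted(x):
    h = relu(W1 @ x + b1)
    h = relu(W_mid @ h + b_mid)   # acts as identity on nonnegative h
    return W2 @ h + b2

x = rng.normal(size=d_in)
print("outputs match:", np.allclose(shallow(x), lifted(x)))
```

The identity-block insertion is one simple instance of a depth-wise embedding; the paper characterizes the full set of such liftings and shows that the lifted points inherit, and typically increase, the degeneracy of the original critical point.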
Pages: 350-389 (40 pages)