Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

Authors
Bai, Zhiwei [1]
Luo, Tao [1, 2]
Xu, Zhi-Qin John [1]
Zhang, Yaoyu [1, 3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Math Sci, Inst Nat Sci, Shanghai 200240, Peoples R China
[2] CMA Shanghai, Shanghai Artificial Intelligence Lab, Shanghai 200240, Peoples R China
[3] Shanghai Ctr Brain Sci & Brain Inspired Technol, Shanghai 200240, Peoples R China
Source
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Deep learning; loss landscape; embedding principle;
DOI
10.4208/csiam-am.SO-2023-0020
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this work, we examine the relationship between deep and shallow neural networks (NNs), focusing on the critical points of their loss landscapes. We discover an embedding principle in depth: the loss landscape of an NN "contains" all critical points of the loss landscapes of shallower NNs. The key tool for this discovery is the critical lifting, which maps any critical point of a network to critical manifolds of any deeper network while preserving the outputs. To investigate the practical implications of this principle, we conduct a series of numerical experiments. The results confirm that deep networks do encounter these lifted critical points during training, leading to similar training dynamics across varying network depths. We provide theoretical and empirical evidence that the lifting operation increases the degeneracy of the lifted critical points. This principle also provides insights into the optimization benefits of batch normalization and larger datasets, and enables practical applications such as network layer pruning. Overall, the embedding principle in depth uncovers the depth-wise hierarchical structure of the deep learning loss landscape, which serves as a solid foundation for further study of the role of depth in DNNs.
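The output-preserving aspect of lifting a network to a deeper one can be illustrated with a minimal sketch. It assumes ReLU activations and inserts an identity layer with zero bias after a hidden layer; because hidden ReLU outputs are non-negative, the deeper network computes the same function. This is not the critical lifting defined in the paper: the sketch does not verify that critical points map to critical points, and all names (`forward`, `insert_identity_layer`, `random_layer`) are hypothetical.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def forward(layers, x):
    """Fully connected net; layers is a list of (W, b). ReLU on all but the last layer."""
    h = x
    for i, (W, b) in enumerate(layers):
        z = [s + bi for s, bi in zip(matvec(W, h), b)]
        h = relu(z) if i < len(layers) - 1 else z
    return h


def insert_identity_layer(layers, after):
    """Return a deeper net with an identity hidden layer inserted after hidden layer `after`.

    Assumption: ReLU activations, so the inserted layer sees non-negative inputs
    and relu(I h + 0) == h, which keeps the network output unchanged.
    """
    if not 0 <= after < len(layers) - 1:
        raise ValueError("`after` must index a hidden layer")
    width = len(layers[after][0])  # output width of the chosen hidden layer
    identity = [[1.0 if i == j else 0.0 for j in range(width)] for i in range(width)]
    zero_bias = [0.0] * width
    return layers[: after + 1] + [(identity, zero_bias)] + layers[after + 1 :]


def random_layer(n_out, n_in, rng):
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return W, b


rng = random.Random(0)
shallow = [random_layer(4, 3, rng), random_layer(2, 4, rng)]  # one hidden layer
deep = insert_identity_layer(shallow, after=0)                # two hidden layers

x = [0.5, -1.2, 2.0]
y_shallow = forward(shallow, x)
y_deep = forward(deep, x)
max_diff = max(abs(a - b) for a, b in zip(y_shallow, y_deep))
print(len(shallow), len(deep), max_diff)
```

The identity insertion is chosen only because it is the simplest way to add depth without changing the function under ReLU; other activations would need a different construction.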
Pages: 350-389
Number of pages: 40