Spatially heterogeneous learning by a deep student machine

Cited: 0
Authors
Yoshino, Hajime [1,2]
Affiliations
[1] Osaka Univ, Cybermedia Ctr, Toyonaka, Osaka 5600043, Japan
[2] Osaka Univ, Grad Sch Sci, Toyonaka, Osaka 5600043, Japan
Source
PHYSICAL REVIEW RESEARCH, 2023, Vol. 5, No. 03
Keywords
NEURAL-NETWORK; TRANSITION; STATES; SPACE
DOI
10.1103/PhysRevResearch.5.033068
Chinese Library Classification (CLC)
O4 [Physics]
Discipline Code
0702
Abstract
Despite spectacular successes, deep neural networks (DNNs) with a huge number of adjustable parameters remain largely black boxes. To shed light on the hidden layers of DNNs, we study supervised learning by a DNN of width N and depth L, consisting of NL perceptrons with c inputs each, by a statistical mechanics approach called the teacher-student setting. We consider an ensemble of student machines that exactly reproduce M sets of N-dimensional input/output relations provided by a teacher machine. We show that the statistical mechanics problem becomes exactly solvable in a high-dimensional limit which we call a "dense limit": N >> c >> 1 and M >> 1 with fixed α = M/c, using the replica method developed by Yoshino [SciPost Phys. Core 2, 005 (2020)]. In conjunction with the theoretical study, we also study the model numerically, performing simple greedy Monte Carlo simulations. Simulations reveal that learning by the DNN is quite heterogeneous in the network space: configurations of the teacher and the student machines are more correlated within the layers closer to the input/output boundaries, while the central region remains much less correlated due to the overparametrization, in qualitative agreement with the theoretical prediction. We evaluate the generalization error of the DNN with various depths L both theoretically and numerically. Remarkably, both the theory and the simulation suggest that the generalization ability of the student machines, which are only weakly correlated with the teacher in the center, does not vanish even in the deep limit L >> 1, where the system becomes heavily overparametrized. We also consider the impact of the effective dimension D (≤ N) of the data by incorporating the hidden manifold model [Goldt, Mézard, Krzakala, and Zdeborová, Phys. Rev. X 10, 041044 (2020)] into our model. Replica theory implies that the loop corrections to the dense limit, which reflect correlations between different nodes in the network, become enhanced by either decreasing the width N or decreasing the effective dimension D of the data. Simulation suggests that both lead to significant improvements in generalization ability.
Pages: 28
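
As a concrete illustration of the setup described in the abstract, the following Python/NumPy sketch builds a teacher and a student machine from layers of sign-activation perceptrons and trains the student by a simple greedy (zero-temperature) Monte Carlo that flips one weight at a time. This is only a minimal sketch under assumed simplifications, not the paper's exact model: binary +/-1 weights, full connectivity within each layer (so c = N), small width and depth, and a bit-wise training error as the cost. The helper names (make_machine, forward, train_error) and all parameter values are illustrative.

# Minimal teacher-student sketch (illustrative assumptions, not the paper's exact model):
# deep network of sign-activation perceptrons with binary +/-1 weights, trained by
# greedy single-weight-flip Monte Carlo that accepts a flip only if the training
# error does not increase.
import numpy as np

rng = np.random.default_rng(0)

N, L, M = 16, 4, 200          # width, depth, number of training examples (illustrative)

def make_machine():
    """Random +/-1 weights for each of the L layers (one N x N matrix per layer)."""
    return [rng.choice([-1.0, 1.0], size=(N, N)) for _ in range(L)]

def forward(weights, x):
    """Propagate +/-1 inputs x (shape M x N) through L sign-perceptron layers."""
    s = x
    for W in weights:          # row i of W holds the weights of perceptron i
        s = np.sign(s @ W.T / np.sqrt(N))
        s[s == 0] = 1.0        # break ties deterministically
    return s

def train_error(weights, x, y):
    """Fraction of output bits on which the machine disagrees with the labels y."""
    return np.mean(forward(weights, x) != y)

# Teacher machine and the training data it labels.
teacher = make_machine()
x_train = rng.choice([-1.0, 1.0], size=(M, N))
y_train = forward(teacher, x_train)

# Student machine trained by greedy single-flip Monte Carlo.
student = make_machine()
err = train_error(student, x_train, y_train)
for step in range(20000):
    l, i, j = rng.integers(L), rng.integers(N), rng.integers(N)
    student[l][i, j] *= -1.0                       # propose flipping one weight
    new_err = train_error(student, x_train, y_train)
    if new_err <= err:
        err = new_err                              # accept: error did not increase
    else:
        student[l][i, j] *= -1.0                   # reject: undo the flip
    if err == 0.0:
        break

# Generalization: compare teacher and student on fresh random inputs.
x_test = rng.choice([-1.0, 1.0], size=(M, N))
gen_err = np.mean(forward(student, x_test) != forward(teacher, x_test))
print(f"training error {err:.3f}, generalization error {gen_err:.3f}")

The final comparison on fresh inputs mimics the generalization-error measurement discussed in the abstract; layer-resolved teacher-student overlaps, the dense-limit scaling N >> c >> 1, and the hidden manifold data are not reproduced here.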