Spatially heterogeneous learning by a deep student machine

Cited by: 0
Author: Yoshino, Hajime [1,2]
Affiliations:
[1] Osaka Univ, Cybermedia Ctr, Toyonaka, Osaka 5600043, Japan
[2] Osaka Univ, Grad Sch Sci, Toyonaka, Osaka 5600043, Japan
Source: PHYSICAL REVIEW RESEARCH | 2023, Vol. 5, Issue 3
Keywords: NEURAL-NETWORK; TRANSITION; STATES; SPACE
DOI: 10.1103/PhysRevResearch.5.033068
Chinese Library Classification: O4 [Physics]
Discipline code: 0702
Abstract:
Despite spectacular successes, deep neural networks (DNNs) with a huge number of adjustable parameters remain largely black boxes. To shed light on the hidden layers of DNNs, we study supervised learning by a DNN of width N and depth L, consisting of NL perceptrons with c inputs each, via a statistical mechanics approach called the teacher-student setting. We consider an ensemble of student machines that exactly reproduce M sets of N-dimensional input/output relations provided by a teacher machine. We show that the statistical mechanics problem becomes exactly solvable in a high-dimensional limit which we call a "dense limit": N >> c >> 1 and M >> 1 with fixed α = M/c, using the replica method developed by Yoshino [SciPost Phys. Core 2, 005 (2020)]. In conjunction with the theoretical study, we also study the model numerically by performing simple greedy Monte Carlo simulations. The simulations reveal that learning by the DNN is quite heterogeneous in the network space: configurations of the teacher and the student machines are more strongly correlated within the layers closer to the input/output boundaries, while the central region remains much less correlated due to overparametrization, in qualitative agreement with the theoretical prediction. We evaluate the generalization error of the DNN for various depths L both theoretically and numerically. Remarkably, both the theory and the simulations suggest that the generalization ability of the student machines, which are only weakly correlated with the teacher in the center, does not vanish even in the deep limit L >> 1, where the system becomes heavily overparametrized. We also consider the impact of the effective dimension D (≤ N) of the data by incorporating the hidden manifold model [Goldt, Mézard, Krzakala, and Zdeborová, Phys. Rev. X 10, 041044 (2020)] into our model. Replica theory implies that the loop corrections to the dense limit, which reflect correlations between different nodes of the network, are enhanced by decreasing either the width N or the effective dimension D of the data. Simulations suggest that both lead to significant improvements in generalization ability.
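The abstract's setup lends itself to a short illustration: below is a minimal, hypothetical Python sketch of the teacher-student setting with greedy (zero-temperature) Monte Carlo, as described above. It is not the paper's code; the sparse random wiring, the Ising (±1) weights, the sign activations, and all parameter values are illustrative assumptions chosen for brevity.

```python
# Hypothetical sketch of a teacher-student DNN of width N and depth L,
# built from N*L perceptrons with c inputs each, trained by greedy MC.
import numpy as np

rng = np.random.default_rng(0)
N, L, c, M = 64, 8, 16, 200   # width, depth, inputs per perceptron, patterns

def make_wiring():
    # Each of the N perceptrons in each of the L layers reads c randomly
    # chosen nodes of the previous layer (an assumed sparse wiring).
    return [rng.integers(0, N, size=(N, c)) for _ in range(L)]

def random_weights():
    return [rng.choice([-1.0, 1.0], size=(N, c)) for _ in range(L)]

def forward(wiring, weights, X):
    """Propagate a batch of +/-1 inputs (shape: M x N) through the network."""
    for idx, w in zip(wiring, weights):
        pre = (X[:, idx] * w).sum(axis=2)     # pre-activations, shape (M, N)
        X = np.where(pre >= 0.0, 1.0, -1.0)   # sign activation
    return X

# The teacher defines M input/output pairs; the student shares the wiring
# but starts from independent random weights.
wiring = make_wiring()
teacher_w, student_w = random_weights(), random_weights()
X = rng.choice([-1.0, 1.0], size=(M, N))
Y = forward(wiring, teacher_w, X)

def train_error(weights):
    return np.mean(forward(wiring, weights, X) != Y)

# Greedy MC: flip one random weight and keep the flip only if the training
# error does not increase (a zero-temperature Metropolis rule).
err = train_error(student_w)
for step in range(5000):
    l, i, j = rng.integers(L), rng.integers(N), rng.integers(c)
    student_w[l][i, j] *= -1.0
    new_err = train_error(student_w)
    if new_err <= err:
        err = new_err                  # accept the flip
    else:
        student_w[l][i, j] *= -1.0     # reject: undo the flip
print(f"final training error: {err:.3f}")
```

Measuring the layer-by-layer overlap between teacher_w and student_w after such a run is one way to probe the spatial heterogeneity the abstract describes: strong correlation near the input/output boundaries, weak correlation in the center.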
Pages: 28