Spatially heterogeneous learning by a deep student machine

Authors
Yoshino, Hajime [1, 2]
Affiliations
[1] Osaka Univ, Cybermedia Ctr, Toyonaka, Osaka 5600043, Japan
[2] Osaka Univ, Grad Sch Sci, Toyonaka, Osaka 5600043, Japan
Source
PHYSICAL REVIEW RESEARCH | 2023 / Vol. 5 / Issue 3
Keywords
NEURAL-NETWORK; TRANSITION; STATES; SPACE;
DOI
10.1103/PhysRevResearch.5.033068
Chinese Library Classification
O4 [Physics];
Discipline classification code
0702;
Abstract
Despite spectacular successes, deep neural networks (DNNs) with a huge number of adjustable parameters remain largely black boxes. To shed light on the hidden layers of DNNs, we study supervised learning by a DNN of width N and depth L consisting of NL perceptrons with c inputs by a statistical mechanics approach called the teacher-student setting. We consider an ensemble of student machines that exactly reproduce M sets of N-dimensional input/output relations provided by a teacher machine. We show that the statistical mechanics problem becomes exactly solvable in a high-dimensional limit which we call a "dense limit": N ≫ c ≫ 1 and M ≫ 1 with fixed α = M/c, using the replica method developed by Yoshino [SciPost Phys. Core 2, 005 (2020)]. In conjunction with the theoretical study, we also study the model numerically by performing simple greedy Monte Carlo simulations. Simulations reveal that learning by the DNN is quite heterogeneous in the network space: configurations of the teacher and the student machines are more correlated within the layers closer to the input/output boundaries, while the central region remains much less correlated due to the overparametrization, in qualitative agreement with the theoretical prediction. We evaluate the generalization error of the DNN with various depths L both theoretically and numerically. Remarkably, both the theory and the simulation suggest that the generalization ability of the student machines, which are only weakly correlated with the teacher in the center, does not vanish even in the deep limit L ≫ 1, where the system becomes heavily overparametrized. We also consider the impact of the effective dimension D(< N) of data by incorporating the hidden manifold model [Goldt, Mézard, Krzakala, and Zdeborová, Phys. Rev. X 10, 041044 (2020)] into our model. Replica theory implies that the loop corrections to the dense limit, which reflect correlations between different nodes in the network, become enhanced by either decreasing the width N or decreasing the effective dimension D of the data. Simulation suggests that both lead to significant improvements in generalization ability.
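
The teacher-student setting and the greedy Monte Carlo procedure described in the abstract can be illustrated with a toy sketch. This is not the author's code: it assumes fully connected layers (so c = N), sign activations, weight rows normalized to norm sqrt(N), and a simple accept-if-not-worse update on the count of mismatched training outputs. All function names and parameter values are hypothetical, and the naive layer overlap ignores the permutation symmetry of hidden units.

```python
import numpy as np

def random_machine(rng, depth, width):
    """Random weights of shape (depth, width, width), each row of norm sqrt(width)."""
    J = rng.standard_normal((depth, width, width))
    return J * np.sqrt(width) / np.linalg.norm(J, axis=2, keepdims=True)

def forward(weights, x):
    """Propagate +/-1 inputs x (shape M x N) through layers of sign perceptrons."""
    s = x
    for J in weights:
        s = np.sign(s @ J.T)
        s[s == 0] = 1.0  # break ties so that every unit stays +/-1
    return s

def training_errors(weights, x, y):
    """Number of output units that disagree with the teacher's outputs."""
    return int(np.sum(forward(weights, x) != y))

def greedy_monte_carlo(student, x, y, rng, steps, step_size=0.5):
    """Zero-temperature updates: perturb one weight row, keep it unless errors increase."""
    depth, width, _ = student.shape
    errors = training_errors(student, x, y)
    for _ in range(steps):
        l, i = rng.integers(depth), rng.integers(width)
        old_row = student[l, i].copy()
        new_row = old_row + step_size * rng.standard_normal(width)
        student[l, i] = new_row * np.sqrt(width) / np.linalg.norm(new_row)
        new_errors = training_errors(student, x, y)
        if new_errors <= errors:
            errors = new_errors
        else:
            student[l, i] = old_row  # reject the move
    return errors

def layer_overlaps(teacher, student):
    """Mean cosine similarity between teacher and student weight rows, per layer."""
    _, width, fan_in = teacher.shape
    return np.einsum("lij,lij->l", teacher, student) / (width * fan_in)

# Toy sizes only (assumed values); the paper's dense limit needs N >> c >> 1 and M >> 1.
rng = np.random.default_rng(0)
N, L, M = 8, 3, 16
teacher = random_machine(rng, L, N)
student = random_machine(rng, L, N)
x = rng.choice([-1.0, 1.0], size=(M, N))
y = forward(teacher, x)

initial_errors = training_errors(student, x, y)
final_errors = greedy_monte_carlo(student, x, y, rng, steps=200)
print("training errors:", initial_errors, "->", final_errors)
print("teacher-student overlap per layer:", layer_overlaps(teacher, student))
```

In this toy setup the per-layer overlap plays the role of the teacher-student correlation discussed above; at such small sizes it should not be expected to reproduce the paper's spatial profile.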
Pages: 28