Redundant representations help generalization in wide neural networks

Cited by: 0
Authors
Doimo, Diego [1 ]
Glielmo, Aldo [2 ]
Goldt, Sebastian [1 ]
Laio, Alessandro [1 ]
Affiliations
[1] Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
[2] Bank of Italy, Rome, Italy
Keywords
DOI
None available
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) defy the classical bias-variance trade-off: adding parameters to a DNN that interpolates its training data will typically improve its generalization performance. Explaining the mechanism behind this "benign overfitting" in deep networks remains an outstanding challenge. Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise. The number of such groups increases linearly with the width of the layer, but only if the width is above a critical value. We show that redundant neurons appear only when the training is regularized and the training error is zero.
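The redundancy described in the abstract can be probed with a simple correlation-clustering check on the activations of the last hidden layer. The Python sketch below is an illustration under assumptions, not the authors' actual analysis: the function name count_redundant_groups, the 0.95 correlation threshold, and the toy data are all invented for demonstration.

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def count_redundant_groups(activations, threshold=0.95):
    """Count groups of neurons whose activations are near-duplicates.

    activations: (n_samples, n_neurons) last-hidden-layer outputs.
    threshold: absolute correlation above which two neurons are
        treated as carrying identical information up to noise.
    """
    # Pearson correlation between neurons (columns of the matrix).
    corr = np.corrcoef(activations, rowvar=False)
    # Convert to a distance: identical neurons sit at distance 0.
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    # Condensed upper-triangular distances for hierarchical clustering.
    condensed = dist[np.triu_indices_from(dist, k=1)]
    # Average-linkage dendrogram, cut at distance 1 - threshold.
    tree = linkage(condensed, method="average")
    labels = fcluster(tree, t=1.0 - threshold, criterion="distance")
    return len(np.unique(labels))

# Toy check: 8 independent signals, each duplicated 4 times with small
# noise, should collapse 32 "neurons" into roughly 8 redundant groups.
rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 8))
acts = np.repeat(signal, 4, axis=1) + 0.01 * rng.normal(size=(1000, 32))
print(count_redundant_groups(acts))  # expected output: 8

Repeating such a count at increasing layer widths would, per the abstract, show the number of groups growing linearly with width once the width exceeds a critical value.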
Pages: 14
Related papers (50 in total)
• [1] Doimo, Diego; Glielmo, Aldo; Goldt, Sebastian; Laio, Alessandro. Redundant representations help generalization in wide neural networks. Journal of Statistical Mechanics: Theory and Experiment, 2023, 2023(11).
• [2] Li, Qianyi; Sorscher, Ben; Sompolinsky, Haim. Representations and generalization in artificial and brain neural networks. Proceedings of the National Academy of Sciences of the United States of America, 2024, 121(27).
• [3] Yang, Jihai; Xiong, Wei; Li, Shijun; Xu, Chang. Learning structured and non-redundant representations with deep neural networks. Pattern Recognition, 2019, 86: 224-235.
• [4] Sandhaeger, Florian; Siegel, Markus. Testing the generalization of neural representations. NeuroImage, 2023, 278.
• [5] Xia, Donglin; Wang, Xiao; Liu, Nian; Shi, Chuan. Learning Invariant Representations of Graph Neural Networks via Cluster Generalization. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
• [6] Couplet, Edouard; Lambert, Pierre; Verleysen, Michel; Lee, John A.; de Bodt, Cyril. Investigating latent representations and generalization in deep neural networks for tabular data. Neurocomputing, 2024, 597.
• [7] Ito, Takuya; Klinger, Tim; Schultz, Douglas H.; Murray, John D.; Cole, Michael W.; Rigotti, Mattia. Compositional generalization through abstract representations in human and artificial neural networks. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
• [9] Yang, Jianyi; Ren, Shaolei. Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity. International Conference on Machine Learning, Vol 162, 2022.
• [10] Cao, Yuan; Gu, Quanquan. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019, 32.