Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training

Cited by: 0
Authors
Johnson, Rie [1 ]
Zhang, Tong [2 ,3 ]
Affiliations
[1] RJ Res Consulting, New York, NY 11215 USA
[2] HKUST, Hong Kong, Peoples R China
[3] Google Res, Mountain View, CA USA
Keywords
STABILITY;
DOI
Not available
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As deep neural networks are highly expressive, it is important to find solutions with a small generalization gap (the difference between performance on the training data and on unseen data). Focusing on the stochastic nature of training, we first present a theoretical analysis in which the bound on the generalization gap depends on what we call the inconsistency and instability of model outputs, both of which can be estimated on unlabeled data. Our empirical study based on this analysis shows that instability and inconsistency are strongly predictive of the generalization gap in various settings. In particular, our findings indicate that inconsistency is a more reliable indicator of the generalization gap than the sharpness of the loss landscape. Furthermore, we show that algorithmic reduction of inconsistency leads to superior performance. The results also provide a theoretical basis for existing methods such as co-distillation and ensembling.
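The abstract's central claim is that inconsistency and instability of model outputs can be estimated on unlabeled data, without using labels. One natural way to do this is to train several copies of the same architecture that differ only in random seed and measure how much their predictive distributions disagree on held-out unlabeled inputs. The Python/PyTorch sketch below is a hypothetical illustration of that idea; the function name, the choice of pairwise KL divergence as the disagreement measure, and the seed-only setup are assumptions for illustration, and the paper's formal definitions of inconsistency and instability may differ.

import torch
import torch.nn.functional as F

@torch.no_grad()
def estimate_inconsistency(models, unlabeled_loader, device="cpu"):
    # Average pairwise KL divergence between the output distributions of
    # independently trained models, computed on unlabeled inputs only.
    for m in models:
        m.eval()
        m.to(device)
    total, count = 0.0, 0
    for batch in unlabeled_loader:
        x = batch[0] if isinstance(batch, (list, tuple)) else batch
        x = x.to(device)
        # Predictive distributions of every model on the same batch.
        probs = [F.softmax(m(x), dim=-1) for m in models]
        logs = [p.clamp_min(1e-12).log() for p in probs]
        for i in range(len(models)):
            for j in range(len(models)):
                if i == j:
                    continue
                # KL(p_i || p_j), averaged over the examples in the batch.
                kl = (probs[i] * (logs[i] - logs[j])).sum(dim=-1).mean()
                total += kl.item()
                count += 1
    return total / max(count, 1)

Under one plausible reading of the abstract, running this on models trained from different seeds on the same data targets inconsistency, while repeating it with models trained on independently drawn training sets would target instability; a lower estimate would then be expected to accompany a smaller generalization gap.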
Pages: 27
Related papers
50 records in total
  • [41] Fourier Imager Network (FIN): A deep neural network for hologram reconstruction with superior external generalization
    Chen, Hanlong
    Huang, Luzhe
    Liu, Tairan
    Ozcan, Aydogan
    Light: Science & Applications, 2022, 11 (01)
  • [43] Transferring deep convolutional neural network models for generalization mapping of autumn crops
    Zhang F.
    Zhang J.
    Duan Y.
    Yang Z.
    National Remote Sensing Bulletin, 2024, 28 (03) : 661 - 676
  • [44] Train longer, generalize better: closing the generalization gap in large batch training of neural networks
    Hoffer, Elad
    Hubara, Itay
    Soudry, Daniel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [45] ON THE TRAINING AND GENERALIZATION OF DEEP OPERATOR NETWORKS
    Lee, Sanghyun
    Shin, Yeonjong
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2024, 46 (04): : C273 - C296
  • [46] Deep Neural Network Training Method Based on Individual Differences of Training Samples
    Li X.
    Liu M.
    Liu M.-H.
    Jiang Q.
    Cao Y.
    Ruan Jian Xue Bao/Journal of Software, 2022, 33 (12): : 4534 - 4544
  • [47] Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers
    Tsoi, Nathan
    Candon, Kate
    Li, Deyuan
    Milkessa, Yofti
    Vazquez, Marynel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [48] A new perspective for understanding generalization gap of deep neural networks trained with large batch sizes
    Oyedotun, Oyebade K.
    Papadopoulos, Konstantinos
    Aouada, Djamila
    APPLIED INTELLIGENCE, 2023, 53 (12) : 15621 - 15637
  • [50] Principled deep neural network training through linear programming
    Bienstock, Daniel
    Munoz, Gonzalo
    Pokutta, Sebastian
    DISCRETE OPTIMIZATION, 2023, 49