Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training

Cited by: 0
Authors
Johnson, Rie [1 ]
Zhang, Tong [2 ,3 ]
Affiliations
[1] RJ Res Consulting, New York, NY 11215 USA
[2] HKUST, Hong Kong, Peoples R China
[3] Google Res, Mountain View, CA USA
Keywords
STABILITY;
DOI
Not available
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As deep neural networks are highly expressive, it is important to find solutions with a small generalization gap (the difference between performance on the training data and on unseen data). Focusing on the stochastic nature of training, we first present a theoretical analysis in which the bound on the generalization gap depends on what we call the inconsistency and instability of model outputs, both of which can be estimated on unlabeled data. Our empirical study based on this analysis shows that instability and inconsistency are strongly predictive of the generalization gap in various settings. In particular, our findings indicate that inconsistency is a more reliable indicator of the generalization gap than the sharpness of the loss landscape. Furthermore, we show that algorithmic reduction of inconsistency leads to superior performance. The results also provide a theoretical basis for existing methods such as co-distillation and ensembling.
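The abstract's central claim is that inconsistency and instability of model outputs can be estimated on unlabeled data, without using labels. One natural way to do this is to train several copies of the same architecture that differ only in random seed and measure how much their predictive distributions disagree on held-out unlabeled inputs. The Python/PyTorch sketch below is a hypothetical illustration of that idea; the function name, the choice of pairwise KL divergence as the disagreement measure, and the seed-only setup are assumptions for illustration, and the paper's formal definitions of inconsistency and instability may differ.

import torch
import torch.nn.functional as F

@torch.no_grad()
def estimate_inconsistency(models, unlabeled_loader, device="cpu"):
    # Average pairwise KL divergence between the output distributions of
    # independently trained models, computed on unlabeled inputs only.
    for m in models:
        m.eval()
        m.to(device)
    total, count = 0.0, 0
    for batch in unlabeled_loader:
        x = batch[0] if isinstance(batch, (list, tuple)) else batch
        x = x.to(device)
        # Predictive distributions of every model on the same batch.
        probs = [F.softmax(m(x), dim=-1) for m in models]
        logs = [p.clamp_min(1e-12).log() for p in probs]
        for i in range(len(models)):
            for j in range(len(models)):
                if i == j:
                    continue
                # KL(p_i || p_j), averaged over the examples in the batch.
                kl = (probs[i] * (logs[i] - logs[j])).sum(dim=-1).mean()
                total += kl.item()
                count += 1
    return total / max(count, 1)

Under one plausible reading of the abstract, running this on models trained from different seeds on the same data targets inconsistency, while repeating it with models trained on independently drawn training sets would target instability; a lower estimate would then be expected to accompany a smaller generalization gap.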
Pages: 27
Related papers
50 records in total
  • [41] Fourier Imager Network (FIN): A deep neural network for hologram reconstruction with superior external generalization
    Chen, Hanlong
    Huang, Luzhe
    Liu, Tairan
    Ozcan, Aydogan
    Light: Science & Applications, 2022, 11 (01)
  • [43] Transferring deep convolutional neural network models for generalization mapping of autumn crops
    Zhang F.
    Zhang J.
    Duan Y.
    Yang Z.
    National Remote Sensing Bulletin, 2024, 28 (03) : 661 - 676
  • [44] Train longer, generalize better: closing the generalization gap in large batch training of neural networks
    Hoffer, Elad
    Hubara, Itay
    Soudry, Daniel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [45] ON THE TRAINING AND GENERALIZATION OF DEEP OPERATOR NETWORKS
    Lee, Sanghyun
    Shin, Yeonjong
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2024, 46 (04): : C273 - C296
  • [46] Deep Neural Network Training Method Based on Individual Differences of Training Samples
    Li X.
    Liu M.
    Liu M.-H.
    Jiang Q.
    Cao Y.
    Ruan Jian Xue Bao/Journal of Software, 2022, 33 (12): : 4534 - 4544
  • [47] Bridging the Gap: Unifying the Training and Evaluation of Neural Network Binary Classifiers
    Tsoi, Nathan
    Candon, Kate
    Li, Deyuan
    Milkessa, Yofti
    Vazquez, Marynel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [48] A new perspective for understanding generalization gap of deep neural networks trained with large batch sizes
    Oyedotun, Oyebade K.
    Papadopoulos, Konstantinos
    Aouada, Djamila
    APPLIED INTELLIGENCE, 2023, 53 (12) : 15621 - 15637
  • [50] Principled deep neural network training through linear programming
    Bienstock, Daniel
    Munoz, Gonzalo
    Pokutta, Sebastian
    DISCRETE OPTIMIZATION, 2023, 49