Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training

Cited by: 0
Authors
Johnson, Rie [1 ]
Zhang, Tong [2 ,3 ]
Affiliations
[1] RJ Res Consulting, New York, NY 11215 USA
[2] HKUST, Hong Kong, Peoples R China
[3] Google Res, Mountain View, CA USA
Keywords
STABILITY
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As deep neural networks are highly expressive, it is important to find solutions with a small generalization gap (the difference between performance on the training data and on unseen data). Focusing on the stochastic nature of training, we first present a theoretical analysis in which a bound on the generalization gap depends on what we call the inconsistency and instability of model outputs, both of which can be estimated on unlabeled data. Our empirical study based on this analysis shows that instability and inconsistency are strongly predictive of the generalization gap in various settings. In particular, our findings indicate that inconsistency is a more reliable indicator of the generalization gap than the sharpness of the loss landscape. Furthermore, we show that algorithmic reduction of inconsistency leads to superior performance. The results also provide a theoretical basis for existing methods such as co-distillation and ensembling.
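The abstract notes that inconsistency and instability can be estimated from model outputs on unlabeled data. As a rough illustration only (the paper's exact estimators may differ), the sketch below measures disagreement between the predictive distributions of several training runs that differ only in random seed, using the average pairwise KL divergence on unlabeled inputs; the helper names `softmax` and `mean_pairwise_kl` are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax over class logits."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_pairwise_kl(prob_runs):
    """Average pairwise KL divergence between the predictive
    distributions of independently trained runs, averaged over
    unlabeled examples. This is one plausible disagreement measure,
    not necessarily the paper's exact definition of inconsistency.

    prob_runs: array of shape (num_runs, num_examples, num_classes).
    """
    eps = 1e-12
    p = np.clip(prob_runs, eps, 1.0)
    num_runs = p.shape[0]
    total, pairs = 0.0, 0
    for i in range(num_runs):
        for j in range(num_runs):
            if i == j:
                continue
            # KL(p_i || p_j) per example, summed over classes
            kl = (p[i] * (np.log(p[i]) - np.log(p[j]))).sum(axis=1)
            total += kl.mean()
            pairs += 1
    return total / pairs

# Toy example: random logits standing in for 4 training runs
# evaluated on 1000 unlabeled points with 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 1000, 10))
probs = softmax(logits.reshape(-1, 10)).reshape(4, 1000, 10)
print("estimated inconsistency:", mean_pairwise_kl(probs))
```

Under the paper's framework, a lower value of such a disagreement measure on unlabeled data would be expected to accompany a smaller generalization gap, which is the intuition behind reducing inconsistency algorithmically (e.g., via co-distillation).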
Pages: 27
Related Papers
50 items in total
  • [21] Quantification on the Generalization Performance of Deep Neural Network with Tychonoff Separation Axioms
    Pinto, Linu
    Gopalan, Sasi
    Balasubramaniam, P.
    INFORMATION SCIENCES, 2022, 608 : 262 - 285
  • [22] A deep convolutional neural network for topology optimization with perceptible generalization ability
    Wang, Dalei
    Xiang, Cheng
    Pan, Yue
    Chen, Airong
    Zhou, Xiaoyi
    Zhang, Yiquan
    ENGINEERING OPTIMIZATION, 2022, 54 (06) : 973 - 988
  • [23] Voltage Instability Prediction Using a Deep Recurrent Neural Network
    Hagmar, Hannes
    Tong, Lang
    Eriksson, Robert
    Tuan, Le Anh
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2021, 36 (01) : 17 - 27
  • [24] Characterizing Combustion Instability Using Deep Convolutional Neural Network
    Gangopadhyay, Tryambak
    Locurto, Anthony
    Boor, Paige
    Michael, James B.
    Sarkar, Soumik
    PROCEEDINGS OF THE ASME 11TH ANNUAL DYNAMIC SYSTEMS AND CONTROL CONFERENCE, VOL 1, 2018
  • [25] Towards Deep Neural Network Training on Encrypted Data
    Nandakumar, Karthik
    Ratha, Nalini
    Pankanti, Sharath
    Halevi, Shai
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 40 - 48
  • [26] Dedicated Deep Neural Network Architectures and Methods for Their Training
    Rozycki, P.
    Kolbusz, J.
    Wilamowski, B. M.
    INES 2015 - IEEE 19TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENGINEERING SYSTEMS, 2015, : 73 - 78
  • [27] Universal mean-field upper bound for the generalization gap of deep neural networks
    Ariosto, S.
    Pacelli, R.
    Ginelli, F.
    Gherardi, M.
    Rotondo, P.
    PHYSICAL REVIEW E, 2022, 105 (06)
  • [28] MEMORY REDUCTION METHOD FOR DEEP NEURAL NETWORK TRAINING
    Shirahata, Koichi
    Tomita, Yasumoto
    Ike, Atsushi
    2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2016
  • [29] Distributed Deep Neural Network Training on Edge Devices
    Benditkis, Daniel
    Keren, Aviv
    Mor-Yosef, Liron
    Avidor, Tomer
    Shoham, Neta
    Tal-Israel, Nadav
    SEC'19: PROCEEDINGS OF THE 4TH ACM/IEEE SYMPOSIUM ON EDGE COMPUTING, 2019, : 304 - 306
  • [30] Evolving a Deep Neural Network Training Time Estimator
    Pinel, Frederic
    Yin, Jian-xiong
    Hundt, Christian
    Kieffer, Emmanuel
    Varrette, Sebastien
    Bouvry, Pascal
    See, Simon
    OPTIMIZATION AND LEARNING, 2020, 1173 : 13 - 24