Self-Distillation Amplifies Regularization in Hilbert Space

Cited by: 0
Authors
Mobahi, Hossein [1 ]
Farajtabar, Mehrdad [2 ]
Bartlett, Peter L. [1 ,3 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
[2] DeepMind, Mountain View, CA USA
[3] Univ Calif Berkeley, Dept EECS, Berkeley, CA USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Knowledge distillation, introduced in the deep learning context, is a method for transferring knowledge from one architecture to another. When the two architectures are identical, it is called self-distillation. The idea is to feed the predictions of the trained model back in as new target values for retraining (and possibly iterate this loop a few times). It has been empirically observed that the self-distilled model often achieves higher accuracy on held-out data. Why this happens, however, has been a mystery: the self-distillation dynamics receives no new information about the task and evolves solely by looping over training. To the best of our knowledge, there is no rigorous understanding of why this happens. This work provides the first theoretical analysis of self-distillation. We focus on fitting a nonlinear function to training data, where the model space is a Hilbert space and fitting is subject to ℓ2 regularization in this function space. We show that self-distillation iterations modify regularization by progressively limiting the number of basis functions that can be used to represent the solution. This implies (as we also verify empirically) that while a few rounds of self-distillation may reduce over-fitting, further rounds may lead to under-fitting and thus worse performance.
Pages: 11
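
To make the mechanism described in the abstract concrete, the following is a minimal illustrative sketch (not the authors' code) of iterated self-distillation with kernel ridge regression, i.e. ℓ2-regularized fitting in a reproducing kernel Hilbert space. The toy task, the RBF kernel, and all hyperparameter values are assumptions chosen for illustration; the printed "effective dim" is the effective number of kernel eigendirections retained by the compounded fit after t rounds.

# Minimal illustrative sketch (not the authors' code): iterated self-distillation
# with kernel ridge regression, i.e. L2-regularized fitting in an RKHS.
# The toy task, RBF kernel, and hyperparameters below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression task with noisy targets.
n = 40
x = np.sort(rng.uniform(-1.0, 1.0, size=n))
y = np.sin(3.0 * x) + 0.3 * rng.normal(size=n)

def rbf_kernel(a, b, gamma=10.0):
    # Gaussian (RBF) kernel matrix between 1-D point sets a and b.
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

K = rbf_kernel(x, x)
lam = 1e-2                                 # L2 regularization strength
eigvals = np.linalg.eigvalsh(K)
shrink = eigvals / (eigvals + n * lam)     # per-eigendirection shrinkage of one round

targets = y.copy()
for t in range(1, 6):
    # Kernel ridge regression fit to the current targets.
    alpha = np.linalg.solve(K + n * lam * np.eye(n), targets)
    preds = K @ alpha

    # After t rounds the map from y to the predictions compounds to shrink**t in
    # each kernel eigendirection, so the number of effectively usable basis
    # functions keeps shrinking.
    eff_dim = np.sum(shrink ** t)
    print(f"round {t}: train MSE vs. original y = {np.mean((preds - y) ** 2):.4f}, "
          f"effective dim ~ {eff_dim:.2f}")

    # Self-distillation step: the model's own predictions become the next targets.
    targets = preds

Under these assumptions, the reported effective dimension decreases monotonically with each round, which mirrors the regularization-amplification effect the abstract describes; with many rounds the fit shrinks toward zero, i.e. under-fitting.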