Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recent research has reinterpreted residual networks from a new perspective: as implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of their capacity, and they neglect the valuable knowledge that subnets generate during training. In this paper, we mitigate the significant knowledge distillation gap caused by this uniform supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group knowledge based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet groups. We also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller ones for boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency and performance trade-offs on multiple datasets and network structures. The code is available at https://github.com/tsj-001/AAAI24-GKT.
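To make the described training scheme concrete, here is a minimal, self-contained sketch in PyTorch. It is an illustration built on our own assumptions, not the authors' implementation (see the repository linked above for that): ToyResNet, sample_subnet_chain, group_knowledge_step, the drop-a-quarter sampling schedule, and the running-average knowledge aggregation are all hypothetical choices made for brevity.

    # Hypothetical sketch of group-knowledge subnet training; not the
    # authors' code (see https://github.com/tsj-001/AAAI24-GKT for that).
    import random
    import torch
    import torch.nn.functional as F

    class ToyResNet(torch.nn.Module):
        """A toy residual network whose blocks can be skipped at forward
        time; each subnet is defined by its set of active residual blocks."""
        def __init__(self, dim=64, num_blocks=8, num_classes=10):
            super().__init__()
            self.num_blocks = num_blocks
            self.blocks = torch.nn.ModuleList(
                torch.nn.Sequential(torch.nn.Linear(dim, dim), torch.nn.ReLU())
                for _ in range(num_blocks))
            self.head = torch.nn.Linear(dim, num_classes)

        def forward(self, x, active_blocks=None):
            active = (set(range(self.num_blocks)) if active_blocks is None
                      else set(active_blocks))
            for i, blk in enumerate(self.blocks):
                if i in active:
                    x = x + blk(x)  # skipped blocks reduce to the identity
            return self.head(x)

    def sample_subnet_chain(num_blocks, num_levels):
        """Subnet-in-subnet sampling: each level keeps a subset of the
        previous level's blocks, so the sampled subnets are nested into
        hierarchical groups. Dropping only a quarter of the blocks per
        level biases sampling toward larger subnets."""
        keep = list(range(num_blocks))
        chain = [list(keep)]
        for _ in range(num_levels - 1):
            n_drop = max(1, len(keep) // 4)
            keep = sorted(random.sample(keep, len(keep) - n_drop))
            chain.append(list(keep))
        return chain  # chain[0] is the full network; later entries shrink

    def group_knowledge_step(model, x, y, num_levels=3, tau=4.0):
        """One training step: every subnet gets the ground-truth loss, and
        each lower-level subnet is additionally supervised by the softened
        logits aggregated from the levels above it (the group knowledge)."""
        loss = 0.0
        upper = None  # aggregated upper-group knowledge (probabilities)
        for keep in sample_subnet_chain(model.num_blocks, num_levels):
            logits = model(x, active_blocks=keep)
            loss = loss + F.cross_entropy(logits, y)
            if upper is not None:  # distill upper-group knowledge downward
                loss = loss + tau ** 2 * F.kl_div(
                    F.log_softmax(logits / tau, dim=-1),
                    upper, reduction="batchmean")
            # aggregate this level's softened predictions into the group
            # knowledge via a running average (one plausible choice)
            soft = F.softmax(logits.detach() / tau, dim=-1)
            upper = soft if upper is None else 0.5 * (upper + soft)
        return loss

A hypothetical usage example: model = ToyResNet(); loss = group_knowledge_step(model, torch.randn(32, 64), torch.randint(0, 10, (32,))); loss.backward(). The running average over softened logits is only one plausible way to aggregate group knowledge; the paper's actual grouping and aggregation scheme may differ.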
Pages: 5162-5170 (9 pages)
Related Papers (50 in total)
  • [1] Siu, Chapman. Residual Networks Behave Like Boosting Algorithms. 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA 2019), 2019: 31-40
  • [2] Guo, Zhichun; Zhang, Chunhui; Fan, Yujie; Tian, Yijun; Zhang, Chuxu; Chawla, Nitesh V. Boosting Graph Neural Networks via Adaptive Knowledge Distillation. Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023, 37(6): 7793-7801
  • [3] Rath, Matthias; Condurache, Alexandru Paul. Boosting deep neural networks with geometrical prior knowledge: a survey. Artificial Intelligence Review, 2024, 57(04)
  • [4] Fu, Shipeng; Lai, Zhibing; Zhang, Yulun; Liu, Yiguang; Yang, Xiaomin. Relay knowledge distillation for efficiently boosting the performance of shallow networks. Neurocomputing, 2022, 514: 512-525
  • [5] Sharda, R.; Frankwick, G. L.; Turetken, O. Group Knowledge Networks: A Framework and an Implementation. Information Systems Frontiers, 1999, 1(3): 221-239
  • [6] Tang, S.; Ye, P.; Lin, W.; Chen, T. Enhanced Residual Networks via Mixed Knowledge Fraction. Pattern Recognition and Artificial Intelligence, 2024, 37(04): 328-338
  • [7] Nitanda, Atsushi; Suzuki, Taiji. Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. International Conference on Artificial Intelligence and Statistics, 2020, 108: 2981-2990
  • [8] Zhang, Ke; Sun, Miao; Han, Tony X.; Yuan, Xingfang; Guo, Liru; Liu, Tao. Residual Networks of Residual Networks: Multilevel Residual Networks. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(06): 1303-1314
  • [9] Wang, Jianyu; Liu, Shaohui; Jiang, Feng; Sun, Xiaoshuai; Liu, Yongliang. A Video Post-Filter Deblocking Method Based on Temporal Boosting Residual Networks. 2019 IEEE International Conference on Multimedia and Expo (ICME), 2019: 1174-1179