Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recent research has reinterpreted residual networks from a new perspective: as implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of their capacity, and they neglect the valuable knowledge that subnets generate during training. In this paper, we mitigate the significant knowledge distillation gap caused by this uniform supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group knowledge based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet groups. We also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller ones for boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency and performance trade-offs on multiple datasets and network structures. The code is available at https://github.com/tsj-001/AAAI24-GKT.
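To make the described training scheme concrete, here is a minimal, self-contained sketch in PyTorch. It is an illustration built on our own assumptions, not the authors' implementation (see the repository linked above for that): ToyResNet, sample_subnet_chain, group_knowledge_step, the drop-a-quarter sampling schedule, and the running-average knowledge aggregation are all hypothetical choices made for brevity.

    # Hypothetical sketch of group-knowledge subnet training; not the
    # authors' code (see https://github.com/tsj-001/AAAI24-GKT for that).
    import random
    import torch
    import torch.nn.functional as F

    class ToyResNet(torch.nn.Module):
        """A toy residual network whose blocks can be skipped at forward
        time; each subnet is defined by its set of active residual blocks."""
        def __init__(self, dim=64, num_blocks=8, num_classes=10):
            super().__init__()
            self.num_blocks = num_blocks
            self.blocks = torch.nn.ModuleList(
                torch.nn.Sequential(torch.nn.Linear(dim, dim), torch.nn.ReLU())
                for _ in range(num_blocks))
            self.head = torch.nn.Linear(dim, num_classes)

        def forward(self, x, active_blocks=None):
            active = (set(range(self.num_blocks)) if active_blocks is None
                      else set(active_blocks))
            for i, blk in enumerate(self.blocks):
                if i in active:
                    x = x + blk(x)  # skipped blocks reduce to the identity
            return self.head(x)

    def sample_subnet_chain(num_blocks, num_levels):
        """Subnet-in-subnet sampling: each level keeps a subset of the
        previous level's blocks, so the sampled subnets are nested into
        hierarchical groups. Dropping only a quarter of the blocks per
        level biases sampling toward larger subnets."""
        keep = list(range(num_blocks))
        chain = [list(keep)]
        for _ in range(num_levels - 1):
            n_drop = max(1, len(keep) // 4)
            keep = sorted(random.sample(keep, len(keep) - n_drop))
            chain.append(list(keep))
        return chain  # chain[0] is the full network; later entries shrink

    def group_knowledge_step(model, x, y, num_levels=3, tau=4.0):
        """One training step: every subnet gets the ground-truth loss, and
        each lower-level subnet is additionally supervised by the softened
        logits aggregated from the levels above it (the group knowledge)."""
        loss = 0.0
        upper = None  # aggregated upper-group knowledge (probabilities)
        for keep in sample_subnet_chain(model.num_blocks, num_levels):
            logits = model(x, active_blocks=keep)
            loss = loss + F.cross_entropy(logits, y)
            if upper is not None:  # distill upper-group knowledge downward
                loss = loss + tau ** 2 * F.kl_div(
                    F.log_softmax(logits / tau, dim=-1),
                    upper, reduction="batchmean")
            # aggregate this level's softened predictions into the group
            # knowledge via a running average (one plausible choice)
            soft = F.softmax(logits.detach() / tau, dim=-1)
            upper = soft if upper is None else 0.5 * (upper + soft)
        return loss

A hypothetical usage example: model = ToyResNet(); loss = group_knowledge_step(model, torch.randn(32, 64), torch.randint(0, 10, (32,))); loss.backward(). The running average over softened logits is only one plausible way to aggregate group knowledge; the paper's actual grouping and aggregation scheme may differ.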
Pages: 5162-5170 (9 pages)
Related Papers (50 in total)
  • [1] Siu, Chapman. Residual Networks Behave Like Boosting Algorithms. 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA 2019), 2019: 31-40
  • [2] Guo, Zhichun; Zhang, Chunhui; Fan, Yujie; Tian, Yijun; Zhang, Chuxu; Chawla, Nitesh V. Boosting Graph Neural Networks via Adaptive Knowledge Distillation. Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023, 37(6): 7793-7801
  • [3] Rath, Matthias; Condurache, Alexandru Paul. Boosting deep neural networks with geometrical prior knowledge: a survey. Artificial Intelligence Review, 2024, 57(04)
  • [4] Fu, Shipeng; Lai, Zhibing; Zhang, Yulun; Liu, Yiguang; Yang, Xiaomin. Relay knowledge distillation for efficiently boosting the performance of shallow networks. Neurocomputing, 2022, 514: 512-525
  • [5] Sharda, R.; Frankwick, G. L.; Turetken, O. Group Knowledge Networks: A Framework and an Implementation. Information Systems Frontiers, 1999, 1(3): 221-239
  • [6] Tang, S.; Ye, P.; Lin, W.; Chen, T. Enhanced Residual Networks via Mixed Knowledge Fraction. Pattern Recognition and Artificial Intelligence, 2024, 37(04): 328-338
  • [7] Nitanda, Atsushi; Suzuki, Taiji. Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. International Conference on Artificial Intelligence and Statistics, 2020, 108: 2981-2990
  • [8] Zhang, Ke; Sun, Miao; Han, Tony X.; Yuan, Xingfang; Guo, Liru; Liu, Tao. Residual Networks of Residual Networks: Multilevel Residual Networks. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(06): 1303-1314
  • [9] Wang, Jianyu; Liu, Shaohui; Jiang, Feng; Sun, Xiaoshuai; Liu, Yongliang. A Video Post-Filter Deblocking Method Based on Temporal Boosting Residual Networks. 2019 IEEE International Conference on Multimedia and Expo (ICME), 2019: 1174-1179