Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recent research interprets residual networks from the new perspective of an implicit ensemble model. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge that subnets generate during training. In this paper, we mitigate the significant knowledge distillation gap caused by this uniform supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group knowledge based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise lower-level subnet groups. Meanwhile, we develop a subnet sampling strategy that naturally favors larger subnets, which we find to be more helpful than smaller ones in boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency and performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
Pages: 5162-5170
Page count: 9
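
To make the training scheme described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of subnet-in-subnet sampling and group-knowledge supervision. It assumes a residual model that accepts a boolean block_mask for skipping residual blocks (model(x, block_mask=...)); that interface, the logit-averaging aggregation, and all function names are illustrative assumptions rather than the authors' released implementation, which lives in the repository linked above.

import torch
import torch.nn.functional as F

def sample_nested_masks(num_blocks, num_levels, keep_prob=0.8):
    """Subnet-in-subnet sampling: each level's mask is a subset of the
    previous one, so the sampled subnets form nested hierarchical groups,
    and a high keep_prob biases sampling toward larger subnets."""
    masks, mask = [], torch.ones(num_blocks, dtype=torch.bool)
    for _ in range(num_levels):
        drop = torch.rand(num_blocks) > keep_prob  # drop a few more blocks
        mask = mask & ~drop
        mask[0] = True                             # keep at least one block
        masks.append(mask.clone())
    return masks  # masks[0] is the largest (upper-level) subnet

def group_knowledge_step(model, x, y, masks, tau=1.0):
    """One training step: the full network learns from the labels, and each
    smaller subnet is supervised by the averaged (aggregated) logits of all
    larger subnets above it in the hierarchy."""
    logits_full = model(x)                         # full-network forward
    loss = F.cross_entropy(logits_full, y)
    group_logits = [logits_full.detach()]
    for mask in masks:                             # from larger to smaller
        logits = model(x, block_mask=mask)         # assumed masking interface
        # Upper-level group knowledge: average of all larger subnets' logits.
        teacher = torch.stack(group_logits).mean(dim=0)
        loss = loss + (tau ** 2) * F.kl_div(
            F.log_softmax(logits / tau, dim=1),
            F.softmax(teacher / tau, dim=1),
            reduction="batchmean",
        )
        group_logits.append(logits.detach())
    return loss

Averaging detached logits is one simple way to aggregate a group's knowledge; the nested masks guarantee that every teacher group contains only subnets at least as large as the student, matching the paper's upper-to-lower supervision direction.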
Related papers
50 records in total
  • [31] Deep Residual Learning for Boosting the Accuracy of Hyperspectral Pansharpening
    Zheng, Yuxuan
    Li, Jiaojiao
    Li, Yunsong
    Cao, Kailang
    Wang, Keyan
    IEEE Geoscience and Remote Sensing Letters, 2020, 17(8): 1435-1439
  • [32] Functional Gradient Boosting based on Residual Network Perception
    Nitanda, Atsushi
    Suzuki, Taiji
    International Conference on Machine Learning (ICML), Vol. 80, 2018
  • [33] Boosting Offline Reinforcement Learning with Residual Generative Modeling
    Wei, Hua
    Ye, Deheng
    Liu, Zhao
    Wu, Hao
    Yuan, Bo
    Fu, Qiang
    Yang, Wei
    Li, Zhenhui
    Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), 2021: 3574-3580
  • [34] Knowledge and Knowledge Networks
    Lakshmanan, T. R.
    Batten, D. F.
    Annals of Regional Science, 1993, 27(1): 1-3
  • [35] Deep Residual Networks of Residual Networks for Image Super-Resolution
    Wei, Xueqi
    Yang, Fumeng
    Wu, Congzhong
    LiDAR Imaging Detection and Target Recognition 2017, 2017, Vol. 10605
  • [36] Sparse-Group Boosting: Unbiased Group and Variable Selection
    Obster, Fabian
    Heumann, Christian
    The American Statistician, 2024
  • [37] Wide deep residual networks in networks
    Alaeddine, Hmidi
    Jihene, Malek
    Multimedia Tools and Applications, 2023, 82(5): 7889-7899
  • [39] Boosting Simplified Fuzzy Neural Networks
    Natekin, Alexey
    Knoll, Alois
    Engineering Applications of Neural Networks (EANN 2013), Part I, 2013, 383: 330-339
  • [40] An Empirical Analysis of Boosting Deep Networks
    Rambhatla, Sai Saketh
    Jones, Michael J.
    Chellappa, Rama
    2022 International Joint Conference on Neural Networks (IJCNN), 2022