Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI: none available
CLC number: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Recent research interprets residual networks from the new perspective of an implicit ensemble model. From this view, methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge that subnets generate during training. In this paper, we mitigate the significant knowledge distillation gap caused by uniform supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Motivated by this, we propose a group-knowledge-based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet group. We also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller ones in boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency-performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
Pages: 5162-5170 (9 pages)
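The abstract's core mechanisms, subnet-in-subnet sampling (which yields a nested hierarchy of subnets biased toward larger ones) and group knowledge aggregation (averaging subnet predictions so an upper group can supervise a lower one), can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names, the keep-at-least-half sampling bias, and the plain-average aggregation are assumptions made for the example.

```python
import math
import random

def sample_nested_subnets(num_blocks, num_levels, seed=None):
    """Subnet-in-subnet sampling: each lower-level subnet is drawn from
    the blocks of the subnet one level above it, so the sampled subnets
    form a nested hierarchy (full net >= level 1 >= level 2 >= ...).
    Keeping at least half of the parent's blocks at every level is an
    assumed bias, mirroring the paper's preference for larger subnets."""
    rng = random.Random(seed)
    parent = list(range(num_blocks))   # level 0: the full residual network
    hierarchy = [parent]
    for _ in range(num_levels):
        k = rng.randint((len(parent) + 1) // 2, len(parent))
        child = sorted(rng.sample(parent, k))
        hierarchy.append(child)
        parent = child
    return hierarchy

def softmax(xs):
    """Numerically stable softmax over one logit row."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def aggregate_group_knowledge(logit_rows):
    """Group knowledge (assumed form): average the softmax-normalised
    predictions of all subnets in one group; the upper group's average
    can then serve as a soft target for subnets in the group below."""
    probs = [softmax(row) for row in logit_rows]
    return [sum(col) / len(probs) for col in zip(*probs)]
```

In training, one would sample such a hierarchy each step and distill the averaged predictions of each group into the group below it; the sketch only captures the sampling and aggregation structure, not the loss or optimizer.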