Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent research understands residual networks from a new perspective, as implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge generated by subnets during training. In this paper, we mitigate the significant knowledge-distillation gap caused by using a single kind of supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group-knowledge-based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet groups. Meanwhile, we also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller subnets for boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency-performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
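To make the framework concrete, below is a minimal PyTorch-style sketch of the described training loss, not the authors' implementation (see their repository for that). It assumes a hypothetical residual model whose forward pass accepts a boolean block_mask selecting which residual blocks to keep and which exposes a num_blocks attribute; the nested sampling, logit averaging as "group knowledge", and temperature-scaled distillation are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def sample_nested_masks(num_blocks, num_levels, keep_ratio=0.8):
        # Subnet-in-subnet sampling: each level keeps a random subset of the
        # blocks kept by the level above, so the sampled subnets form a
        # hierarchy of groups (level 0 = the full network).
        masks = [torch.ones(num_blocks, dtype=torch.bool)]
        for _ in range(num_levels - 1):
            keep = masks[-1] & (torch.rand(num_blocks) < keep_ratio)
            keep[0] = True  # always keep the first block so the subnet stays valid
            masks.append(keep)
        return masks

    def group_knowledge_loss(model, x, y, num_levels=3, temperature=4.0):
        masks = sample_nested_masks(model.num_blocks, num_levels)
        logits = [model(x, block_mask=m) for m in masks]  # one subnet per level
        loss = F.cross_entropy(logits[0], y)              # full network: hard labels
        for lvl in range(1, num_levels):
            # Aggregate upper-level group knowledge (here: the mean of all
            # logits from higher levels) and distill it into the lower-level
            # subnet via temperature-scaled KL divergence.
            teacher = torch.stack(logits[:lvl]).mean(dim=0).detach()
            student = F.log_softmax(logits[lvl] / temperature, dim=-1)
            target = F.softmax(teacher / temperature, dim=-1)
            loss = loss + F.kl_div(student, target, reduction="batchmean") * temperature ** 2
        return loss

Because each level's mask is drawn from the blocks kept by the level above, a keep_ratio close to 1 biases sampling toward larger subnets, loosely mirroring the sampling strategy the abstract describes.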
Pages: 5162-5170
Page count: 9