Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent research understands residual networks from a new perspective, as implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge generated by subnets during training. In this paper, we mitigate the significant knowledge-distillation gap caused by using a single kind of supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group-knowledge-based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet groups. Meanwhile, we also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller subnets for boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency-performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
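To make the framework concrete, below is a minimal PyTorch-style sketch of the described training loss, not the authors' implementation (see their repository for that). It assumes a hypothetical residual model whose forward pass accepts a boolean block_mask selecting which residual blocks to keep and which exposes a num_blocks attribute; the nested sampling, logit averaging as "group knowledge", and temperature-scaled distillation are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def sample_nested_masks(num_blocks, num_levels, keep_ratio=0.8):
        # Subnet-in-subnet sampling: each level keeps a random subset of the
        # blocks kept by the level above, so the sampled subnets form a
        # hierarchy of groups (level 0 = the full network).
        masks = [torch.ones(num_blocks, dtype=torch.bool)]
        for _ in range(num_levels - 1):
            keep = masks[-1] & (torch.rand(num_blocks) < keep_ratio)
            keep[0] = True  # always keep the first block so the subnet stays valid
            masks.append(keep)
        return masks

    def group_knowledge_loss(model, x, y, num_levels=3, temperature=4.0):
        masks = sample_nested_masks(model.num_blocks, num_levels)
        logits = [model(x, block_mask=m) for m in masks]  # one subnet per level
        loss = F.cross_entropy(logits[0], y)              # full network: hard labels
        for lvl in range(1, num_levels):
            # Aggregate upper-level group knowledge (here: the mean of all
            # logits from higher levels) and distill it into the lower-level
            # subnet via temperature-scaled KL divergence.
            teacher = torch.stack(logits[:lvl]).mean(dim=0).detach()
            student = F.log_softmax(logits[lvl] / temperature, dim=-1)
            target = F.softmax(teacher / temperature, dim=-1)
            loss = loss + F.kl_div(student, target, reduction="batchmean") * temperature ** 2
        return loss

Because each level's mask is drawn from the blocks kept by the level above, a keep_ratio close to 1 biases sampling toward larger subnets, loosely mirroring the sampling strategy the abstract describes.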
Pages: 5162-5170
Page count: 9