Boosting Residual Networks with Group Knowledge

Cited: 0
Authors
Tang, Shengji [1 ]
Ye, Peng [1 ]
Li, Baopu
Lin, Weihao [1 ]
Chen, Tao [1 ]
He, Tong [3 ]
Yu, Chong [2 ]
Ouyang, Wanli [3 ]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[3] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI: none available
CLC number: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Recent research interprets residual networks from the new perspective of an implicit ensemble model. From this view, methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge that subnets generate during training. In this paper, we mitigate the significant knowledge distillation gap caused by uniform supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Motivated by this, we propose a group-knowledge-based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of the subnets in each group during training, and exploit upper-level group knowledge to supervise the lower-level subnet group. We also develop a subnet sampling strategy that naturally favors larger subnets, which we find more helpful than smaller ones in boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency-performance trade-offs on multiple datasets and network structures. The code is at https://github.com/tsj-001/AAAI24-GKT.
Pages: 5162-5170 (9 pages)
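The abstract's core mechanisms, subnet-in-subnet sampling (which yields a nested hierarchy of subnets biased toward larger ones) and group knowledge aggregation (averaging subnet predictions so an upper group can supervise a lower one), can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names, the keep-at-least-half sampling bias, and the plain-average aggregation are assumptions made for the example.

```python
import math
import random

def sample_nested_subnets(num_blocks, num_levels, seed=None):
    """Subnet-in-subnet sampling: each lower-level subnet is drawn from
    the blocks of the subnet one level above it, so the sampled subnets
    form a nested hierarchy (full net >= level 1 >= level 2 >= ...).
    Keeping at least half of the parent's blocks at every level is an
    assumed bias, mirroring the paper's preference for larger subnets."""
    rng = random.Random(seed)
    parent = list(range(num_blocks))   # level 0: the full residual network
    hierarchy = [parent]
    for _ in range(num_levels):
        k = rng.randint((len(parent) + 1) // 2, len(parent))
        child = sorted(rng.sample(parent, k))
        hierarchy.append(child)
        parent = child
    return hierarchy

def softmax(xs):
    """Numerically stable softmax over one logit row."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def aggregate_group_knowledge(logit_rows):
    """Group knowledge (assumed form): average the softmax-normalised
    predictions of all subnets in one group; the upper group's average
    can then serve as a soft target for subnets in the group below."""
    probs = [softmax(row) for row in logit_rows]
    return [sum(col) / len(probs) for col in zip(*probs)]
```

In training, one would sample such a hierarchy each step and distill the averaged predictions of each group into the group below it; the sketch only captures the sampling and aggregation structure, not the loss or optimizer.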