Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

Cited by: 13
Authors
Asif, Umar [1 ]
Tang, Jianbin [1 ]
Harrer, Stefan [1 ]
Affiliations
[1] IBM Res Australia, Southbank, Vic, Australia
DOI
10.3233/FAIA200188
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble models comprising deep Convolutional Neural Networks (CNNs) have shown significant improvements in model generalization, but at the cost of large computation and memory requirements. In this paper, we present a framework for learning compact CNN models with improved classification performance and model generalization. To this end, we propose a compact student architecture with parallel branches that are trained using ground-truth labels and information from high-capacity teacher networks in an ensemble learning fashion. Our framework provides two main benefits: i) distilling knowledge from different teachers into the student network promotes heterogeneity in the features learned at different branches and enables the network to learn diverse solutions to the target problem; ii) coupling the branches of the student network through ensembling encourages collaboration and improves the quality of the final predictions by reducing variance in the network outputs. Experiments on the well-established CIFAR-10 and CIFAR-100 datasets show that our Ensemble Knowledge Distillation (EKD) improves classification accuracy and model generalization, especially in situations with limited training data. Experiments also show that our EKD-based compact networks achieve higher mean accuracy on the test datasets than other knowledge-distillation-based methods.
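The abstract describes a compact student with parallel branches, each guided by a different high-capacity teacher, whose branch outputs are ensembled into the final prediction. Below is a minimal PyTorch sketch of that idea only; the trunk and branch layers, loss weights (alpha), and temperature (T) are illustrative assumptions, not the architecture or hyperparameters reported in the paper.

```python
# Minimal sketch (not the authors' code): a compact student with parallel
# branches, each distilled from a different teacher; branch logits are
# averaged to form the ensemble prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchedStudent(nn.Module):
    """Shared trunk followed by parallel lightweight branches (illustrative layers)."""
    def __init__(self, num_classes=10, num_branches=2):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
            )
            for _ in range(num_branches)
        ])

    def forward(self, x):
        h = self.trunk(x)
        logits = [branch(h) for branch in self.branches]    # one prediction per branch
        return logits, torch.stack(logits).mean(dim=0)      # ensemble = mean of branch logits

def ekd_loss(branch_logits, ensemble_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy on ground truth plus KL distillation from one teacher per branch.
    T and alpha are assumed hyperparameters for this sketch."""
    loss = F.cross_entropy(ensemble_logits, labels)
    for s_logits, t_logits in zip(branch_logits, teacher_logits):
        loss = loss + (1 - alpha) * F.cross_entropy(s_logits, labels)
        loss = loss + alpha * T * T * F.kl_div(
            F.log_softmax(s_logits / T, dim=1),
            F.softmax(t_logits / T, dim=1),
            reduction="batchmean",
        )
    return loss

# Example usage with two frozen teachers (t1, t2), each guiding one branch:
#   student = BranchedStudent(num_classes=10, num_branches=2)
#   branch_logits, ensemble = student(images)
#   loss = ekd_loss(branch_logits, ensemble,
#                   [t1(images).detach(), t2(images).detach()], labels)
```

Averaging the branch logits is one simple way to realize the "coupling through ensembling" described above; the paper may combine branch outputs differently.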
Pages: 953-960
Number of pages: 8
Related Papers
50 records in total
  • [21] Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning
    Shen, Jiyuan
    Yang, Wenzhuo
    Chu, Zhaowei
    Fan, Jiani
    Niyato, Dusit
    Lam, Kwok-Yan
    ICC 2024 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2024, : 2034 - 2039
  • [22] Leveraging different learning styles for improved knowledge distillation in biomedical imaging
    Niyaz, Usma
    Sambyal, Abhishek Singh
    Bathula, Deepti R.
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 168
  • [24] Knowledge Distillation by On-the-Fly Native Ensemble
    Lan, Xu
    Zhu, Xiatian
    Gong, Shaogang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [25] Ensemble Compressed Language Model Based on Knowledge Distillation and Multi-Task Learning
    Xiang, Kun
    Fujii, Akihiro
    2022 7TH INTERNATIONAL CONFERENCE ON BUSINESS AND INDUSTRIAL RESEARCH (ICBIR2022), 2022, : 72 - 77
  • [26] Deep Ensemble Learning by Diverse Knowledge Distillation for Fine-Grained Object Classification
    Okamoto, Naoki
    Hirakawa, Tsubasa
    Yamashita, Takayoshi
    Fujiyoshi, Hironobu
    COMPUTER VISION, ECCV 2022, PT XI, 2022, 13671 : 502 - 518
  • [27] Improved Feature Distillation via Projector Ensemble
    Chen, Yudong
    Wang, Sen
    Liu, Jiajun
    Xu, Xuwei
    de Hoog, Frank
    Huang, Zi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [28] Energy-Efficient Federated Knowledge Distillation Learning in Internet of Drones
    Cal, Semih
    Sun, Xiang
    Yao, Jingjing
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 1256 - 1261
  • [29] An Efficient and Robust Cloud-Based Deep Learning With Knowledge Distillation
    Tao, Zeyi
    Xia, Qi
    Cheng, Songqing
    Li, Qun
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (02) : 1733 - 1745
  • [30] Deep Knowledge Distillation Learning for Efficient Wearable Data Mining on the Edge
    Wong, Junhua
    Zhang, Qingxue
    2023 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, ICCE, 2023,