Model Selection - Knowledge Distillation Framework for Model Compression

Cited by: 0
Authors
Chen, Renhai [1 ]
Yuan, Shimin [1 ]
Wang, Shaobo [1 ]
Li, Zhenghan [1 ]
Xing, Meng [1 ]
Feng, Zhiyong [1 ]
Affiliations
[1] Tianjin Univ, Shenzhen Res Inst, Coll Intelligence & Comp, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
model selection; model compression; knowledge distillation;
DOI
10.1109/SSCI50451.2021.9659861
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While CNNs have achieved remarkable success in a wide range of applications, the significant increase in their computation and parameter storage costs restricts their deployment on edge devices. Therefore, many neural network pruning methods have been proposed for network compression and acceleration. However, these methods have two major limitations. First, prevailing methods usually design a single pruning criterion for the primitive network and fail to consider the diversity of potentially optimal sub-network structures. Second, these methods train the resulting sub-network with conventional training procedures, which is not sufficient to develop the expressive ability of the sub-network for the task at hand. In this paper, we propose the Model Selection - Knowledge Distillation (MS-KD) framework to address these problems. Specifically, we develop multiple pruning criteria for the primitive network and obtain the potentially optimal structure through model selection. Furthermore, instead of conventional training, we use knowledge distillation to train the learned sub-network and make full use of its structural advantages. To validate our approach, we conduct extensive experiments on prevalent image classification datasets. The results demonstrate that our MS-KD framework outperforms existing methods across a wide range of datasets, models, and inference costs.
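The abstract describes a two-stage pipeline: prune the primitive network under several criteria, select the best-scoring sub-network (model selection), then fine-tune it with knowledge distillation from the original network. The sketch below is a minimal, hypothetical illustration of that pipeline in PyTorch; the pruning criteria (L1-magnitude and random), the selection score (validation accuracy), and the helper names (prune_l1, kd_loss, ms_kd) are assumptions made for illustration, not the authors' implementation.

# Minimal, hypothetical sketch of an MS-KD-style pipeline (PyTorch).
# The criteria, selection score, and function names are illustrative
# assumptions, not the paper's released code.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

def prune_l1(model, amount=0.5):
    """Candidate criterion 1: L1-magnitude pruning of every Conv2d layer."""
    cand = copy.deepcopy(model)
    for m in cand.modules():
        if isinstance(m, nn.Conv2d):
            prune.l1_unstructured(m, name="weight", amount=amount)
    return cand

def prune_random(model, amount=0.5):
    """Candidate criterion 2: random pruning, included for structural diversity."""
    cand = copy.deepcopy(model)
    for m in cand.modules():
        if isinstance(m, nn.Conv2d):
            prune.random_unstructured(m, name="weight", amount=amount)
    return cand

@torch.no_grad()
def accuracy(model, loader):
    """Validation top-1 accuracy, used here as the model-selection score."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Knowledge-distillation loss: softened teacher targets plus hard-label CE."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

def ms_kd(teacher, val_loader, train_loader, epochs=1, lr=1e-3):
    """Stage 1: model selection over pruning criteria. Stage 2: KD fine-tuning."""
    candidates = [prune_l1(teacher), prune_random(teacher)]
    student = max(candidates, key=lambda m: accuracy(m, val_loader))
    opt = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        student.train()
        for x, y in train_loader:
            with torch.no_grad():
                t_logits = teacher(x)
            loss = kd_loss(student(x), t_logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student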
Pages: 6
Related Papers
50 records in total
  • [31] AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
    Wu, Siyue
    Chen, Hongzhan
    Quan, Xiaojun
    Wang, Qifan
    Wang, Rui
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 8449 - 8465
  • [32] HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression
    Dong, Chenhe
    Li, Yaliang
    Shen, Ying
    Qiu, Minghui
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3126 - 3136
  • [33] Knowledge Distillation Based on Pruned Model
    Liu, Cailing
    Zhang, Hongyi
    Chen, Deyi
    BLOCKCHAIN AND TRUSTWORTHY SYSTEMS, BLOCKSYS 2019, 2020, 1156 : 598 - 603
  • [34] On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework
    Liu, Sicong
    Lin, Yingyan
    Zhou, Zimu
    Nan, Kaiming
    Liu, Hui
    Du, Junzhao
    MOBISYS'18: PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON MOBILE SYSTEMS, APPLICATIONS, AND SERVICES, 2018, : 389 - 400
  • [35] A hybrid model compression approach via knowledge distillation for predicting energy consumption in additive manufacturing
    Li, Yixin
    Hu, Fu
    Liu, Ying
    Ryan, Michael
    Wang, Ray
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2023, 61 (13) : 4525 - 4547
  • [36] Contrastive adversarial knowledge distillation for deep model compression in time-series regression tasks
    Xu, Qing
    Chen, Zhenghua
    Ragab, Mohamed
    Wang, Chao
    Wu, Min
    Li, Xiaoli
    NEUROCOMPUTING, 2022, 485 : 242 - 251
  • [37] Data-Free Ensemble Knowledge Distillation for Privacy-conscious Multimedia Model Compression
    Hao, Zhiwei
    Luo, Yong
    Hu, Han
    An, Jianping
    Wen, Yonggang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1803 - 1811
  • [38] Knowledge distillation for object detection with diffusion model
    Zhang, Yi
    Long, Junzong
    Li, Chunrui
    NEUROCOMPUTING, 2025, 636
  • [39] Efficient Knowledge Distillation from Model Checkpoints
    Wang, Chaofei
    Yang, Qisen
    Huang, Rui
    Song, Shiji
    Huang, Gao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [40] Contrastive Distillation on Intermediate Representations for Language Model Compression
    Sun, Siqi
    Gan, Zhe
    Cheng, Yu
    Fang, Yuwei
    Wang, Shuohang
    Liu, Jingjing
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 498 - 508