Big2Small: Learning from masked image modelling with heterogeneous self-supervised knowledge distillation

Cited by: 0
Authors
Wang, Ziming [1 ]
Han, Shumin [2 ]
Wang, Xiaodi [2 ]
Hao, Jing [2 ]
Cao, Xianbin [1 ]
Zhang, Baochang [3 ,4 ]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, Beijing, Peoples R China
[2] Baidu Com, Beijing, Peoples R China
[3] Beihang Univ, Sch Artificial Intelligence, Beijing, Peoples R China
[4] Beihang Univ, Beijing, Peoples R China
Keywords
artificial intelligence; deep neural network; machine intelligence; machine learning; vision;
DOI
10.1049/csy2.70002
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Small convolutional neural network (CNN)-based models usually require knowledge transferred from a large model before they are deployed on computationally resource-limited edge devices. Masked image modelling (MIM) methods have achieved great success in various visual tasks but remain largely unexplored for knowledge distillation between heterogeneous deep models, mainly because of the significant discrepancy between transformer-based large models and CNN-based small networks. In this paper, the authors develop the first heterogeneous self-supervised knowledge distillation (HSKD) method based on MIM, which efficiently transfers knowledge from large transformer models to small CNN-based models in a self-supervised fashion. The method bridges transformer-based models and CNNs by training a UNet-style student with sparse convolution, which effectively mimics the visual representation inferred by the teacher over masked modelling. It is a simple yet effective paradigm for learning the visual representation and data distribution from heterogeneous teacher models, which can be pre-trained using advanced self-supervised methods. Extensive experiments show that HSKD adapts well to various models and sizes, consistently achieving state-of-the-art performance on image classification, object detection, and semantic segmentation tasks. For example, on the ImageNet-1K dataset, HSKD improves the accuracy of ResNet-50 (sparse) from 76.98% to 80.01%.
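To make the abstract's idea concrete, below is a minimal PyTorch sketch of MIM-based heterogeneous distillation as the abstract describes it: a frozen transformer teacher encodes the full image, a UNet-style CNN student sees a masked image, and the student is trained to match teacher features at masked positions. This is an illustrative assumption of the general recipe, not the paper's implementation; all names (ToyTransformerTeacher, UNetStudent, PATCH, MASK_RATIO, distill_step) are hypothetical, the toy transformer stands in for a large self-supervised teacher, and ordinary dense convolutions over zeroed-out patches stand in for the sparse convolutions used in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

PATCH = 16        # assumed patch size (not specified in the abstract)
MASK_RATIO = 0.6  # assumed mask ratio

class ToyTransformerTeacher(nn.Module):
    """Stand-in for a large self-supervised transformer teacher."""
    def __init__(self, dim=192, depth=4, heads=3):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=PATCH, stride=PATCH)
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    @torch.no_grad()  # the teacher stays frozen during distillation
    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.encoder(tokens)

class UNetStudent(nn.Module):
    """UNet-style CNN student; dense convs stand in for the paper's sparse convs."""
    def __init__(self, dim=192):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU())     # 224 -> 112
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())    # -> 56
        self.enc3 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())   # -> 28
        self.enc4 = nn.Sequential(nn.Conv2d(128, 128, 3, 2, 1), nn.ReLU())  # -> 14
        self.dec = nn.Sequential(nn.ConvTranspose2d(128, 128, 2, 2), nn.ReLU())  # 14 -> 28
        self.proj = nn.Conv2d(256, dim, 3, 2, 1)  # fuse skip, back to the 14x14 token grid

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        e4 = self.enc4(e3)
        d = self.dec(e4)                          # 28x28, matches e3 (skip connection)
        f = self.proj(torch.cat([d, e3], dim=1))  # (B, dim, 14, 14)
        return f.flatten(2).transpose(1, 2)       # (B, N, dim), aligned with teacher tokens

def random_patch_mask(batch, n_patches, ratio):
    """Boolean mask per sample: True marks a masked patch."""
    k = int(n_patches * ratio)
    ids = torch.rand(batch, n_patches).argsort(dim=1)[:, :k]
    mask = torch.zeros(batch, n_patches, dtype=torch.bool)
    return mask.scatter_(1, ids, True)

def distill_step(teacher, student, images):
    B, _, H, W = images.shape
    grid = H // PATCH
    mask = random_patch_mask(B, grid * grid, MASK_RATIO)  # (B, N)
    # Zero out masked patches in the student input; a real sparse-conv student
    # would skip these regions entirely instead of convolving over zeros.
    pix = F.interpolate(mask.view(B, 1, grid, grid).float(),
                        scale_factor=PATCH, mode="nearest")
    target = teacher(images)            # teacher sees the full image
    pred = student(images * (1 - pix))  # student sees the masked image
    # MIM-style distillation: match teacher features at masked positions only.
    return F.mse_loss(pred[mask], target[mask])

# Dummy training step to show the moving parts fit together.
teacher, student = ToyTransformerTeacher(), UNetStudent()
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)
loss = distill_step(teacher, student, torch.randn(2, 3, 224, 224))
loss.backward()
opt.step()
print(f"distillation loss: {loss.item():.4f}")

Restricting the loss to masked token positions is the usual MIM-style target; the paper's actual losses, mask ratio, feature-alignment layers, and sparse-convolution backbone may differ from this sketch.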
Pages: 11