Enhancing the Accuracy of an Image Classification Model Using Cross-Modality Transfer Learning

Cited by: 0

Authors:

Liu, Jiaqi [1]

Chui, Kwok Tai [1]

Lee, Lap-Kei [1]

Affiliations:

[1] Hong Kong Metropolitan Univ, Sch Sci & Technol, Dept Elect Engn & Comp Sci, Hong Kong, Peoples R China

Source:

ELECTRONICS | 2023 / Vol. 12 / Issue 15

Keywords:

batch size; cross-modality; deep learning; image classification; learning rate; overfitting; text classification; transfer learning

DOI:

10.3390/electronics12153316

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Applying deep learning (DL) algorithms to image classification tasks becomes more challenging when training data are insufficient. Transfer learning (TL) has been proposed to address this problem. In theory, TL requires only a small amount of knowledge to be transferred to the target task, but traditional TL often requires the source and target domains to share the same or similar features. Cross-modality transfer learning (CMTL) relaxes this requirement by learning knowledge in a source domain completely different from the target domain, often one with a large amount of data, which helps the model learn more features. Most existing research on CMTL has focused on image-to-image transfer. In this paper, the CMTL problem is formulated from the text domain to the image domain. Our study started by training two separately pre-trained models in the text and image domains to obtain the network structure. The knowledge of the two pre-trained models was transferred via CMTL to obtain a new hybrid model (combining the BERT and BEiT models). Next, GridSearchCV and 5-fold cross-validation were used to identify the most suitable combination of hyperparameters (batch size and learning rate) and optimizer (SGDM or ADAM) for our model. To evaluate their impact, 48 two-tuples of hyperparameters (batch size, learning rate) and two well-known optimizers were used. The performance evaluation metrics were validation accuracy, F1-score, precision, and recall. The ablation study confirms that the hybrid model enhanced accuracy by 12.8% compared with the original BEiT model. In addition, the results show that these two hyperparameters can significantly impact model performance.
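The hyperparameter search described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library in place of GridSearchCV, and the grid values are hypothetical (the abstract gives only the count of 48 two-tuples, assumed here to be 6 batch sizes times 8 learning rates). The names `k_fold_indices`, `grid_search`, and `placeholder_evaluate` are invented for illustration, and the placeholder scorer performs no training and produces no real results.

```python
from itertools import product
from statistics import mean

# Hypothetical grid (assumption): 6 batch sizes x 8 learning rates = 48 two-tuples.
BATCH_SIZES = [8, 16, 32, 64, 128, 256]
LEARNING_RATES = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2]
OPTIMIZERS = ["SGDM", "ADAM"]


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def grid_search(evaluate, n_samples, k=5):
    """Return (best_config, best_mean_score) over every grid point.

    `evaluate(config, train_idx, val_idx)` is supplied by the caller; it should
    train on train_idx and return a validation score measured on val_idx.
    """
    best_config, best_score = None, float("-inf")
    for batch_size, lr, opt in product(BATCH_SIZES, LEARNING_RATES, OPTIMIZERS):
        config = {"batch_size": batch_size, "learning_rate": lr, "optimizer": opt}
        scores = [evaluate(config, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        fold_mean = mean(scores)
        if fold_mean > best_score:  # strict: the first config wins a tie
            best_config, best_score = config, fold_mean
    return best_config, best_score


def placeholder_evaluate(config, train_idx, val_idx):
    """Stand-in scorer that does no training; replace with real model fitting."""
    return -abs(config["learning_rate"] - 1e-4) - abs(config["batch_size"] - 32) / 1000


best, score = grid_search(placeholder_evaluate, n_samples=100)
print(best)
```

With two optimizers, the assumed grid yields 96 configurations, each scored as the mean over 5 folds. In practice the `evaluate` callable would fine-tune the hybrid model and return one of the reported metrics, such as validation accuracy.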

Citations

Pages: 26

50 items in total

[1] Coral Classification Using DenseNet and Cross-modality Transfer Learning
Xu, Lian
Bennamoun, Mohammed
Boussaid, Farid
Ana, Senjian
Sohel, Ferdous
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
[2] Cross-Modality Contrastive Learning for Hyperspectral Image Classification
Hang, Renlong
Qian, Xuwei
Liu, Qingshan
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[3] Representation Learning for Cross-Modality Classification
van Tulder, Gijs
de Bruijne, Marleen
MEDICAL COMPUTER VISION AND BAYESIAN AND GRAPHICAL MODELS FOR BIOMEDICAL IMAGING, 2017, 10081 : 126 - 136
[4] LOCAL CROSS-MODALITY IMAGE ALIGNMENT USING UNSUPERVISED LEARNING
BERNANDER, O
KOCH, C
LECTURE NOTES IN COMPUTER SCIENCE, 1990, 427 : 573 - 575
[5] Cross-Modality Transfer Learning for Image-Text Information Management
Niu, Shuteng
Jiang, Yushan
Chen, Bowen
Wang, Jian
Liu, Yongxin
Song, Houbing
ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS, 2022, 13 (01)
[6] Cross-Organ, Cross-Modality Transfer Learning: Feasibility Study for Segmentation and Classification
Lee, Juhun
Nishikawa, Robert M.
IEEE ACCESS, 2020, 8 : 210194 - 210205
[7] CROSS-MODALITY MEDICAL IMAGE DETECTION AND SEGMENTATION BY TRANSFER LEARNING OF SHAPE PRIORS
Zheng, Yefeng
2015 IEEE 12TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2015 : 424 - 427
[8] Learning cross-modality features for image caption generation
Zeng, Chao
Kwong, Sam
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (07) : 2059 - 2070
[9] Learning cross-modality features for image caption generation
Chao Zeng
Sam Kwong
International Journal of Machine Learning and Cybernetics, 2022, 13 : 2059 - 2070
[10] Addressing imaging accessibility by cross-modality transfer learning
Zheng, Zhiyang
Su, Yi
Chen, Kewei
Weidman, David A.
Wu, Teresa
Lo, Ben
Lure, Fleming
Li, Jing
MEDICAL IMAGING 2022: COMPUTER-AIDED DIAGNOSIS, 2022, 12033
