Enhancing the Accuracy of an Image Classification Model Using Cross-Modality Transfer Learning

Cited by: 0
Authors
Liu, Jiaqi [1]
Chui, Kwok Tai [1]
Lee, Lap-Kei [1]
Affiliations
[1] Hong Kong Metropolitan Univ, Sch Sci & Technol, Dept Elect Engn & Comp Sci, Hong Kong, Peoples R China
Keywords
batch size; cross-modality; deep learning; image classification; learning rate; overfitting; text classification; transfer learning
DOI
10.3390/electronics12153316
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Applying deep learning (DL) algorithms to image classification tasks becomes more challenging when training data are insufficient. Transfer learning (TL) has been proposed to address this problem. In theory, TL requires only a small amount of knowledge to be transferred to the target task, but traditional transfer learning often requires the source and target domains to share the same or similar features. Cross-modality transfer learning (CMTL) relaxes this requirement by learning knowledge in a source domain completely different from the target domain, often one with a large amount of data, which helps the model learn more features. Most existing research on CMTL has focused on image-to-image transfer. In this paper, the CMTL problem is formulated from the text domain to the image domain. Our study started by training two separate pre-trained models, one in the text domain and one in the image domain, to obtain the network structure. The knowledge of the two pre-trained models was transferred via CMTL to obtain a new hybrid model (combining the BERT and BEiT models). Next, GridSearchCV and 5-fold cross-validation were used to identify the most suitable combination of hyperparameters (batch size and learning rate) and optimizers (SGDM and ADAM) for our model. To evaluate their impact, 48 two-tuple hyperparameter combinations and two well-known optimizers were used. The performance evaluation metrics were validation accuracy, F1-score, precision, and recall. The ablation study confirms that the hybrid model enhanced accuracy by 12.8% compared with the original BEiT model. In addition, the results show that these two hyperparameters can significantly impact model performance.
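The hyperparameter search described in the abstract (a grid over batch size, learning rate, and optimizer, scored with 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code: it is a standard-library stand-in for scikit-learn's GridSearchCV, the grid values are hypothetical placeholders (the abstract gives only the count of 48 pairs, not the values), and `train_and_score` is an assumed caller-supplied function that trains a model on one fold and returns its validation accuracy.

```python
from itertools import product
from statistics import mean

# Hypothetical grid: the abstract reports 48 (batch size, learning rate)
# pairs but does not list the values, so these are placeholders.
BATCH_SIZES = [8, 16, 32, 64, 128, 256]  # 6 assumed values
LEARNING_RATES = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2]  # 8 assumed values
OPTIMIZERS = ["SGDM", "ADAM"]  # the two optimizers named in the abstract


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def grid_search(train_and_score, n_samples, k=5):
    """Return (best_config, best_mean_score) over the full grid.

    train_and_score(config, train_idx, val_idx) -> float is an assumed
    callback: it trains on train_idx and returns accuracy on val_idx.
    """
    best_config, best_score = None, float("-inf")
    for bs, lr, opt in product(BATCH_SIZES, LEARNING_RATES, OPTIMIZERS):
        config = {"batch_size": bs, "learning_rate": lr, "optimizer": opt}
        # Average the validation score over the k folds for this config.
        scores = [
            train_and_score(config, tr, va)
            for tr, va in k_fold_indices(n_samples, k)
        ]
        fold_mean = mean(scores)
        if fold_mean > best_score:
            best_config, best_score = config, fold_mean
    return best_config, best_score
```

In practice the sample indices would usually be shuffled or stratified by class before splitting; the contiguous folds here only keep the sketch short.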
Pages: 26
Related papers
50 in total
  • [1] Coral Classification Using DenseNet and Cross-modality Transfer Learning
    Xu, Lian
    Bennamoun, Mohammed
    Boussaid, Farid
    An, Senjian
    Sohel, Ferdous
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [2] Cross-Modality Contrastive Learning for Hyperspectral Image Classification
    Hang, Renlong
    Qian, Xuwei
    Liu, Qingshan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [3] Representation Learning for Cross-Modality Classification
    van Tulder, Gijs
    de Bruijne, Marleen
    MEDICAL COMPUTER VISION AND BAYESIAN AND GRAPHICAL MODELS FOR BIOMEDICAL IMAGING, 2017, 10081 : 126 - 136
  • [4] LOCAL CROSS-MODALITY IMAGE ALIGNMENT USING UNSUPERVISED LEARNING
    BERNANDER, O
    KOCH, C
    LECTURE NOTES IN COMPUTER SCIENCE, 1990, 427 : 573 - 575
  • [5] Cross-Modality Transfer Learning for Image-Text Information Management
    Niu, Shuteng
    Jiang, Yushan
    Chen, Bowen
    Wang, Jian
    Liu, Yongxin
    Song, Houbing
    ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS, 2022, 13 (01)
  • [6] Cross-Organ, Cross-Modality Transfer Learning: Feasibility Study for Segmentation and Classification
    Lee, Juhun
    Nishikawa, Robert M.
    IEEE ACCESS, 2020, 8 : 210194 - 210205
  • [7] CROSS-MODALITY MEDICAL IMAGE DETECTION AND SEGMENTATION BY TRANSFER LEARNING OF SHAPE PRIORS
    Zheng, Yefeng
    2015 IEEE 12TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2015, : 424 - 427
  • [8] Learning cross-modality features for image caption generation
    Zeng, Chao
    Kwong, Sam
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2022, 13 (07) : 2059 - 2070
  • [9] Learning cross-modality features for image caption generation
    Chao Zeng
    Sam Kwong
    International Journal of Machine Learning and Cybernetics, 2022, 13 : 2059 - 2070
  • [10] Addressing imaging accessibility by cross-modality transfer learning
    Zheng, Zhiyang
    Su, Yi
    Chen, Kewei
    Weidman, David A.
    Wu, Teresa
    Lo, Ben
    Lure, Fleming
    Li, Jing
    MEDICAL IMAGING 2022: COMPUTER-AIDED DIAGNOSIS, 2022, 12033