Performance Comparison of Vision Transformer-Based Models in Medical Image Classification

被引:1
|
作者
Kanca, Elif [1 ]
Ayas, Selen [2 ]
Kablan, Elif Baykal [1 ]
Ekinci, Murat [2 ]
机构
[1] Karadeniz Tech Univ, Yazilim Muhendisligi, Trabzon, Turkiye
[2] Karadeniz Tech Univ, Bilgisayar Muhendisligi, Trabzon, Turkiye
关键词
Vision transformer-based models; transformers; medical image classification;
D O I
10.1109/SIU59756.2023.10223892
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In recent years, convolutional neural networks have shown significant success and are frequently used in medical image analysis applications. However, the convolution process in convolutional neural networks limits learning of long-term pixel dependencies in the local receptive field. Inspired by the success of transformer architectures in encoding long-term dependencies and learning more efficient feature representation in natural language processing, publicly available color fundus retina, skin lesion, chest X-ray, and breast histology images are classified using Vision Transformer (ViT), Data-Efficient Transformer (DeiT), Swin Transformer, and Pyramid Vision Transformer v2 (PVTv2) models and their classification performances are compared in this study. The results show that the highest accuracy values are obtained with the DeiT model at 96.5% in the chest X-ray dataset, the PVTv2 model at 91.6% in the breast histology dataset, the PVTv2 model at 91.3% in the retina fundus dataset, and the Swin model at 91.0% in the skin lesion dataset.
引用
收藏
页数:4
相关论文
共 50 条
  • [41] The Application of Vision Transformer in Image Classification
    He, Zhixuan
    2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022, : 56 - 63
  • [42] Transformer-based Bug/Feature Classification
    Ozturk, Ceyhun E.
    Yilmaz, Eyup Halit
    Koksal, Omer
    2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2023,
  • [43] Generalizability of Convolutional Neural Network and Vision Transformer-Based OCT Segmentation Models
    Pely, Adam
    Wu, Zhichao
    Leng, Theodore
    Gao, Simon S.
    Chen, Hao
    Hejrati, Mohsen
    Zhang, Miao
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2023, 64 (08)
  • [44] Transformer-Based Point Cloud Classification
    Wu, Xianfeng
    Liu, Xinyi
    Wang, Junfei
    Wu, Xianzu
    Lai, Zhongyuan
    Zhou, Jing
    Liu, Xia
    ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2022, PT I, 2022, 1700 : 218 - 225
  • [45] Vision Transformer-Based Tailing Detection in Videos
    Lee, Jaewoo
    Lee, Sungjun
    Cho, Wonki
    Siddiqui, Zahid Ali
    Park, Unsang
    APPLIED SCIENCES-BASEL, 2021, 11 (24):
  • [46] Vision Transformer-Based Photovoltaic Prediction Model
    Kang, Zaohui
    Xue, Jizhong
    Lai, Chun Sing
    Wang, Yu
    Yuan, Haoliang
    Xu, Fangyuan
    ENERGIES, 2023, 16 (12)
  • [47] Vision Transformer-based pilot pose estimation
    Wu, Honglan
    Liu, Hao
    Sun, Youchao
    Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2024, 50 (10): : 3100 - 3110
  • [48] TRANSFORMER-BASED SAR IMAGE DESPECKLING
    Perera, Malsha V.
    Bandara, Wele Gedara Chaminda
    Valanarasu, Jeya Maria Jose
    Patel, Vishal M.
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 751 - 754
  • [49] Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification
    Almalik, Faris
    Yaqub, Mohammad
    Nandakumar, Karthik
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III, 2022, 13433 : 376 - 386
  • [50] CRAT: Advanced transformer-based deep learning algorithms in OCT image classification
    Yang, Mingming
    Du, Junhui
    Lv, Ruichan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 104