Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer

Cited: 8
Authors
Song, Bofan [1 ]
Raj, Dharma K. C. [2 ]
Yang, Rubin Yuchan [2 ]
Li, Shaobai [1 ]
Zhang, Chicheng [2 ]
Liang, Rongguang [1 ]
Affiliations
[1] Univ Arizona, Wyant Coll Opt Sci, Tucson, AZ 85721 USA
[2] Univ Arizona, Comp Sci Dept, Tucson, AZ 85721 USA
Keywords
Vision Transformer; Swin Transformer; oral cancer; oral image analysis; artificial intelligence;
DOI
10.3390/cancers16050987
CLC Classification: R73 [Oncology]
Discipline Code: 100214
Abstract
Simple Summary: Transformer models, originally successful in natural language processing, have found application in computer vision, demonstrating promising results in cancer image analysis tasks. Although oral cancer is one of the most prevalent and rapidly spreading cancers globally, accurate automated analysis methods for it are still lacking; this need is particularly critical for high-risk populations residing in low- and middle-income countries. In this study, we evaluated the performance of the Vision Transformer (ViT) and the Swin Transformer in classifying mobile-based oral cancer images we collected from high-risk populations. The results showed that the Swin Transformer model achieved higher accuracy than the ViT model, and both transformer models outperformed the conventional convolutional model VGG19.

Oral cancer, a pervasive and rapidly growing malignant disease, poses a significant global health concern. Early and accurate diagnosis is pivotal for improving patient outcomes. Automatic diagnosis methods based on artificial intelligence have shown promising results in the oral cancer field, but their accuracy still needs to be improved for realistic diagnostic scenarios. Vision Transformers (ViT) have recently outperformed convolutional neural network (CNN) models in many computer vision benchmark tasks. This study explores the effectiveness of the Vision Transformer and the Swin Transformer, two cutting-edge variants of the transformer architecture, for mobile-based oral cancer image classification. The pre-trained Swin Transformer model achieved 88.7% accuracy in the binary classification task, outperforming the ViT model by 2.3%, while the conventional convolutional network models VGG19 and ResNet50 achieved 85.2% and 84.5% accuracy, respectively. Our experiments demonstrate that these transformer-based architectures outperform traditional convolutional neural networks in oral cancer image classification and underscore the potential of the ViT and the Swin Transformer in advancing the state of the art in oral cancer image analysis.
Pages: 10
Related Papers
50 records
  • [31] PolySegNet: improving polyp segmentation through swin transformer and vision transformer fusion
    Lijin, P.
    Ullah, Mohib
    Vats, Anuja
    Cheikh, Faouzi Alaya
    Kumar, G. Santhosh
    Nair, Madhu S.
    BIOMEDICAL ENGINEERING LETTERS, 2024, 14 (06) : 1421 - 1431
  • [32] Stellar Classification with Vision Transformer and SDSS Photometric Images
    Yang, Yi
    Li, Xin
    UNIVERSE, 2024, 10 (05)
  • [33] A novel deep learning framework based swin transformer for dermal cancer cell classification
    Ramkumar, K.
    Medeiros, Elias Paulino
    Dong, Ani
    de Albuquerque, Victor Hugo C.
    Hassan, Md Rafiul
    Hassan, Mohammad Mehedi
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [34] EMViT-BCC: Enhanced Mobile Vision Transformer for Breast Cancer Classification
    Potsangbam, Jacinta
    Devi, Salam Shuleenda
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2025, 35 (02)
  • [35] Satellite Images Analysis and Classification using Deep Learning-based Vision Transformer Model
    Adegun, Adekanmi Adeyinka
    Viriri, Serestina
    2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023, : 1275 - 1279
  • [36] Classification of diabetic maculopathy based on optical coherence tomography images using a Vision Transformer model
    Cai, Liwei
    Wen, Chi
    Jiang, Jingwen
    Liang, Congbi
    Zheng, Hongmei
    Su, Yu
    Chen, Changzheng
    BMJ OPEN OPHTHALMOLOGY, 2023, 8 (01)
  • [37] Automated classification of remote sensing satellite images using deep learning based vision transformer
    Adegun, Adekanmi
    Viriri, Serestina
    Tapamo, Jules-Raymond
    APPLIED INTELLIGENCE, 2024, 54 (24) : 13018 - 13037
  • [38] Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on an Improved Swin Transformer
    Sun, Ruina
    Pang, Yuexin
    Li, Wenfa
    ELECTRONICS, 2023, 12 (04)
  • [39] Image recoloring for color vision deficiency compensation using Swin transformer
    Chen, Ligeng
    Zhu, Zhenyang
    Huang, Wangkang
    Go, Kentaro
    Chen, Xiaodiao
    Mao, Xiaoyang
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (11) : 6051 - 6066