Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification

Cited by: 19
Authors
Almalik, Faris [1]
Yaqub, Mohammad [1]
Nandakumar, Karthik [1]
Affiliations
[1] Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Keywords
Adversarial attack; Vision transformer; Self-ensemble
DOI
10.1007/978-3-031-16437-8_36
Chinese Library Classification number
R445 [Diagnostic Imaging]
Subject classification code
100207
Abstract
Vision Transformers (ViTs) are competing to replace Convolutional Neural Networks (CNNs) for various computer vision tasks in medical imaging such as classification and segmentation. While the vulnerability of CNNs to adversarial attacks is a well-known problem, recent works have shown that ViTs are also susceptible to such attacks and suffer significant performance degradation under attack. The vulnerability of ViTs to carefully engineered adversarial samples raises serious concerns about their safety in clinical settings. In this paper, we propose a novel self-ensembling method to enhance the robustness of ViTs in the presence of adversarial attacks. The proposed Self-Ensembling Vision Transformer (SEViT) leverages the fact that feature representations learned by the initial blocks of a ViT are relatively unaffected by adversarial perturbations. Learning multiple classifiers based on these intermediate feature representations and combining their predictions with that of the final ViT classifier can provide robustness against adversarial attacks. Measuring the consistency between the various predictions can also help detect adversarial samples. Experiments on two modalities (chest X-ray and fundoscopy) demonstrate the efficacy of the SEViT architecture in defending against various adversarial attacks in the gray-box setting (the attacker has full knowledge of the target model, but not of the defense mechanism). Code: https://github.com/faresmalik/SEViT
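The self-ensembling idea in the abstract can be illustrated with a minimal sketch that works on class-probability vectors only. It is not the authors' implementation (see the linked repository for that). The combination rule (majority vote), the consistency measure (mean KL divergence between the final prediction and each intermediate prediction), the threshold value, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def argmax(probs):
    """Index of the largest probability in a vector."""
    return max(range(len(probs)), key=probs.__getitem__)


def majority_vote(prob_vectors):
    """Class chosen by most classifiers; ties go to the lowest class index.
    (Majority voting is an assumed combination rule.)"""
    counts = Counter(argmax(p) for p in prob_vectors)
    top = max(counts.values())
    return min(c for c, n in counts.items() if n == top)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, with eps for stability."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def inconsistency(final_probs, intermediate_probs):
    """Mean KL divergence between the final classifier's prediction and each
    intermediate classifier's prediction (an assumed consistency measure)."""
    total = sum(kl_divergence(final_probs, q) for q in intermediate_probs)
    return total / len(intermediate_probs)


def self_ensemble(final_probs, intermediate_probs, threshold=0.5):
    """Return (ensemble label, inconsistency score, flagged-as-adversarial).
    The threshold is a hypothetical value; in practice it would be tuned."""
    label = majority_vote([final_probs] + list(intermediate_probs))
    score = inconsistency(final_probs, intermediate_probs)
    return label, score, score > threshold


# Hypothetical attacked sample: the final classifier is pushed to class 1,
# while classifiers on early-block features still favour class 0.
attacked = self_ensemble([0.1, 0.9], [[0.8, 0.2], [0.7, 0.3], [0.9, 0.1]])

# Hypothetical clean sample: all classifiers agree on class 0.
clean = self_ensemble([0.9, 0.1], [[0.8, 0.2], [0.85, 0.15]])
```

In this toy input the ensemble label for the attacked sample stays at class 0 even though the final classifier alone predicts class 1, and the large disagreement pushes the inconsistency score above the threshold, whereas the clean sample is not flagged.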
Pages: 376 - 386
Number of pages: 11
Related papers
50 in total
  • [31] Image Quality Distortion Classification Using Vision Transformer
    Lynn, Nay Chi
    Shimamura, Tetsuya
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 1, AINA 2024, 2024, 199 : 353 - 361
  • [32] HYBRID VISION TRANSFORMER MODEL FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Yang, Jiaqi
    Du, Bo
    Wu, Chen
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 1388 - 1391
  • [33] Boosted Nutcracker optimizer and Chaos Game Optimization with Cross Vision Transformer for medical image classification
    Mohamed, Ahmed F.
    Saba, Amal
    Hassan, Mohamed K.
    Youssef, Hamdy M.
    Dahou, Abdelghani
    Elsheikh, Ammar H.
    El-Bary, Alaa A.
    Abd Elaziz, Mohamed
    Ibrahim, Rehab Ali
    EGYPTIAN INFORMATICS JOURNAL, 2024, 14 (01)
  • [34] A Comparative Study of Vision Transformer Encoders and Few-shot Learning for Medical Image Classification
    Nurgazin, Maxat
    Nguyen Anh Tu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 2505 - 2513
  • [35] Deep Learning and Vision Transformer for Medical Image Analysis
    Zhang, Yudong
    Wang, Jiaji
    Gorriz, Juan Manuel
    Wang, Shuihua
    JOURNAL OF IMAGING, 2023, 9 (07)
  • [36] MetaV: A Pioneer in feature Augmented Meta-Learning Based Vision Transformer for Medical Image Classification
    Ansari, Shaharyar Alam
    Agrawal, Arun Prakash
    Wajid, Mohd Anas
    Wajid, Mohammad Saif
    Zafar, Aasim
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2024, 16 (02) : 469 - 488
  • [37] Privacy-Preserving Image Classification Using Vision Transformer
    Qi, Zheng
    MaungMaung, AprilPyone
    Kinoshita, Yuma
    Kiya, Hitoshi
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 543 - 547
  • [38] Vision Transformer with window sequence merging mechanism for image classification
    Jiao, Erjie
    Leng, Qiangkui
    Guo, Jiamei
    Meng, Xiangfu
    Wang, Changzhong
    APPLIED SOFT COMPUTING, 2025, 171
  • [39] Survey of Vision Transformer in Fine-Grained Image Classification
    Sun, Lulu
    Liu, Jianping
    Wang, Jian
    Xing, Jialu
    Zhang, Yue
    Wang, Chenyang
    Computer Engineering and Applications, 60 (10) : 30 - 46
  • [40] Hierarchical Pretrained Backbone Vision Transformer for Image Classification in Histopathology
    Zedda, Luca
    Loddo, Andrea
    Di Ruberto, Cecilia
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT II, 2023, 14234 : 223 - 234