Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification

Cited by: 19
Authors
Almalik, Faris [1]
Yaqub, Mohammad [1]
Nandakumar, Karthik [1]
Affiliations
[1] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Keywords
Adversarial attack; Vision transformer; Self-ensemble
DOI
10.1007/978-3-031-16437-8_36
Chinese Library Classification
R445 [Diagnostic Imaging]
Discipline classification code
100207
Abstract
Vision Transformers (ViT) are competing to replace Convolutional Neural Networks (CNN) for various computer vision tasks in medical imaging such as classification and segmentation. While the vulnerability of CNNs to adversarial attacks is a well-known problem, recent works have shown that ViTs are also susceptible to such attacks and suffer significant performance degradation under attack. The vulnerability of ViTs to carefully engineered adversarial samples raises serious concerns about their safety in clinical settings. In this paper, we propose a novel self-ensembling method to enhance the robustness of ViTs in the presence of adversarial attacks. The proposed Self-Ensembling Vision Transformer (SEViT) leverages the fact that feature representations learned by the initial blocks of a ViT are relatively unaffected by adversarial perturbations. Learning multiple classifiers based on these intermediate feature representations and combining their predictions with that of the final ViT classifier can provide robustness against adversarial attacks. Measuring the consistency between the various predictions can also help detect adversarial samples. Experiments on two modalities (chest X-ray and fundoscopy) demonstrate the efficacy of the SEViT architecture in defending against various adversarial attacks in the gray-box setting, in which the attacker has full knowledge of the target model but not of the defense mechanism. Code: https://github.com/faresmalik/SEViT
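The combination and consistency steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how predictions are combined or how consistency is measured, so the choices here are assumptions: majority voting for the ensemble, and the mean KL divergence between the final classifier's output and each intermediate classifier's output as the consistency score. All function names and the threshold value are hypothetical and are not taken from the SEViT repository.

```python
import math
from collections import Counter


def majority_vote(prob_vectors):
    """Return the class chosen by most classifiers (assumed combination rule).

    Ties are broken in favour of the lowest class index.
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in prob_vectors]
    counts = Counter(votes)
    best = max(counts.values())
    return min(c for c, n in counts.items() if n == best)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of floats."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_score(final_probs, intermediate_probs):
    """Mean KL divergence between the final output and each intermediate output.

    A large value means the final classifier disagrees with the classifiers
    built on early-block features (assumed inconsistency measure).
    """
    total = sum(kl_divergence(final_probs, q) for q in intermediate_probs)
    return total / len(intermediate_probs)


def ensemble_predict(final_probs, intermediate_probs, threshold=0.5):
    """Return (ensemble label, consistency score, flagged-as-adversarial).

    `threshold` is a hypothetical value; in practice it would be tuned
    on held-out clean data.
    """
    label = majority_vote([final_probs] + list(intermediate_probs))
    score = consistency_score(final_probs, intermediate_probs)
    return label, score, score > threshold


# Hypothetical softmax outputs of three classifiers on early-block features.
intermediate = [[0.85, 0.15], [0.80, 0.20], [0.90, 0.10]]

# Clean input: the final classifier agrees with the intermediate ones.
clean = ensemble_predict([0.90, 0.10], intermediate)

# Attacked input: only the final classifier's output has been flipped.
attacked = ensemble_predict([0.05, 0.95], intermediate)
```

In this toy case the ensemble still outputs class 0 for the attacked input because the intermediate classifiers outvote the final one, and the larger consistency score marks the sample as suspicious.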
Pages: 376-386
Page count: 11