Self-Ensembling Vision Transformer (SEViT) for Robust Medical Image Classification

Cited by: 19

Authors:

Almalik, Faris [1]

Yaqub, Mohammad [1]

Nandakumar, Karthik [1]

Affiliations:

[1] Mohamed Bin Zayed Univ, Artificial Intelligence, Abu Dhabi, U Arab Emirates

Source:

MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT III | 2022 / Vol. 13433

Keywords:

Adversarial attack; Vision transformer; Self-ensemble

DOI:

10.1007/978-3-031-16437-8_36

Chinese Library Classification (CLC) number:

R445 [Diagnostic imaging]

Discipline classification code:

100207

Abstract:

Vision Transformers (ViTs) are competing to replace Convolutional Neural Networks (CNNs) for various computer vision tasks in medical imaging such as classification and segmentation. While the vulnerability of CNNs to adversarial attacks is a well-known problem, recent works have shown that ViTs are also susceptible to such attacks and suffer significant performance degradation under attack. The vulnerability of ViTs to carefully engineered adversarial samples raises serious concerns about their safety in clinical settings. In this paper, we propose a novel self-ensembling method to enhance the robustness of ViTs in the presence of adversarial attacks. The proposed Self-Ensembling Vision Transformer (SEViT) leverages the fact that feature representations learned by the initial blocks of a ViT are relatively unaffected by adversarial perturbations. Learning multiple classifiers based on these intermediate feature representations and combining their predictions with that of the final ViT classifier can provide robustness against adversarial attacks. Measuring the consistency between the various predictions can also help detect adversarial samples. Experiments on two modalities (chest X-ray and fundoscopy) demonstrate the efficacy of the SEViT architecture in defending against various adversarial attacks in the gray-box setting (the attacker has full knowledge of the target model, but not of the defense mechanism). Code: https://github.com/faresmalik/SEViT
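
The two ideas in the abstract (combining intermediate-block classifiers with the final classifier, and using disagreement between their predictions to flag adversarial inputs) can be illustrated with a minimal sketch. The abstract does not state how predictions are combined or how consistency is measured, so everything here is an assumption made for illustration: majority voting as the combination rule, mean KL divergence as the consistency measure, the threshold value of 0.5, and all function names are hypothetical. The probability vectors are hand-written stand-ins for classifier outputs, not results from the paper.

```python
import math
from collections import Counter


def argmax(probs):
    """Index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ensemble_predict(final_probs, intermediate_probs):
    """Majority vote over the final and intermediate classifiers (assumed rule).

    Ties are broken in favour of the final classifier's label when it is
    among the tied labels, otherwise the smallest tied label is returned.
    """
    votes = [argmax(final_probs)] + [argmax(p) for p in intermediate_probs]
    counts = Counter(votes)
    best = max(counts.values())
    tied = [label for label, count in counts.items() if count == best]
    return votes[0] if votes[0] in tied else min(tied)


def consistency_score(final_probs, intermediate_probs):
    """Mean KL divergence between each intermediate prediction and the final one."""
    total = sum(kl_divergence(p, final_probs) for p in intermediate_probs)
    return total / len(intermediate_probs)


def is_suspicious(final_probs, intermediate_probs, threshold=0.5):
    """Flag the input when predictions disagree beyond a (hypothetical) threshold."""
    return consistency_score(final_probs, intermediate_probs) > threshold


# Stand-in outputs of three intermediate-block classifiers (illustrative values).
intermediate = [[0.85, 0.15], [0.80, 0.20], [0.90, 0.10]]

clean_final = [0.90, 0.10]     # final classifier agrees with the early blocks
attacked_final = [0.05, 0.95]  # final classifier flipped by a perturbation

clean_label = ensemble_predict(clean_final, intermediate)
attacked_label = ensemble_predict(attacked_final, intermediate)
clean_flag = is_suspicious(clean_final, intermediate)
attacked_flag = is_suspicious(attacked_final, intermediate)
```

In this toy case the vote recovers the label held by the early-block classifiers even when the final classifier is flipped, and the divergence between the two sets of predictions is what raises the flag. In practice the threshold would have to be tuned on held-out data.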


Pages: 376-386

Page count: 11

50 records in total

[21] Improving vision transformer for medical image classification via token-wise perturbation
Li, Yuexiang
Huang, Yawen
He, Nanjun
Ma, Kai
Zheng, Yefeng
Journal of Visual Communication and Image Representation, 2024, 98
[22] Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels
Gul, Ahmet Gokberk
Cetin, Oezdemir
Reich, Christoph
Flinner, Nadine
Prangemeier, Tim
Koeppl, Heinz
MEDICAL IMAGING 2022: DIGITAL AND COMPUTATIONAL PATHOLOGY, 2022, 12039
[23] MAT-VIT: A Vision Transformer with MAE-Based Self-Supervised Auxiliary Task for Medical Image Classification
Han, Yufei
Chen, Haoyuan
Yao, Linwei
Li, Kuan
Yin, Jianping
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024 : 2040 - 2046
[24] Vision Transformer With Contrastive Learning for Hyperspectral Image Classification
Zhou, Heng
Zhang, Xin
Zhang, Chunlei
Ma, Qiaoyu
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
[25] CSiT: A Multiscale Vision Transformer for Hyperspectral Image Classification
He, Wenxuan
Huang, Weiliang
Liao, Shuhong
Xu, Zhen
Yan, Jingwen
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 9266 - 9277
[26] Hyperspectral image classification with embedded linear vision transformer
Tan, Yunfei
Li, Ming
Yuan, Longfa
Shi, Chaoshan
Luo, Yonghang
Wen, Guihao
EARTH SCIENCE INFORMATICS, 2025, 18 (01)
[27] Image Classification Using Vision Transformer for EtC Images
Hamano, Genki
Imaizumi, Shoko
Kiya, Hitoshi
PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022 : 1506 - 1513
[28] Compressed-Domain Vision Transformer for Image Classification
Ji, Ruolei
Karam, Lina J.
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2024, 14 (02) : 299 - 310
[29] Vision Transformer (ViT)-based Applications in Image Classification
Huo, Yingzi
Jin, Kai
Cai, Jiahong
Xiong, Huixuan
Pang, Jiacheng
2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS, 2023 : 135 - 140
