An enhanced vision transformer with wavelet position embedding for histopathological image classification

Cited by: 14
Authors
Ding, Meidan [1]
Qu, Aiping [1]
Zhong, Haiqin [1]
Lai, Zhihui [2, 3]
Xiao, Shuomin [1]
He, Penghui [1]
Affiliations
[1] Univ South China, Sch Comp, Hengyang 421001, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[3] Robot Soc, Shenzhen Inst Artificial Intelligence, Shenzhen 518129, Peoples R China
Keywords
Histopathological image classification; Vision transformer; Convolutional neural network; Wavelet position embedding; External multi-head attention;
DOI
10.1016/j.patcog.2023.109532
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Histopathological image classification is a fundamental task in pathological diagnosis workflow. It remains a huge challenge due to the complexity of histopathological images. Recently, hybrid methods combining convolutional neural networks (CNN) with vision transformers (ViT) are proposed to this field. These methods can well represent the global and local contextual information and achieve excellent classification performances. However, the downsampling operation like max-pooling which ignores the sampling theorem transmits the jagged artifacts into transformer, which would lead to an aliasing phenomenon. It makes the subsequent feature maps focus on the incorrect regions and influences the final classification results. In this work, we propose an enhanced vision transformer with wavelet position embedding to tackle this challenge. In particular, a wavelet position embedding module, which introduces the wave transform into position embedding, is employed to enhance the smoothness of discontinuous feature information by decomposing sequences into amplitude and phase in pathological feature maps. In addition, an external multi-head attention is proposed to replace self-attention in the transformer block with two linear layers. It reduces the cost of computation and excavates potential correlations between different samples. We evaluate the proposed method on three public histopathological classification challenging datasets, and perform a quantitative comparison with previous state-of-the-art methods. The results empirically demonstrate that our method achieves the best accuracy. Furthermore, it has the least parameters and a very low FLOPs. In conclusion, the enhanced vision transformer shows high classification performances and demonstrates significant potential for assisting pathologists in pathological diagnosis. (c) 2023 Elsevier Ltd. All rights reserved.
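The abstract states only that external multi-head attention replaces self-attention with two linear layers. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the function name, the memory matrices `m_k` and `m_v`, the sharing of those memories across heads, and the double normalization (softmax over tokens, then L1 over memory slots) are assumptions borrowed from the broader external-attention literature.

```python
import numpy as np

def external_multi_head_attention(x, m_k, m_v, num_heads):
    """Hypothetical sketch of external multi-head attention.

    x   : (n_tokens, d_model) token features
    m_k : (d_head, n_slots) first linear layer (external "key" memory)
    m_v : (n_slots, d_head) second linear layer (external "value" memory)
    """
    n, d = x.shape
    d_head = d // num_heads
    # Split channels into heads: (num_heads, n_tokens, d_head)
    h = x.reshape(n, num_heads, d_head).transpose(1, 0, 2)
    # Similarity between tokens and memory slots: (num_heads, n_tokens, n_slots)
    attn = h @ m_k
    # Assumed double normalization: softmax over tokens, then L1 over slots
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn = attn / attn.sum(axis=1, keepdims=True)
    attn = attn / (attn.sum(axis=2, keepdims=True) + 1e-9)
    # Read from the value memory and merge heads back: (n_tokens, d_model)
    out = attn @ m_v
    return out.transpose(1, 0, 2).reshape(n, d)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 32))   # 16 tokens, 32 channels
mem_k = rng.normal(size=(8, 64))     # d_head = 32 / 4 = 8, 64 memory slots
mem_v = rng.normal(size=(64, 8))
result = external_multi_head_attention(tokens, mem_k, mem_v, num_heads=4)
```

Because the memories have a fixed number of slots, the cost grows linearly with the number of tokens rather than quadratically, which is consistent with the reduced computation the abstract reports. The memories are also independent of any single input, which is one way to read the claim about correlations between different samples.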
Pages: 11
Related papers
50 in total
  • [1] Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification
    Shiri, Mohammad
    Reddy, Monalika Padma
    Sun, Jiangwen
    2024 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI 2024, 2024, : 296 - 301
  • [2] Rotary Position Embedding for Vision Transformer
    Heo, Byeongho
    Park, Song
    Han, Dongyoon
    Yun, Sangdoo
    COMPUTER VISION - ECCV 2024, PT X, 2025, 15068 : 289 - 305
  • [3] Vision Transformer Based Tokenization for Enhanced Breast Cancer Histopathological Images Classification
    Abimouloud, Mouhamed Laid
    Bensid, Khaled
    Elleuch, Mohamed
    Aiadi, Oussama
    Kherallah, Monji
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, PT I, AIAI 2024, 2024, 711 : 255 - 267
  • [4] RoFormer: Enhanced transformer with Rotary Position Embedding
    Su, Jianlin
    Ahmed, Murtadha
    Lu, Yu
    Pan, Shengfeng
    Bo, Wen
    Liu, Yunfeng
    NEUROCOMPUTING, 2024, 568
  • [5] Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels
    Gul, Ahmet Gokberk
    Cetin, Oezdemir
    Reich, Christoph
    Flinner, Nadine
    Prangemeier, Tim
    Koeppl, Heinz
    MEDICAL IMAGING 2022: DIGITAL AND COMPUTATIONAL PATHOLOGY, 2022, 12039
  • [6] The Application of Vision Transformer in Image Classification
    He, Zhixuan
    2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022, : 56 - 63
  • [8] IEViT: An enhanced vision transformer architecture for chest X-ray image classification
    Okolo, Gabriel Iluebe
    Katsigiannis, Stamos
    Ramzan, Naeem
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, 226
  • [9] Cross-Scale Fusion Transformer for Histopathological Image Classification
    Huang, Sheng-Kai
    Yu, Yu-Ting
    Huang, Chun-Rong
    Cheng, Hsiu-Chi
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (01) : 297 - 308
  • [10] Vision Transformer With Contrastive Learning for Hyperspectral Image Classification
    Zhou, Heng
    Zhang, Xin
    Zhang, Chunlei
    Ma, Qiaoyu
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20