MSHT: Multi-Stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer

Cited by: 8
Authors
Zhang, Tianyi [1]
Feng, Yunlu [2]
Zhao, Yu [3]
Fan, Guangda [1]
Yang, Aiming [2]
Lyu, Shangqing [4]
Zhang, Peng [1]
Song, Fan [1]
Ma, Chenbin [1]
Sun, Yangyang [1]
Feng, Youdan [1]
Zhang, Guanglei [1]
Affiliations
[1] Beihang Univ, Beijing Adv Innovat Ctr Biomed Engn, Sch Biol Sci & Med Engn, Beijing 100191, Peoples R China
[2] Peking Union Med Coll Hosp, Dept Gastroenterol, Beijing 100006, Peoples R China
[3] Peking Union Med Coll Hosp, Dept Pathol, Beijing 100006, Peoples R China
[4] Univ Southampton, Sch Elect & Comp Sci, Southampton SO17 1BJ, Hampshire, England
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Transformers; Feature extraction; Convolutional neural networks; Pancreatic cancer; Cancer; Image analysis; Solid modeling; Cytopathology; deep learning; pancreatic cancer; rapid on-site evaluation (ROSE); Transformer; FINE-NEEDLE-ASPIRATION; EUS-FNA; DIAGNOSTIC-ACCURACY; CYTOLOGY; IMPROVE;
DOI
10.1109/JBHI.2023.3234289
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Pancreatic cancer is one of the most malignant cancers with high mortality. The rapid on-site evaluation (ROSE) technique can significantly accelerate the diagnostic workflow of pancreatic cancer by immediately analyzing the fast-stained cytopathological images with on-site pathologists. However, the broader expansion of ROSE diagnosis has been hindered by the shortage of experienced pathologists. Deep learning has great potential for the automatic classification of ROSE images in diagnosis. But it is challenging to model the complicated local and global image features. The traditional convolutional neural network (CNN) structure can effectively extract spatial features, while it tends to ignore global features when the prominent local features are misleading. In contrast, the Transformer structure has excellent advantages in capturing global features and long-range relations, while it has limited ability in utilizing local features. We propose a multi-stage hybrid Transformer (MSHT) to combine the strengths of both, where a CNN backbone robustly extracts multi-stage local features at different scales as the attention guidance, and a Transformer encodes them for sophisticated global modeling. Going beyond the strength of each single method, the MSHT can simultaneously enhance the Transformer global modeling ability with the local guidance from CNN features. To evaluate the method in this unexplored field, a dataset of 4240 ROSE images is collected where MSHT achieves 95.68% in classification accuracy with more accurate attention regions. The distinctively superior results compared to the state-of-the-art models make MSHT extremely promising for cytopathological image analysis.
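The hybrid idea in the abstract (local features taken at several scales, then used to guide a global attention step) can be illustrated with a toy sketch. This is not the authors' MSHT implementation: average pooling stands in for the CNN backbone, tokens are single scalars rather than feature vectors, and every function name below (`avg_pool`, `multi_stage_tokens`, `guided_attention`, `hybrid_forward`) is hypothetical.

```python
import math


def avg_pool(grid, k):
    """Average-pool a square 2D grid with a k x k window and stride k."""
    n = len(grid) // k
    return [
        [
            sum(grid[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def multi_stage_tokens(image, scales=(2, 4)):
    """Stand-in for a CNN backbone (assumption): one flattened token list per scale."""
    return [[v for row in avg_pool(image, k) for v in row] for k in scales]


def guided_attention(query_tokens, guide_tokens):
    """Scalar cross-attention: each query token becomes a softmax-weighted
    average of the guidance tokens."""
    out = []
    for q in query_tokens:
        weights = softmax([q * g for g in guide_tokens])
        out.append(sum(w * g for w, g in zip(weights, guide_tokens)))
    return out


def hybrid_forward(image, scales=(2, 4)):
    """Start from the coarsest tokens and refine them with each stage's
    local features, finest stage first."""
    stages = multi_stage_tokens(image, scales)
    tokens = stages[-1]
    for guide in stages:
        tokens = guided_attention(tokens, guide)
    return tokens


# Synthetic 8 x 8 "image" with intensities in [0, 1]; not ROSE data.
image = [[(i * 8 + j) / 63 for j in range(8)] for i in range(8)]
tokens = hybrid_forward(image)
```

Because each attention output is a convex combination of the guidance tokens, the refined tokens always stay inside the range of the local features that guided them. A real model would use learned projections and vector-valued tokens.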
Pages: 1946-1957
Number of pages: 12