HST-MRF: Heterogeneous Swin Transformer With Multi-Receptive Field for Medical Image Segmentation

Cited by: 0

Authors:

Huang, Xiaofei [1]

Gong, Hongfang [1]

Zhang, Jin [2]

Affiliations:

[1] Changsha Univ Sci & Technol, Sch Math & Stat, Changsha 410114, Peoples R China

[2] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Peoples R China

Source:

IEEE Journal of Biomedical and Health Informatics | 2024 / Vol. 28 / No. 7

Funding:

National Natural Science Foundation of China

Keywords:

Image segmentation; Transformers; Biomedical imaging; Task analysis; Computational modeling; Feature extraction; Visualization; Heterogeneous attention; medical imaging segmentation; multi-receptive field; patch segmentation; NETWORK; CONNECTIONS;

DOI:

10.1109/JBHI.2024.3397047

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

The Transformer has been successfully applied to medical image segmentation because of its strong long-range modeling capability. However, building a Transformer-based model requires splitting the image into patches, a process that ignores the tissue structure within each patch and loses shallow representation information. In this study, we propose a Heterogeneous Swin Transformer with Multi-Receptive Field (HST-MRF) model that fuses patch information from different receptive fields to address the feature loss caused by patch segmentation. The heterogeneous Swin Transformer (HST) is the core module: it lets patch information from multiple receptive fields interact through heterogeneous attention and passes the result to the next stage for progressive learning, thereby complementing the patch structure information. We also design a two-stage fusion module, multimodal bilinear pooling (MBP), which assists HST in further fusing multi-receptive field information and combines low-level and high-level semantic information to localize lesion regions accurately. In addition, we develop adaptive patch embedding (APE) and soft channel attention (SCA) modules, which retain more valuable information when acquiring patch embeddings and filtering channel features, respectively, thereby improving segmentation quality. We evaluate HST-MRF on multiple datasets covering polyp, skin lesion, and breast ultrasound segmentation tasks. Experimental results show that the proposed method outperforms state-of-the-art models. Furthermore, ablation experiments and qualitative analysis verify the effectiveness of each module and the benefit of multi-receptive field segmentation in reducing the loss of structural information.
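
The idea of letting patches from different receptive fields exchange information can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `patchify` and `cross_attention`, the patch sizes 4 and 8, the embedding width 16, and the random projections (standing in for learned patch embeddings) are all assumptions made for illustration, and plain scaled dot-product cross-attention is used here only as a generic stand-in for the heterogeneous attention described in the abstract.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into non-overlapping patch x patch tokens."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (H/p, W/p, p, p, C)
    return grid.reshape(-1, patch * patch * c)  # one flattened row per patch

def cross_attention(queries, keys_values):
    """Scaled dot-product attention between two different token sets."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ keys_values, weights

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # hypothetical toy input

# Two receptive fields: small patches keep fine structure, large patches see more context.
small = patchify(image, 4)  # 64 tokens of length 4*4*3
large = patchify(image, 8)  # 16 tokens of length 8*8*3

# Random linear projections stand in for learned patch embeddings (assumption).
dim = 16
proj_small = rng.standard_normal((small.shape[1], dim)) / np.sqrt(small.shape[1])
proj_large = rng.standard_normal((large.shape[1], dim)) / np.sqrt(large.shape[1])
tok_small = small @ proj_small
tok_large = large @ proj_large

# Each small-patch token gathers context from the large-patch tokens.
fused, weights = cross_attention(tok_small, tok_large)
print(small.shape, large.shape, fused.shape)
```

In this sketch the small-patch tokens act as queries and the large-patch tokens as keys and values, so every fine-grained token receives a weighted summary of the coarser view; a real model would learn the projections and stack such interactions across stages.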

Pages: 4048-4061

Page count: 14
