seUNet-Trans: A Simple Yet Effective UNet-Transformer Model for Medical Image Segmentation

Cited by: 4
Authors
Pham, Tan-Hanh [1]
Li, Xianqi [2]
Nguyen, Kim-Doang [1]
Affiliations
[1] Florida Inst Technol, Dept Mech & Aerosp Engn, Melbourne, FL 32901 USA
[2] Florida Inst Technol, Dept Math & Syst Engn, Melbourne, FL 32901 USA
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
U.S. Department of Agriculture;
Keywords
Transformers; Image segmentation; Medical diagnostic imaging; Decoding; Computer architecture; Colonoscopy; Biomedical imaging; Deep learning; Polyps; colonoscopy; medical image analysis; deep learning; vision transformers; ATTENTION;
DOI
10.1109/ACCESS.2024.3451304
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Medical image segmentation plays a crucial role in modern clinical practice, enabling accurate diagnosis and personalized treatment plans. Advancements in machine learning, particularly deep learning techniques, have significantly driven this progress. While Convolutional Neural Networks (CNNs) dominate the field, transformer-based models are emerging as powerful alternatives for computer vision tasks. However, most existing CNN-Transformer models underutilize the full potential of Transformers, often relegating them to assistant modules. To address this issue, we propose a novel and efficient UNet-Transformer (seUNet-Trans) model for medical image segmentation. The seUNet-Trans framework leverages a UNet architecture for feature extraction, generating rich representations from input images. These features are then passed through a bridge layer that connects the UNet to a transformer module. To improve efficiency, we employ a novel pixel-wise embedding method that eliminates the need for position embedding vectors. We utilize spatially reduced attention within the transformer to reduce computational complexity. By combining the strengths of UNet's localization capabilities and the transformer's ability to capture long-range dependencies, seUNet-Trans effectively captures both local and global information within medical images. This holistic understanding enables the model to achieve superior segmentation performance. The efficacy of our model is demonstrated through extensive experimentation on seven medical image segmentation datasets. The seUNet-Trans model outperforms several state-of-the-art segmentation models, achieving impressive mean Dice Coefficient (mDC) and mean Intersection over Union (mIoU) scores. 
On the CVC-ClinicDB dataset, it achieves scores of 0.945 and 0.895, respectively; on the GlaS dataset, it scores 0.899 and 0.823, respectively; on the ISIC 2018 dataset, it achieves 0.922 and 0.854, respectively; and on the Data Science Bowl dataset, it scores 0.928 and 0.867, respectively. The code is available on seUnet-Trans.
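The abstract reports results as mean Dice Coefficient (mDC) and mean Intersection over Union (mIoU). As a minimal sketch of how these two overlap metrics are commonly computed for a single pair of binary masks, the snippet below uses plain Python lists. The function name `dice_and_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper or its code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists).

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```

A dataset-level mDC or mIoU would then typically be the average of these per-image values over the test set, though the exact averaging protocol used in the paper is not stated in the abstract.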
Pages: 122139-122154
Page count: 16
Related papers
50 in total
  • [31] CoT-UNet++: A medical image segmentation method based on contextual transformer and dense connection
    Yin, Yijun
    Xu, Wenzheng
    Chen, Lei
    Wu, Hao
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (05) : 8320 - 8336
  • [32] DTBNet: Medical image segmentation model based on dual transformer bridge
    Wang, Yuli (wyl@qlu.edu.cn), 1600, Institute of Electrical and Electronics Engineers Inc.
  • [33] MetaSwin: a unified meta vision transformer model for medical image segmentation
    Lee, Soyeon
    Lee, Minhyeok
    PEERJ COMPUTER SCIENCE, 2024, 10 : 1 - 17
  • [35] High Effective Medical Image Segmentation with Model Adjustable Method
    Yao, Yiwu
    Cheng, Yuhua
    2013 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2013, : 1512 - 1515
  • [36] Deep Model Reference: Simple Yet Effective Confidence Estimation for Image Classification
    Zheng, Yuanhang
    Qiu, Yiqiao
    Che, Haoxuan
    Chen, Hao
    Zheng, Wei-Shi
    Wang, Ruixuan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT X, 2024, 15010 : 175 - 185
  • [37] GSAC-UFormer: Groupwise Self-Attention Convolutional Transformer-Based UNet for Medical Image Segmentation
    Garbaz, Anass
    Oukdach, Yassine
    Charfi, Said
    El Ansari, Mohamed
    Koutti, Lahcen
    Salihoun, Mouna
    COGNITIVE COMPUTATION, 2025, 17 (02)
  • [38] TUnet-LBF: Retinal fundus image fine segmentation model based on transformer Unet network and LBF
    Zhang, Hanyu
    Ni, Weihan
    Luo, Yi
    Feng, Yining
    Song, Ruoxi
    Wang, Xianghai
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 159
  • [39] Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution
    Cai, Yimin
    Long, Yuqing
    Han, Zhenggong
    Liu, Mingkun
    Zheng, Yuchen
    Yang, Wei
    Chen, Liming
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2023, 23 (01)
  • [40] DMFC-UFormer: Depthwise multi-scale factorized convolution transformer-based UNet for medical image segmentation
    Garbaz, Anass
    Oukdach, Yassine
    Charfi, Said
    El Ansari, Mohamed
    Koutti, Lahcen
    Salihoun, Mouna
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 101