Multi-scale nested UNet with transformer for colorectal polyp segmentation

Cited by: 5
Authors
Wang, Zenan [1 ]
Liu, Zhen [1 ]
Yu, Jianfeng [1 ]
Gao, Yingxin [1 ]
Liu, Ming [2 ]
Affiliations
[1] Capital Med Univ, Beijing Chaoyang Hosp, Dept Gastroenterol, Clin Med Coll 3, Beijing, Peoples R China
[2] Hunan Key Lab Nonferrous Resources & Geol Hazard E, Changsha, Peoples R China
Source
Keywords
colorectal polyp; deep learning; polyp segmentation; transformer; miss rate; colonoscopy
DOI
10.1002/acm2.14351
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Background: Polyp detection and localization are essential tasks in colonoscopy. Convolutional neural networks based on U-shaped architectures have achieved remarkable segmentation performance on biomedical images, but their lack of long-range dependency modeling limits their receptive fields.
Purpose: Our goal was to develop and test a novel architecture for polyp segmentation that combines the learning of local information with long-range dependency modeling.
Methods: A novel architecture for polyp segmentation was developed that integrates a transformer into a multi-scale nested UNet structure. The proposed network takes advantage of both CNN and transformer to extract distinct feature information. The transformer layer is embedded between the encoder and decoder of a U-shaped network to learn explicit global context and long-range semantic information. To address the challenge of varying polyp sizes, an MSFF unit was proposed to fuse features at multiple resolutions.
Results: Four public datasets and one in-house dataset were used to train and test the model. An ablation study was also conducted to verify each component of the model. On the Kvasir-SEG and CVC-ClinicDB datasets, the proposed model achieved mean Dice scores of 0.942 and 0.950, respectively, which were higher than those of the other methods. To compare the generalization of the different methods, we performed two cross-dataset validations, in which the proposed model achieved the highest mean Dice score. The results demonstrate that the proposed network has powerful learning and generalization capability, significantly improving segmentation accuracy and outperforming state-of-the-art methods.
Conclusions: The proposed model produced more accurate polyp segmentation than current methods on four public datasets and one in-house dataset. Its ability to segment polyps of different sizes shows its potential for clinical application.
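The abstract reports mean Dice scores but does not say how they were computed. As a rough illustration only, the sketch below uses the standard definition of the Dice coefficient for binary masks, Dice = 2|A ∩ B| / (|A| + |B|), averaged over images. The function names `dice_score` and `mean_dice`, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for this example, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels marked as polyp in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    # |A| + |B|: total polyp pixels in each mask, summed.
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds, targets):
    """Average the per-image Dice coefficient over a set of mask pairs."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_score(predicted, reference))
```

Under this definition a score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no polyp pixels.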
Pages: 10
Related papers
50 in total
  • [41] PSTNet: Enhanced Polyp Segmentation With Multi-Scale Alignment and Frequency Domain Integration
    Xu, Wenhao
    Xu, Rongtao
    Wang, Changwei
    Li, Xiuli
    Xu, Shibiao
    Guo, Li
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (10) : 6042 - 6053
  • [42] Enhancing medical image segmentation with MA-UNet: a multi-scale attention framework
    Li, Hongzhi
    Ren, Zhanghao
    Zhu, Guoqing
    Liang, Yaoju
    Cui, Han
    Wang, Chaozeyu
    Wang, Jiaxi
    VISUAL COMPUTER, 2025
  • [43] Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
    Gu, Jiaqi
    Kwon, Hyoukjun
    Wang, Dilin
    Ye, Wei
    Li, Meng
    Chen, Yu-Hsin
    Lai, Liangzhen
    Chandra, Vikas
    Pan, David Z.
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12084 - 12093
  • [44] MSEDTNet: Multi-Scale Encoder and Decoder with Transformer for Bladder Tumor Segmentation
    Wang, Yixing
    Ye, Xiufen
    ELECTRONICS, 2022, 11 (20)
  • [45] MESTrans: Multi-scale embedding spatial transformer for medical image segmentation
    Liu, Yatong
    Zhu, Yu
    Xin, Ying
    Zhang, Yanan
    Yang, Dawei
    Xu, Tao
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2023, 233
  • [46] Multi-Scale Liver Tumor Segmentation Algorithm by Fusing Convolution and Transformer
    Chen, Lifang
    Luo, Shiyong
    Computer Engineering and Applications, 2024, 60 (04) : 270 - 279
  • [47] Hierarchical Transformer with Multi-Scale Parallel Aggregation for Breast Tumor Segmentation
    Xia, Ping
    Wang, Yudie
    Lei, Bangjun
    Peng, Cheng
    Zhang, Guangyi
    Tang, Tinglong
    LASER & OPTOELECTRONICS PROGRESS, 2025, 62 (02)
  • [48] MUSTER: A Multi-Scale Transformer-Based Decoder for Semantic Segmentation
    Xu, Jing
    Shi, Wentao
    Gao, Pan
    Li, Qizhu
    Wang, Zhengwei
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2025, 9 (01) : 202 - 212
  • [49] MDAN-UNet: Multi-Scale and Dual Attention Enhanced Nested U-Net Architecture for Segmentation of Optical Coherence Tomography Images
    Liu, Wen
    Sun, Yankui
    Ji, Qingge
    ALGORITHMS, 2020, 13 (03)
  • [50] MS-UNet: A multi-scale UNet with feature recalibration approach for automatic liver and tumor segmentation in CT images
    Kushnure, Devidas T.
    Talbar, Sanjay N.
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2021, 89