Dual Encoder Decoder Shifted Window-Based Transformer Network for Polyp Segmentation with Self-Learning Approach

Cited by: 1
Authors
P. L. [1, 7]
Ullah M. [3]
Vats A. [3]
Cheikh F.A. [3, 5]
Kumar G. S. [1, 7]
Nair M.S. [1, 7]
Affiliations
[1] Computer Vision Lab, Department of Computer Science, Cochin University of Science and Technology, Kochi, Kerala
[2] Computer Vision Lab, Department of Computer Science, Cochin University of Science and Technology, Kochi, Kerala
Keywords
Barlow twins; Colonoscopy; Computational modeling; Computer architecture; Convolutional neural networks (CNN); Decoding; Dilated convolution; Image segmentation; Polyp segmentation; Transformers
DOI
10.1109/TAI.2024.3366146
Abstract
According to WHO reports, cancer is the leading cause of death worldwide, and colorectal cancer is the second most prevalent cause of cancer-related death in both men and women. One potential approach to reducing the severity of colon cancer is the automatic segmentation and detection of colorectal polyps in colonoscopy videos. This technology can help endoscopists identify colorectal disease quickly, leading to earlier intervention and better patient quality of life (QoL). In this paper, we propose P-SwinNet, a self-supervised, transformer-based dual encoder-decoder architecture for polyp segmentation in colonoscopy images. P-SwinNet adapts the dual encoder-decoder design to enhance the feature maps by sharing multiscale information from the encoder to the decoder. The model uses multiple dilated convolutions to enlarge the field of view and gather more context without increasing the computational cost or losing spatial information. We also leverage a large-scale unlabelled dataset to train the model with the Barlow Twins self-learning strategy. Additionally, to capture long-range dependencies in the data, we use a shifted window-based approach that computes global attention. We extensively evaluate our model against state-of-the-art algorithms. The quantitative results show that P-SwinNet achieves a mean Dice score of 0.87 and a mean Intersection over Union (IoU) of 0.82 across the five datasets used in our study. This performance is a substantial advance over existing similar works, highlighting the advantage and novelty of the proposed approach in medical image segmentation.
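The abstract reports results as a mean Dice score and a mean IoU. As a reading aid, the sketch below shows how these two overlap metrics are commonly computed for a single pair of binary masks. It is a minimal illustration, not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask format, and the convention of scoring two empty masks as 1.0 are all assumptions made here.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as polyp in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks are empty. Scoring this as perfect agreement is a
        # convention assumed for this sketch.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 3x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 0]]

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU). Averages taken over many images or datasets need not satisfy that identity, and how the reported means were aggregated is not stated in this record.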
Pages: 1-14
Page count: 13