Diffusion model-based text-guided enhancement network for medical image segmentation

Cited by: 7
Authors
Dong, Zhiwei [1]
Yuan, Genji [1]
Hua, Zhen [1]
Li, Jinjiang [2]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
DOI
10.1016/j.eswa.2024.123549
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic values for image generation modeling. In this study, we propose a novel end-to-end framework, called TGEDiff, focusing on medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an additional auxiliary categorization task, so that textual information guides the diffusion model to generate excellent pixel-level representations. To overcome the limited perceptual fields of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We critically evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), and TGEDiff achieved significant improvements over the state-of-the-art approach on all three datasets, with F1 score and mIoU improving by 0.88% and 1.09%, 3.21% and 3.43%, and 1.29% and 2.34%, respectively. These data validate that TGEDiff has excellent performance in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise deconvolutional structural segmentation.
Pages: 18