Diffusion model-based text-guided enhancement network for medical image segmentation

Cited by: 7
Authors
Dong, Zhiwei [1]
Yuan, Genji [1]
Hua, Zhen [1]
Li, Jinjiang [2]
Affiliations
[1] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai, Peoples R China
[2] Shandong Technol & Business Univ, Sch Informat & Elect Engn, Yantai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Denoising diffusion model; Text attention mechanism; Guided feature enhancement; Medical image segmentation; CONVOLUTIONAL NEURAL-NETWORK; CELL-NUCLEI; MISDIAGNOSIS; CLASSIFICATION;
DOI
10.1016/j.eswa.2024.123549
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, denoising diffusion models have achieved remarkable success in generating pixel-level representations with semantic values for image generation modeling. In this study, we propose a novel end-to-end framework, called TGEDiff, focusing on medical image segmentation. TGEDiff fuses a textual attention mechanism with the diffusion model by introducing an additional auxiliary categorization task to guide the diffusion model with textual information to generate excellent pixel-level representations. To overcome the limited perceptual fields of the independent feature encoders within the diffusion model, we introduce a multi-kernel excitation module to extend the model's perceptual capability. Meanwhile, a guided feature enhancement module is introduced in Denoising-UNet to focus the model's attention on important regions and attenuate the influence of noise and irrelevant background in medical images. We critically evaluated TGEDiff on three datasets (Kvasir-SEG, Kvasir-Sessile, and GLaS), and TGEDiff achieved significant improvements over the state-of-the-art approach on all three datasets, with F1 scores and mIoU improving by 0.88% and 1.09%, 3.21% and 3.43%, and 1.29% and 2.34%, respectively. These data validate that TGEDiff has excellent performance in medical image segmentation. TGEDiff is expected to facilitate accurate diagnosis and treatment of medical diseases through more precise deconvolutional structural segmentation.
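The abstract reports gains in F1 score and mIoU. As a rough illustration of what these two segmentation metrics measure, the sketch below computes them for binary masks. It is not taken from the TGEDiff paper: the function names `f1_and_iou` and `mean_iou` are hypothetical, masks are assumed to be nested lists of 0/1, and mIoU is assumed here to be the mean of per-image IoU (the paper may average differently, for example over classes).

```python
def f1_and_iou(pred, target):
    """Return (F1, IoU) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # pixel predicted foreground, truly foreground
            elif p and not t:
                fp += 1          # predicted foreground, truly background
            elif t and not p:
                fn += 1          # predicted background, truly foreground
    f1_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denominator if f1_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return f1, iou


def mean_iou(mask_pairs):
    """Average IoU over (pred, target) pairs; assumes a per-image mean."""
    ious = [f1_and_iou(pred, target)[1] for pred, target in mask_pairs]
    return sum(ious) / len(ious)


# Toy 2x3 example: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
f1, iou = f1_and_iou(pred, target)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

For a binary mask, F1 is the same quantity as the Dice coefficient, and it is always at least as large as IoU on the same prediction, which is why the two are usually reported side by side.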
Pages: 18
Related papers
50 in total
  • [41] Text-Guided Sketch-to-Photo Image Synthesis
    Osahor, Uche
    Nasrabadi, Nasser M.
    IEEE ACCESS, 2022, 10 : 98278 - 98289
  • [42] TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
    Cao, Tianshi
    Kreis, Karsten
    Fidler, Sanja
    Sharp, Nicholas
    Yin, Kangxue
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023 : 4146 - 4158
  • [43] Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer
    Yang, Serin
    Hwang, Hyunmin
    Ye, Jong Chul
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 22816 - 22825
  • [44] TUMSyn: A Text-Guided Generalist Model for Customized Multimodal MR Image Synthesis
    Wang, Yulin
    Xiong, Honglin
    Xie, Yi
    Liu, Jiameng
    Wang, Qian
    Liu, Qian
    Shen, Dinggang
    FOUNDATION MODELS FOR GENERAL MEDICAL AI, MEDAGI 2024, 2025, 15184 : 124 - 133
  • [45] A Medical Image Segmentation Network with Boundary Enhancement
    Sun Junmei
    Ge Qingqing
    Li Xiumei
    Zhao Baoqi
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2022, 44 (05) : 1643 - 1652
  • [46] Learning semantic alignment from image for text-guided image inpainting
    Yucheng Xie
    Zehang Lin
    Zhenguo Yang
    Huan Deng
    Xingcai Wu
    Xudong Mao
    Qing Li
    Wenyin Liu
    The Visual Computer, 2022, 38 : 3149 - 3161
  • [47] Hardware Resilience Properties of Text-Guided Image Classifiers
    Wasim, Syed Talal
    Soboka, Kabila Haile
    Mahmoud, Abdulrahman
    Khan, Salman
    Brooks, David
    Wei, Gu-Yeon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [48] FocusGAN: Preserving Background in Text-Guided Image Editing
    Zhao, Liuqing
    Li, Linyan
    Hu, Fuyuan
    Xia, Zhenping
    Yao, Rui
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (16)
  • [49] DiCTI: Diffusion-based Clothing Designer via Text-guided Input
    Lampe, Ajda
    Stopar, Julija
    Jain, Deepak K.
    Omachi, Shinichiro
    Peer, Peter
    Struc, Vitomir
    2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024, 2024
  • [50] Medical image segmentation based on the diffusion equation and MRF model
    Li, Yibing
    Zhu, Yao
    Ye, Fang
    Journal of Information and Computational Science, 2014, 11 (05) : 1471 - 1478