Text-Guided Portrait Image Matting

Cited by: 0
Authors
Xu Y. [1]
Yao X. [1]
Liu B. [1]
Quan Y. [1]
Ji H. [2]
Affiliations
[1] School of Computer Science and Engineering, South China University of Technology, Guangzhou
[2] Department of Mathematics, National University of Singapore
Keywords
Annotations; Artificial intelligence; Artificial neural networks; Attention; Batch production systems; Cross-modal Learning; Data mining; Feature extraction; Image Matting; Text Guidance; Training
DOI
10.1109/TAI.2024.3363120
Abstract
Image matting is a technique used to separate the foreground of an image from the background, which estimates an alpha matte that indicates pixel-wise degree of transparency. To precisely extract target objects and address the ambiguity of solutions in image matting, many existing approaches employ a trimap or background image provided by the user as additional input to guide the matting process. This paper introduces a novel matting paradigm termed text-guided image matting, utilizing a textual description of the foreground object as a guiding element. In contrast to trimap or background-based methods, text-guided matting offers a user-friendly interface, providing semantic clues for the objects of interest. Moreover, it facilitates batch processing across multiple frames featuring the same objects of interest. The proposed text-guided matting approach is implemented through a deep neural network comprising three-stage cross-modal feature fusion and two-step alpha matte prediction. Experimental results on portrait matting demonstrate the competitive performance of our text-guided approach compared to existing trimap-based and background-based methods.
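To make the role of the alpha matte concrete, here is a minimal sketch of the standard compositing model commonly assumed in matting work, I = alpha * F + (1 - alpha) * B. The abstract does not state this equation, and the function name `composite` and the toy pixel values are hypothetical illustrations; this is not the paper's network or method.

```python
def composite(alpha, foreground, background):
    """Blend each pixel as I = alpha * F + (1 - alpha) * B.

    All three arguments are 2-D lists of floats with the same shape.
    alpha is the matte: 0.0 is pure background, 1.0 is pure foreground.
    (Assumed standard compositing model, not taken from the abstract.)
    """
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, foreground, background)
    ]


# Toy 1x3 grayscale example with made-up intensities.
alpha = [[0.0, 0.5, 1.0]]
foreground = [[200.0, 200.0, 200.0]]
background = [[100.0, 100.0, 100.0]]

image = composite(alpha, foreground, background)
print(image)
```

Matting is the inverse problem: given only the observed image, estimate alpha (and often F and B). Because several unknowns must be recovered per pixel, the solution is ambiguous, which is why extra guidance such as a trimap, a background image, or a text description is supplied.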
Pages: 1 / 13
Page count: 12