Text-Guided Portrait Image Matting

Cited by: 0
Authors
Xu Y. [1]
Yao X. [1]
Liu B. [1]
Quan Y. [1]
Ji H. [2]
Affiliations
[1] School of Computer Science and Engineering, South China University of Technology, Guangzhou
[2] Department of Mathematics, National University of Singapore
Keywords
Annotations; Artificial intelligence; Artificial neural networks; Attention; Batch production systems; Cross-modal Learning; Data mining; Feature extraction; Image Matting; Text Guidance; Training
DOI
10.1109/TAI.2024.3363120
Abstract
Image matting separates the foreground of an image from its background by estimating an alpha matte that indicates the pixel-wise degree of transparency. To precisely extract target objects and address the ambiguity of solutions in image matting, many existing approaches employ a user-provided trimap or background image as additional input to guide the matting process. This paper introduces a novel matting paradigm termed text-guided image matting, which uses a textual description of the foreground object as guidance. In contrast to trimap-based or background-based methods, text-guided matting offers a user-friendly interface and provides semantic clues about the objects of interest. Moreover, it facilitates batch processing across multiple frames featuring the same objects of interest. The proposed text-guided matting approach is implemented through a deep neural network comprising three-stage cross-modal feature fusion and two-step alpha matte prediction. Experimental results on portrait matting demonstrate the competitive performance of our text-guided approach compared to existing trimap-based and background-based methods. © IEEE
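For readers unfamiliar with the term, the role of an alpha matte can be illustrated with the standard compositing model, in which each observed pixel is a blend I = alpha * F + (1 - alpha) * B of a foreground value F and a background value B. The sketch below is only an illustration of that general model on a toy grayscale image; it is not taken from the paper, and the function name `composite` and the sample values are assumptions made for this example.

```python
def composite(alpha, foreground, background):
    """Blend two grayscale images per pixel: I = alpha * F + (1 - alpha) * B.

    All three arguments are nested lists of floats with the same shape;
    alpha values are expected to lie in [0, 1].
    """
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, foreground, background)
    ]


# Toy 2x2 example (hypothetical values): white foreground over black background.
alpha = [[1.0, 0.5], [0.25, 0.0]]       # 1.0 = fully foreground, 0.0 = fully background
foreground = [[1.0, 1.0], [1.0, 1.0]]
background = [[0.0, 0.0], [0.0, 0.0]]

image = composite(alpha, foreground, background)
```

With a white foreground and a black background, the composited image equals the alpha matte itself, which is why a matte is often visualized as a grayscale image. Matting is the inverse problem: recovering alpha from the observed image, which is ambiguous without extra guidance such as a trimap, a background image, or, as proposed here, a text description.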
Pages: 1-13
Number of pages: 12