Text-Guided Synthesis of Eulerian Cinemagraphs

Cited: 0
Authors
Mahapatra, Aniruddha [1 ]
Siarohin, Aliaksandr [2 ]
Lee, Hsin-Ying [2 ]
Tulyakov, Sergey [2 ]
Zhu, Jun-Yan [1 ]
Affiliations
[1] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[2] Snap Inc, 2850 Ocean Pk Blvd, Santa Monica, CA 90405 USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2023, Vol. 42, No. 6
Keywords
Cinemagraphs; Diffusion Models; Generative Adversarial Networks; Image
DOI
10.1145/3618326
CLC Classification Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images. We focus on cinemagraphs of fluid elements, such as flowing rivers and drifting clouds, which exhibit continuous motion and repetitive textures. Existing single-image animation methods fall short on artistic inputs, and recent text-based video methods frequently introduce temporal inconsistencies, struggling to keep certain regions static. To address these challenges, we propose synthesizing image twins from a single text prompt: an artistic image paired with its pixel-aligned, natural-looking counterpart. While the artistic image depicts the style and appearance detailed in our text prompt, the realistic counterpart greatly simplifies layout and motion analysis. Leveraging existing natural image and video datasets, we can accurately segment the realistic image and predict plausible motion given the semantic information. The predicted motion can then be transferred to the artistic image to create the final cinemagraph. Our method outperforms existing approaches in creating cinemagraphs for natural landscapes as well as artistic and otherworldly scenes, as validated by automated metrics and user studies. Finally, we demonstrate two extensions: animating existing paintings and controlling motion directions using text.
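The abstract's final step - transferring a predicted motion field to the artistic image to produce looping frames - can be illustrated with a minimal NumPy sketch. This is a simplified stand-in, not the paper's method: it uses nearest-neighbor backward warping of a constant per-pixel flow, whereas the actual system uses learned flow prediction and more sophisticated (symmetric-splatting-style) frame synthesis. All names here are hypothetical.

```python
import numpy as np

def animate_with_flow(artistic, flow, mask, n_frames):
    """Euler-integrate a constant 2D flow field to displace moving pixels.

    artistic: (H, W) grayscale image to animate (the styled twin).
    flow:     (H, W, 2) per-pixel (dx, dy) displacement in pixels per frame,
              conceptually predicted on the realistic twin and transferred here.
    mask:     (H, W) bool, True where the scene should move (e.g. water).
    Returns a list of n_frames warped frames; static regions stay fixed.
    """
    H, W = artistic.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    frames = []
    for t in range(n_frames):
        # Backward warp: each output pixel samples the source position
        # displaced by t * flow (nearest neighbor, clamped at borders).
        src_y = np.clip(np.rint(ys - t * flow[..., 1]), 0, H - 1).astype(int)
        src_x = np.clip(np.rint(xs - t * flow[..., 0]), 0, W - 1).astype(int)
        frame = artistic[src_y, src_x]
        frame[~mask] = artistic[~mask]  # keep the static regions untouched
        frames.append(frame)
    return frames

# Hypothetical usage: move the bottom half (the "river") one pixel right per frame.
img = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
mask = np.zeros((4, 4), dtype=bool)
mask[2:] = True
frames = animate_with_flow(img, flow, mask, n_frames=3)
```

The mask is what makes this a cinemagraph rather than a video: pixels outside the masked fluid region are copied verbatim from the still image in every frame, so only the fluid element moves.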
Pages: 13
Related Papers
50 records in total
  • [31] Fan, Wan-Cyuan; Yang, Cheng-Fu; Yang, Chiao-An; Wang, Yu-Chiang Frank. Target-Free Text-Guided Image Manipulation. Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol. 37, No. 1, 2023: 588-596.
  • [32] Haruyama, Tomoki; Togo, Ren; Maeda, Keisuke; Ogawa, Takahiro; Haseyama, Miki. Segmentation-Aware Text-Guided Image Manipulation. 2021 IEEE International Conference on Image Processing (ICIP), 2021: 2433-2437.
  • [33] Choi, Won-Gook; Kim, So-Jeong; Kim, Taeho; Chang, Joon-Hyuk. Prior-free Guided TTS: An Improved and Efficient Diffusion-based Text-Guided Speech Synthesis. Interspeech 2023, 2023: 4289-4293.
  • [34] Tang, Haojun; Zhao, Wenda; Hu, Guang; Xiao, Yi; Li, Yunlong; Wang, Haipeng. Text-Guided Diverse Image Synthesis for Long-Tailed Remote Sensing Object Classification. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62.
  • [35] Xu, Hao; Wu, Yiqian; Tang, Xiangjun; Zhang, Jing; Zhang, Yang; Zhang, Zhebin; Li, Chen; Jin, Xiaogang. FusionDeformer: text-guided mesh deformation using diffusion models. The Visual Computer, 2024, 40(7): 4701-4712.
  • [36] Gao, Liying; Niu, Kai; Ma, Zehong; Jiao, Bingliang; Tan, Tonghao; Wang, Peng. Text-Guided Visual Feature Refinement for Text-Based Person Search. Proceedings of the 2021 International Conference on Multimedia Retrieval (ICMR '21), 2021: 118-126.
  • [37] Lin, Qing; Yan, Bo; Li, Jichun; Tan, Weimin. MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting. Proceedings of the 28th ACM International Conference on Multimedia (MM '20), 2020: 1094-1102.
  • [38] Wu, Xingcai; Xie, Yucheng; Zeng, Jiaqi; Yang, Zhenguo; Yu, Yi; Li, Qing; Liu, Wenyin. Adversarial Learning with Mask Reconstruction for Text-Guided Image Inpainting. Proceedings of the 29th ACM International Conference on Multimedia (MM 2021), 2021: 3464-3472.
  • [39] Ma, Chenxi; Yan, Bo; Lin, Qing; Tan, Weimin; Chen, Siming. Rethinking Super-Resolution as Text-Guided Details Generation. Proceedings of the 30th ACM International Conference on Multimedia (MM 2022), 2022: 3461-3469.
  • [40] Lee, Sangmin; Kim, Hyung-Il; Ro, Yong Man. Text-guided distillation learning to diversify video embeddings for text-video retrieval. Pattern Recognition, 2024, 156.