Text-Guided Synthesis of Eulerian Cinemagraphs

Authors
Mahapatra, Aniruddha [1 ]
Siarohin, Aliaksandr [2 ]
Lee, Hsin-Ying [2 ]
Tulyakov, Sergey [2 ]
Zhu, Jun-Yan [1 ]
Affiliations
[1] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[2] Snap Inc, 2850 Ocean Pk Blvd, Santa Monica, CA 90405 USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2023 / Vol. 42 / No. 06
Keywords
Cinemagraphs; Diffusion Models; Generative Adversarial Networks; IMAGE;
DOI
10.1145/3618326
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images. We focus on cinemagraphs of fluid elements, such as flowing rivers and drifting clouds, which exhibit continuous motion and repetitive textures. Existing single-image animation methods fall short on artistic inputs, and recent text-based video methods frequently introduce temporal inconsistencies, struggling to keep certain regions static. To address these challenges, we propose the idea of synthesizing image twins from a single text prompt - a pair of an artistic image and its pixel-aligned corresponding natural-looking twin. While the artistic image depicts the style and appearance detailed in our text prompt, the realistic counterpart greatly simplifies layout and motion analysis. Leveraging existing natural image and video datasets, we can accurately segment the realistic image and predict plausible motion given the semantic information. The predicted motion can then be transferred to the artistic image to create the final cinemagraph. Our method outperforms existing approaches in creating cinemagraphs for natural landscapes as well as artistic and other-worldly scenes, as validated by automated metrics and user studies. Finally, we demonstrate two extensions: animating existing paintings and controlling motion directions using text.
Pages: 13