Text-Guided Synthesis of Eulerian Cinemagraphs

Authors
Mahapatra, Aniruddha [1]
Siarohin, Aliaksandr [2]
Lee, Hsin-Ying [2]
Tulyakov, Sergey [2]
Zhu, Jun-Yan [1]
Affiliations
[1] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[2] Snap Inc, 2850 Ocean Pk Blvd, Santa Monica, CA 90405 USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2023 / Vol. 42 / Issue 06
Keywords
Cinemagraphs; Diffusion Models; Generative Adversarial Networks; IMAGE;
DOI
10.1145/3618326
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images. We focus on cinemagraphs of fluid elements, such as flowing rivers and drifting clouds, which exhibit continuous motion and repetitive textures. Existing single-image animation methods fall short on artistic inputs, and recent text-based video methods frequently introduce temporal inconsistencies, struggling to keep certain regions static. To address these challenges, we propose the idea of synthesizing image twins from a single text prompt - a pair consisting of an artistic image and its pixel-aligned, natural-looking twin. While the artistic image depicts the style and appearance detailed in the text prompt, the realistic counterpart greatly simplifies layout and motion analysis. Leveraging existing natural image and video datasets, we can accurately segment the realistic image and predict plausible motion given the semantic information. The predicted motion can then be transferred to the artistic image to create the final cinemagraph. Our method outperforms existing approaches in creating cinemagraphs for natural landscapes as well as artistic and other-worldly scenes, as validated by automated metrics and user studies. Finally, we demonstrate two extensions: animating existing paintings and controlling motion directions using text.
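The abstract does not say how the predicted motion is represented or applied. As a rough illustration only, the sketch below assumes the motion is a single static (Eulerian) flow field that is Euler-integrated into per-frame displacements, with a binary mask keeping non-fluid regions still. The function name `integrate_displacement`, the nearest-neighbor sampling, and the border clamping are all hypothetical choices, not the authors' implementation.

```python
def integrate_displacement(flow, mask, num_frames):
    """Euler-integrate a static flow field into per-frame displacements.

    flow[y][x] is an assumed per-frame velocity (dx, dy) in pixels.
    mask[y][x] is True where the pixel belongs to a moving (fluid) region.
    Returns a list of num_frames grids; frame 0 has zero displacement.
    """
    h, w = len(flow), len(flow[0])
    disp = [[(0.0, 0.0) for _ in range(w)] for _ in range(h)]
    frames = [disp]
    for _ in range(1, num_frames):
        nxt = []
        for y in range(h):
            row = []
            for x in range(w):
                if not mask[y][x]:
                    # Static region: the pixel never moves.
                    row.append((0.0, 0.0))
                    continue
                dx, dy = disp[y][x]
                # Sample the flow at the advected position
                # (nearest neighbor, clamped to the image border).
                sx = min(max(int(round(x + dx)), 0), w - 1)
                sy = min(max(int(round(y + dy)), 0), h - 1)
                fx, fy = flow[sy][sx]
                row.append((dx + fx, dy + fy))
            nxt.append(row)
        disp = nxt
        frames.append(disp)
    return frames


# Toy example: uniform rightward flow, top row moving, bottom row static.
flow = [[(1.0, 0.0)] * 4 for _ in range(2)]
mask = [[True] * 4, [False] * 4]
frames = integrate_displacement(flow, mask, num_frames=3)
```

Under this assumption, the same displacement fields could be estimated on the realistic twin and then used to warp the pixel-aligned artistic image, since both share one layout.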
Pages: 13