Text-Guided Synthesis of Eulerian Cinemagraphs

Cited by: 0

Authors:

Mahapatra, Aniruddha [1]

Siarohin, Aliaksandr [2]

Lee, Hsin-Ying [2]

Tulyakov, Sergey [2]

Zhu, Jun-Yan [1]

Affiliations:

[1] Carnegie Mellon Univ, 5000 Forbes Ave, Pittsburgh, PA 15213 USA

[2] Snap Inc, 2850 Ocean Pk Blvd, Santa Monica, CA 90405 USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2023 / Vol. 42 / No. 06

Keywords:

Cinemagraphs; Diffusion Models; Generative Adversarial Networks; IMAGE;

DOI:

10.1145/3618326

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions. This task is especially challenging when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of such images. We focus on cinemagraphs of fluid elements, such as flowing rivers and drifting clouds, which exhibit continuous motion and repetitive textures. Existing single-image animation methods fall short on artistic inputs, and recent text-based video methods frequently introduce temporal inconsistencies and struggle to keep certain regions static. To address these challenges, we propose synthesizing image twins from a single text prompt: an artistic image paired with its pixel-aligned, natural-looking counterpart. While the artistic image depicts the style and appearance detailed in the text prompt, the realistic counterpart greatly simplifies layout and motion analysis. Leveraging existing natural image and video datasets, we can accurately segment the realistic image and predict plausible motion given the semantic information. The predicted motion can then be transferred to the artistic image to create the final cinemagraph. Our method outperforms existing approaches in creating cinemagraphs for natural landscapes as well as artistic and other-worldly scenes, as validated by automated metrics and user studies. Finally, we demonstrate two extensions: animating existing paintings and controlling motion directions using text.
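
The motion-transfer step described in the abstract can be illustrated with a minimal sketch. It assumes that a static per-pixel flow field and a binary motion mask have already been predicted on the realistic twin; because the twins are pixel-aligned, the same field can be applied directly to the artistic image. The function name `animate`, its parameters, and the nearest-neighbor backward warp are hypothetical simplifications chosen for illustration, not the authors' implementation.

```python
import numpy as np

def animate(image, flow, mask, n_frames):
    """Animate `image` with a static (Eulerian) flow field. Illustrative only.

    image: (H, W) or (H, W, C) array, e.g. the artistic twin.
    flow:  (H, W, 2) per-pixel velocity (dx, dy) in pixels per frame,
           assumed to be predicted on the pixel-aligned realistic twin.
    mask:  (H, W) boolean array, True where the scene should move.
    Returns a list of n_frames arrays with the same shape as `image`.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Pixels outside the mask get zero velocity, so they stay static.
    vel = flow * mask[..., None]
    disp = np.zeros((h, w, 2), dtype=float)
    frames = []
    for _ in range(n_frames):
        # Backward warp: each output pixel pulls from its traced-back source.
        src_x = np.clip(np.rint(xs - disp[..., 0]), 0, w - 1).astype(int)
        src_y = np.clip(np.rint(ys - disp[..., 1]), 0, h - 1).astype(int)
        frames.append(image[src_y, src_x])
        # Euler step: accumulate the velocity sampled at the source position.
        disp = disp + vel[src_y, src_x]
    return frames
```

Calling `animate(artistic_image, flow, mask, 60)` with a rightward flow inside the mask would slide the masked texture to the right frame by frame while leaving every pixel outside the mask unchanged. A practical system would also need sub-pixel interpolation, hole handling, and seamless looping, all of which this sketch omits.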

Pages: 13
