Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation

Cited by: 119

Authors:

Chu, Mengyu [1]

Xie, You [1]

Mayer, Jonas [1]

Leal-Taixé, Laura [1]

Thuerey, Nils [1]

Affiliations:

[1] Tech Univ Munich, Dept Comp Sci, Munich, Germany

Source:

ACM TRANSACTIONS ON GRAPHICS | 2020 / Vol. 39 / Issue 04

Keywords:

Generative adversarial network; temporal cycle-consistency; self-supervision; video super-resolution; unpaired video translation; MOTION;

DOI:

10.1145/3386569.3392457

Chinese Library Classification number:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

Our work explores temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. Natural temporal changes are crucial for sequential generation tasks, e.g., video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as L2 over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve the long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without depressing detailed features. Additionally, we propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics. Code, data, models, and results are provided at https://github.com/thunil/TecoGAN.

Pages: 13

50 items in total

[1] Time Is MattEr: Temporal Self-supervision for Video Transformers
Yun, Sukmin
Kim, Jaehyung
Han, Dongyoon
Song, Hwanjun
Ha, Jung-Woo
Shin, Jinwoo
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
[2] Learning to Remove Rain in Video With Self-Supervision
Yang, Wenhan
Tan, Robby T.
Wang, Shiqi
Kot, Alex C.
Liu, Jiaying
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (03) : 1378 - 1396
[3] Video-based spatio-temporal scene graph generation with efficient self-supervision tasks
Lianggangxu Chen
Yiqing Cai
Changhong Lu
Changbo Wang
Gaoqi He
Multimedia Tools and Applications, 2023, 82 : 38947 - 38966
[4] Video-based spatio-temporal scene graph generation with efficient self-supervision tasks
Chen, Lianggangxu
Cai, Yiqing
Lu, Changhong
Wang, Changbo
He, Gaoqi
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (25) : 38947 - 38966
[5] Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision
Yuan, Liangzhe
Qian, Rui
Cui, Yin
Gong, Boqing
Schroff, Florian
Yang, Ming-Hsuan
Adam, Hartwig
Liu, Ting
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13957 - 13966
[6] Audio-Visual Contrastive Learning with Temporal Self-Supervision
Jenni, Simon
Black, Alexander
Collomosse, John
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 7996 - 8004
[7] Tailoring Self-Supervision for Supervised Learning
Moon, WonJun
Kim, Ji-Hwan
Heo, Jae-Pil
COMPUTER VISION, ECCV 2022, PT XXV, 2022, 13685 : 346 - 364
[8] Learning with self-supervision on EEG data
Gramfort, Alexandre
Banville, Hubert
Chehab, Omar
Hyvarinen, Aapo
Engemann, Denis
2021 9TH IEEE INTERNATIONAL WINTER CONFERENCE ON BRAIN-COMPUTER INTERFACE (BCI), 2021, : 28 - 29
[9] PITCH ESTIMATION VIA SELF-SUPERVISION
Gfeller, Beat
Frank, Christian
Roblek, Dominik
Sharifi, Matt
Tagliasacchi, Marco
Velimirovic, Mihajlo
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3527 - 3531
[10] Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling
Jwalapuram, Prathyusha
Joty, Shafiq
Lin, Xiang
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6044 - 6059
