Structure Preserving Video Prediction

Cited by: 16
Authors
Xu, Jingwei [1 ]
Ni, Bingbing [1 ]
Li, Zefan [1 ]
Cheng, Shuo [1 ]
Yang, Xiaokang [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai Key Lab Digital Media Proc & Transmiss, Shanghai Inst Adv Commun & Data Sci, Shanghai 200240, Peoples R China
Funding
National Science Foundation (United States);
Keywords
DOI
10.1109/CVPR.2018.00158
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite the recent emergence of adversarial-based methods for video prediction, existing algorithms often produce unsatisfactory results in image regions with rich structural information (e.g., object boundaries) and detailed motion (e.g., articulated body movement). To this end, we present a structure-preserving video prediction framework that explicitly addresses these issues and enhances video prediction quality. On the one hand, our framework contains a two-stream generation architecture that handles high-frequency video content (i.e., detailed object or articulated motion structure) and low-frequency video content (i.e., location or moving directions) in two separate streams. On the other hand, we propose an RNN structure for video prediction that employs temporal-adaptive convolutional kernels to capture time-varying motion patterns as well as tiny objects within a scene. Extensive experiments on diverse scenes, ranging from human motion to semantic layout prediction, demonstrate the effectiveness of the proposed video prediction approach.
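The split between high-frequency and low-frequency video content mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a simple box blur as the low-pass filter and treats the residual as the high-frequency part. The function names `box_blur` and `split_frequencies` are hypothetical and chosen only for this illustration.

```python
def box_blur(frame, radius=1):
    """Low-pass filter: mean over a (2r+1) x (2r+1) window, edges clamped."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels reuse the nearest valid pixel.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += frame[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(frame, radius=1):
    """Return (low, high) such that low + high reconstructs the frame."""
    low = box_blur(frame, radius)
    high = [
        [frame[y][x] - low[y][x] for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]
    return low, high


# Toy 4x4 grayscale frame with a vertical edge between columns 1 and 2.
frame = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = split_frequencies(frame)
```

In this toy frame the high-frequency residual is nonzero only in the two columns next to the edge, while the low-frequency part carries the smooth overall intensity. A two-stream model in the spirit of the abstract would process such components separately, though the paper's actual decomposition may differ.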
Pages: 1460-1469
Page count: 10
Related papers
50 in total
  • [1] Cascaded UNet for progressive noise residual prediction for structure-preserving video denoising
    Pimpale, Abhijeet
    Bhurchandi, Kishor
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 248
  • [2] Group Structure Preserving Pedestrian Tracking in a Multicamera Video Network
    Jin, Zhixing
    An, Le
    Bhanu, Bir
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (10) : 2165 - 2176
  • [3] Structure-Preserving Motion Estimation for Learned Video Compression
    Gao, Han
    Cui, Jinzhong
    Ye, Mao
    Li, Shuai
    Zhao, Yu
    Zhu, Xiatian
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3055 - 3063
  • [4] IPRNN: AN INFORMATION-PRESERVING MODEL FOR VIDEO PREDICTION USING SPATIOTEMPORAL GRUS
    Chang, Zheng
    Zhang, Xinfeng
    Wang, Shanshe
    Ma, Siwei
    Gao, Wen
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2703 - 2707
  • [5] POSE GUIDED GLOBAL AND LOCAL GAN FOR APPEARANCE PRESERVING HUMAN VIDEO PREDICTION
    Tang, Jilin
    Hu, Haoji
    Zhou, Qiang
    Shan, Hangguan
    Tian, Chuan
    Quek, Tony Q. S.
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 614 - 618
  • [6] A NEW PREDICTION STRUCTURE FOR MULTIVIEW VIDEO CODING
    Pourazad, M. T.
    Nasiopoulos, P.
    Ward, R. K.
    2009 16TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, VOLS 1 AND 2, 2009, : 267 - 271
  • [7] Scalable Prediction Structure for Multiview Video Coding
    Huo, Junyan
    Chang, Yilin
    Li, Ming
    Ma, Yanzhuo
    ISCAS: 2009 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-5, 2009, : 2593 - 2596
  • [8] Structure-Adaptive Neighborhood Preserving Hashing for Scalable Video Search
    Li, Shuyan
    Li, Xiu
    Lu, Jiwen
    Zhou, Jie
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 2441 - 2454
  • [9] Investigating the Structure Preserving Encryption of High Efficiency Video Coding (HEVC)
    Shahid, Zafar
    Puech, William
    REAL-TIME IMAGE AND VIDEO PROCESSING 2013, 2013, 8656
  • [10] Neighbourhood Structure Preserving Cross-Modal Embedding for Video Hyperlinking
    Hao, Yanbin
    Ngo, Chong-Wah
    Huet, Benoit
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (01) : 188 - 200