Structure Preserving Video Prediction

Cited by: 16

Authors:

Xu, Jingwei [1]

Ni, Bingbing [1]

Li, Zefan [1]

Cheng, Shuo [1]

Yang, Xiaokang [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Shanghai Key Lab Digital Media Proc & Transmiss, Shanghai Inst Adv Commun & Data Sci, Shanghai 200240, Peoples R China

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Funding:

US National Science Foundation;

Keywords:

DOI:

10.1109/CVPR.2018.00158

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Despite the recent emergence of adversarial-based methods for video prediction, existing algorithms often produce unsatisfactory results in image regions with rich structural information (i.e., object boundaries) and detailed motion (i.e., articulated body movement). To this end, we present a structure-preserving video prediction framework that explicitly addresses these issues and enhances video prediction quality. On the one hand, our framework contains a two-stream generation architecture that handles high-frequency video content (i.e., detailed object or articulated motion structure) and low-frequency video content (i.e., location or moving directions) in two separate streams. On the other hand, we propose an RNN structure for video prediction, which employs temporal-adaptive convolutional kernels to capture time-varying motion patterns as well as tiny objects within a scene. Extensive experiments on diverse scenes, ranging from human motion to semantic layout prediction, demonstrate the effectiveness of the proposed video prediction approach.


Pages: 1460-1469

Page count: 10

50 items in total

[1] Cascaded UNet for progressive noise residual prediction for structure-preserving video denoising
Pimpale, Abhijeet
Bhurchandi, Kishor
COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 248
[2] Group Structure Preserving Pedestrian Tracking in a Multicamera Video Network
Jin, Zhixing
An, Le
Bhanu, Bir
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (10) : 2165 - 2176
[3] Structure-Preserving Motion Estimation for Learned Video Compression
Gao, Han
Cui, Jinzhong
Ye, Mao
Li, Shuai
Zhao, Yu
Zhu, Xiatian
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3055 - 3063
[4] IPRNN: AN INFORMATION-PRESERVING MODEL FOR VIDEO PREDICTION USING SPATIOTEMPORAL GRUS
Chang, Zheng
Zhang, Xinfeng
Wang, Shanshe
Ma, Siwei
Gao, Wen
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2703 - 2707
[5] POSE GUIDED GLOBAL AND LOCAL GAN FOR APPEARANCE PRESERVING HUMAN VIDEO PREDICTION
Tang, Jilin
Hu, Haoji
Zhou, Qiang
Shan, Hangguan
Tian, Chuan
Quek, Tony Q. S.
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 614 - 618
[6] A NEW PREDICTION STRUCTURE FOR MULTIVIEW VIDEO CODING
Pourazad, M. T.
Nasiopoulos, P.
Ward, R. K.
2009 16TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, VOLS 1 AND 2, 2009, : 267 - 271
[7] Scalable Prediction Structure for Multiview Video Coding
Huo, Junyan
Chang, Yilin
Li, Ming
Ma, Yanzhuo
ISCAS: 2009 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-5, 2009, : 2593 - 2596
[8] Structure-Adaptive Neighborhood Preserving Hashing for Scalable Video Search
Li, Shuyan
Li, Xiu
Lu, Jiwen
Zhou, Jie
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 2441 - 2454
[9] Investigating the Structure Preserving Encryption of High Efficiency Video Coding (HEVC)
Shahid, Zafar
Puech, William
REAL-TIME IMAGE AND VIDEO PROCESSING 2013, 2013, 8656
[10] Neighbourhood Structure Preserving Cross-Modal Embedding for Video Hyperlinking
Hao, Yanbin
Ngo, Chong-Wah
Huet, Benoit
IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (01) : 188 - 200
