FSTT: Flow-Guided Spatial Temporal Transformer for Deep Video Inpainting

Cited by: 5

Authors:

Liu, Ruixin [1]

Zhu, Yuesheng [1]

Affiliations:

[1] Peking Univ, Shenzhen Grad Sch, Commun & Informat Secur Lab, Shenzhen 518055, Peoples R China

Source:

ELECTRONICS | 2023 / Vol. 12 / No. 21

Funding:

National Natural Science Foundation of China;

Keywords:

deep video inpainting; video editing; spatial temporal transformer; optical flow; object removal;

DOI:

10.3390/electronics12214452

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Video inpainting aims to complete missing regions with content that is consistent both spatially and temporally. Effective use of the spatio-temporal information in videos is critical for this task. Recent video inpainting methods combine optical flow and transformers to capture spatio-temporal information. However, these methods fail to fully explore the potential of optical flow within the transformer. Furthermore, the designed transformer block cannot effectively integrate spatio-temporal information across frames. To address these problems, we propose a novel video inpainting model, named Flow-Guided Spatial Temporal Transformer (FSTT), which effectively establishes correspondences between missing regions and valid regions in both spatial and temporal dimensions under the guidance of completed optical flow. Specifically, a Flow-Guided Fusion Feed-Forward module is developed to enhance features with the assistance of optical flow, mitigating the inaccuracies caused by hole pixels when performing multi-head self-attention (MHSA). Additionally, a decomposed spatio-temporal MHSA module is proposed to effectively capture spatio-temporal dependencies in videos. To improve the efficiency of the model, a Global-Local Temporal MHSA module is further designed based on the window partition strategy. Extensive quantitative and qualitative experiments on the DAVIS and YouTube-VOS datasets demonstrate the superiority of our proposed method.
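The abstract mentions a decomposed spatio-temporal attention module without giving details. The sketch below is not the FSTT implementation; it only illustrates the general idea of decomposing attention over a video into a spatial pass (tokens within one frame) followed by a temporal pass (the same token position across frames). All names (`attention`, `decomposed_spatiotemporal_attention`), the single attention head, the identity query/key/value projections, and the `(T, N, C)` tensor layout are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(tokens):
    # Single-head self-attention over the second-to-last axis.
    # Assumption: identity Q/K/V projections, to keep the sketch short.
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens


def decomposed_spatiotemporal_attention(x):
    # x has shape (T, N, C): T frames, N tokens per frame, C channels.
    spatial = attention(x)                     # attend over N inside each frame
    per_location = np.swapaxes(spatial, 0, 1)  # (N, T, C)
    temporal = attention(per_location)         # attend over T at each location
    return np.swapaxes(temporal, 0, 1)         # back to (T, N, C)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 8))
out = decomposed_spatiotemporal_attention(x)
print(out.shape)
```

One reason such a decomposition is commonly used: joint attention over all T·N tokens builds a score matrix with (T·N)² entries, while the two passes above build T matrices of N² entries and N matrices of T² entries.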


Pages: 20
