FSTT: Flow-Guided Spatial Temporal Transformer for Deep Video Inpainting

Cited by: 5
Authors
Liu, Ruixin [1 ]
Zhu, Yuesheng [1 ]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Commun & Informat Secur Lab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep video inpainting; video editing; spatial temporal transformer; optical flow; object removal;
DOI
10.3390/electronics12214452
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Video inpainting aims to complete the missing regions with content that is consistent both spatially and temporally. How to effectively utilize the spatio-temporal information in videos is critical for video inpainting. Recent advances in video inpainting methods combine both optical flow and transformers to capture spatio-temporal information. However, these methods fail to fully explore the potential of optical flow within the transformer. Furthermore, the designed transformer block cannot effectively integrate spatio-temporal information across frames. To address the above problems, we propose a novel video inpainting model, named Flow-Guided Spatial Temporal Transformer (FSTT), which effectively establishes correspondences between missing regions and valid regions in both spatial and temporal dimensions under the guidance of completed optical flow. Specifically, a Flow-Guided Fusion Feed-Forward module is developed to enhance features with the assistance of optical flow, mitigating the inaccuracies caused by hole pixels when performing MHSA. Additionally, a decomposed spatio-temporal MHSA module is proposed to effectively capture spatio-temporal dependencies in videos. To improve the efficiency of the model, a Global-Local Temporal MHSA module is further designed based on the window partition strategy. Extensive quantitative and qualitative experiments on the DAVIS and YouTube-VOS datasets demonstrate the superiority of our proposed method.
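The decomposed spatio-temporal MHSA described in the abstract factors full 3D attention into a spatial pass (tokens within each frame attend to each other) followed by a temporal pass (each spatial location attends across frames). A minimal single-head sketch of that decomposition is below; it is an illustration, not the paper's implementation — it omits the learned Q/K/V projections, multi-head splitting, window partitioning, and the flow-guided fusion, and all function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention over the token axis (second-to-last)
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores) @ v

def decomposed_st_attention(x):
    """x: (T, N, C) — T frames, N spatial tokens per frame, C channels.
    Spatial attention within each frame, then temporal attention
    across frames at each spatial location."""
    # spatial pass: for each frame, its N tokens attend to each other
    x = attention(x, x, x)            # (T, N, C)
    # temporal pass: for each spatial location, attend across the T frames
    xt = x.transpose(1, 0, 2)         # (N, T, C)
    xt = attention(xt, xt, xt)        # (N, T, C)
    return xt.transpose(1, 0, 2)      # back to (T, N, C)

x = np.random.randn(4, 16, 8)         # 4 frames, 16 tokens/frame, 8 channels
y = decomposed_st_attention(x)
print(y.shape)                        # (4, 16, 8)
```

The benefit of the decomposition is cost: full attention over all T·N tokens scales as O((T·N)^2), while the two passes scale as O(T·N^2 + N·T^2), which is what makes window-based variants such as the Global-Local Temporal MHSA tractable for video.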
Pages: 20