Flow-Guided Transformer for Video Inpainting

Cited by: 36
Authors
Zhang, Kaidong [1]
Fu, Jingjing [2]
Liu, Dong [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
Keywords
Video inpainting; Optical flow; Transformer; Object removal; Image
DOI
10.1007/978-3-031-19797-0_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a flow-guided transformer, which leverages the motion discrepancy exposed by optical flows to instruct the attention retrieval in the transformer for high-fidelity video inpainting. More specifically, we design a novel flow completion network to complete the corrupted flows by exploiting the relevant flow features in a local temporal window. With the completed flows, we propagate the content across video frames, and adopt the flow-guided transformer to synthesize the remaining corrupted regions. We decouple transformers along the temporal and spatial dimensions, so that we can easily integrate the locally relevant completed flows to instruct spatial attention only. Furthermore, we design a flow-reweight module to precisely control the impact of completed flows on each spatial transformer. For the sake of efficiency, we introduce a window partition strategy into both spatial and temporal transformers. In the spatial transformer in particular, we design a dual-perspective spatial MHSA, which integrates the global tokens into the window-based attention. Extensive experiments demonstrate the effectiveness of the proposed method qualitatively and quantitatively. Code is available at https://github.com/hitachinsk/FGT.
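The propagation step mentioned in the abstract (using completed flows to carry content across frames) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `propagate_with_flow`, the list-based frame layout, and the nearest-neighbour sampling are all assumptions made for illustration. It shows only the general idea that a corrupted pixel is filled when its flow vector lands on a valid pixel in a neighbouring frame, and that pixels left unfilled are the ones a synthesis stage would still have to handle.

```python
def propagate_with_flow(target, target_mask, source, source_mask, flow):
    """Fill masked pixels of `target` by following `flow` into `source`.

    Hypothetical helper for illustration only.
    target, source: 2-D lists of pixel values (H x W).
    target_mask, source_mask: 2-D lists, 1 = corrupted, 0 = valid.
    flow: 2-D list of (dx, dy) displacements from target to source.
    Returns (filled_frame, remaining_mask).
    """
    height, width = len(target), len(target[0])
    filled = [row[:] for row in target]
    remaining = [row[:] for row in target_mask]
    for y in range(height):
        for x in range(width):
            if not target_mask[y][x]:
                continue  # pixel is already valid
            dx, dy = flow[y][x]
            sx, sy = round(x + dx), round(y + dy)  # nearest-neighbour sampling
            inside = 0 <= sx < width and 0 <= sy < height
            if inside and not source_mask[sy][sx]:
                filled[y][x] = source[sy][sx]
                remaining[y][x] = 0
    return filled, remaining


# Toy 1 x 4 frames: the true row is [1, 2, 3, 4]; the source frame shows the
# same scene shifted one pixel to the left, with one corrupted pixel of its own.
target = [[1, 0, 0, 4]]
target_mask = [[0, 1, 1, 0]]
source = [[2, 0, 4, 5]]
source_mask = [[0, 1, 0, 0]]
flow = [[(-1, 0)] * 4]

filled, remaining = propagate_with_flow(target, target_mask, source, source_mask, flow)
print(filled)     # [[1, 2, 0, 4]]
print(remaining)  # [[0, 0, 1, 0]]
```

In this toy case one corrupted pixel is recovered from the neighbouring frame, while the other points at a corrupted source pixel and stays masked; the abstract assigns such remaining regions to the flow-guided transformer.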
Pages: 74-90
Page count: 17