Spatio-Temporal Inference Transformer Network for Video Inpainting

Cited by: 0
Authors
Tudavekar, Gajanan [1, 2]
Saraf, Santosh S. [2 ]
Patil, Sanjay R. [3 ]
Affiliations
[1] Angadi Inst Technol & Management, Dept Elect & Commun Engn, Belagavi 590009, Karnataka, India
[2] KLS Gogte Inst Technol, Dept Elect & Commun Engn, Belagavi 590008, Karnataka, India
[3] Dnyanshree Inst Engn & Technol, Dept Elect & Telecommun Engn, Sajjangad Rd, Satara 415013, Maharashtra, India
Keywords
Image inpainting; video inpainting; transformer network; deep learning; image quality assessment; object removal
DOI
10.1142/S0219467823500079
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Video inpainting aims to fill the missing regions in video frames in a visually pleasing way. It is an exciting task because of the variety of motions across different frames. Existing methods usually use attention models that inpaint a video by searching other frames for the damaged content. However, these methods suffer from irregular attention weights along the spatio-temporal dimensions, which gives rise to artifacts in the inpainted video. To overcome this problem, the Spatio-Temporal Inference Transformer Network (STITN) is proposed. The STITN aligns the frames to be inpainted and inpaints all of them concurrently, and a spatio-temporal adversarial loss function further improves the STITN. The method performs considerably better than existing deep learning approaches in both quantitative and qualitative evaluation.
Pages: 16