SwinVI: 3D Swin Transformer Model with U-net for Video Inpainting

Cited by: 0
Authors
Zhang, Wei [1]
Cao, Yang [1]
Zhai, Junhai [1]
Affiliations
[1] Hebei Univ, Coll Math & Informat Sci, Hebei Key Lab Machine Learning & Computat Intelli, Baoding, Peoples R China
Keywords
Transformer; Video inpainting; Spatio-temporal
DOI
10.1109/IJCNN54540.2023.10192024
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of video inpainting is to fill in the missing regions of a given video as realistically as possible. It remains a challenging task, even with powerful deep learning methods. In recent years, Transformers have been introduced to video inpainting and have achieved remarkable improvements. However, they still suffer from two problems: blurry generated textures and high computational cost. To address these two problems, we propose SwinVI, a new 3D Swin Transformer model with U-net, to improve the quality of video inpainting efficiently. We modify the vanilla Swin Transformer by extending the standard self-attention mechanism to a 3D self-attention mechanism, which enables the modified model to process spatial and temporal information simultaneously. SwinVI consists of a U-net implemented with 3D Patch Merge and a CNN-equipped upsampling module, which provides an end-to-end learning framework. This structural design enables SwinVI to focus fully on background textures and moving objects and to learn robust, more representative token vectors, and thereby to significantly improve the quality of video inpainting efficiently. We experimentally compare SwinVI with multiple methods on two challenging benchmarks. Experimental results demonstrate that the proposed SwinVI outperforms the state-of-the-art methods in RMSE, SSIM, and PSNR.
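As a rough illustration of what extending windowed self-attention to three dimensions can mean, the following NumPy sketch partitions a video token grid into non-overlapping spatio-temporal windows and applies single-head self-attention inside each one. It is not the SwinVI implementation: the function names, the window size, the token dimension, and the random projection matrices are all assumptions made for this example, and shifted windows, relative position bias, and multiple heads are omitted.

```python
import numpy as np

def window_partition_3d(x, window):
    """Split a (T, H, W, C) token grid into non-overlapping 3D windows.

    Returns an array of shape (num_windows, wt * wh * ww, C).
    """
    T, H, W, C = x.shape
    wt, wh, ww = window
    assert T % wt == 0 and H % wh == 0 and W % ww == 0, "grid must divide evenly"
    x = x.reshape(T // wt, wt, H // wh, wh, W // ww, ww, C)
    # Move the three window-index axes to the front, the in-window axes after.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, wt * wh * ww, C)

def softmax(z, axis=-1):
    # Subtract the maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention_3d(x, window, wq, wk, wv):
    """Single-head self-attention restricted to each 3D window."""
    tokens = window_partition_3d(x, window)            # (N, L, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv    # each (N, L, C)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)                    # (N, L, L)
    return attn @ v, attn

# Hypothetical sizes: 4 frames of 8x8 tokens with 16 channels each.
rng = np.random.default_rng(0)
video = rng.standard_normal((4, 8, 8, 16))
wq, wk, wv = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
out, attn = window_attention_3d(video, (2, 4, 4), wq, wk, wv)
print(out.shape, attn.shape)  # (8, 32, 16) (8, 32, 32)
```

Because each window spans 2 frames as well as a 4x4 spatial patch, every token attends to neighbours in both space and time in a single step, while the attention matrix stays at 32x32 per window rather than 256x256 for the whole clip.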
Pages: 8
Related papers
50 in total
  • [31] A Multi Brain Tumor Region Segmentation Model Based on 3D U-Net
    Li, Zhenwei
    Wu, Xiaoqin
    Yang, Xiaoli
    APPLIED SCIENCES-BASEL, 2023, 13 (16):
  • [32] Comparison of tissue segmentation performance between 2D U-Net and 3D U-Net on brain MR Images
    Woo, Boyeong
    Lee, Myungeun
    2021 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2021,
  • [33] Video Watermarking Method Based on 3D U-Net Robust Against Re-Shooting
    Tsuboyama, Takaharu
    Takahashi, Ryota
    Iwata, Motoi
    Kise, Koichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2025, E108D (04) : 311 - 319
  • [34] 3D U2-Net: A 3D Universal U-Net for Multi-domain Medical Image Segmentation
    Huang, Chao
    Han, Hu
    Yao, Qingsong
    Zhu, Shankuan
    Zhou, S. Kevin
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT II, 2019, 11765 : 291 - 299
  • [35] A U-Net Architecture for Inpainting Lightstage Normal Maps
    Zuo, Hancheng
    Tiddeman, Bernard
    COMPUTERS, 2024, 13 (02)
  • [36] Hybrid U-Net and Swin-transformer network for limited-angle cardiac computed tomography
    Xu, Yongshun
    Han, Shuo
    Wang, Dayang
    Wang, Ge
    Maltz, Jonathan S.
    Yu, Hengyong
    PHYSICS IN MEDICINE AND BIOLOGY, 2024, 69 (10):
  • [37] ESTUGAN: Enhanced Swin Transformer with U-Net Discriminator for Remote Sensing Image Super-Resolution
    Yu, Chunhe
    Hong, Lingyue
    Pan, Tianpeng
    Li, Yufeng
    Li, Tingting
    ELECTRONICS, 2023, 12 (20)
  • [38] MSU-Net: Multiscale Statistical U-Net for Real-Time 3D Cardiac MRI Video Segmentation
    Wang, Tianchen
    Xiong, Jinjun
    Xu, Xiaowei
    Jiang, Meng
    Yuan, Haiyun
    Huang, Meiping
    Zhuang, Jian
    Shi, Yiyu
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT II, 2019, 11765 : 614 - 622
  • [39] Blood Vessel Segmentation Based on the 3D Residual U-Net
    Xin, Mulin
    Wen, Jing
    Wang, Yi
    Yu, Wei
    Fang, Bin
    Hu, Jun
    Xu, Yongmei
    Linghu, Chunhong
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (11)
  • [40] A Hierarchical 3D U-Net for Brain Tumor Substructure Segmentation
    Yang, J.
    Wang, R.
    Weng, Y.
    Chen, L.
    Zhou, Z.
    MEDICAL PHYSICS, 2020, 47 (06) : E568 - E568