SwinVI: 3D Swin Transformer Model with U-net for Video Inpainting

Cited by: 0
Authors
Zhang, Wei [1]
Cao, Yang [1]
Zhai, Junhai [1]
Affiliations
[1] Hebei Univ, Coll Math & Informat Sci, Hebei Key Lab Machine Learning & Computat Intelli, Baoding, Peoples R China
Keywords
Transformer; Video inpainting; Spatio-temporal
DOI
10.1109/IJCNN54540.2023.10192024
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The goal of video inpainting is to fill in the missing regions of a given video as realistically as possible. It remains a challenging task, even with powerful deep learning methods. In recent years, the Transformer has been introduced to video inpainting, and remarkable improvement has been achieved. However, Transformer-based methods still suffer from two problems: they generate blurry textures and they require high computational cost. To address these two problems, we propose a new 3D Swin Transformer model with U-net (SwinVI) to improve the quality of video inpainting efficiently. We modify the vanilla Swin Transformer by extending the standard self-attention mechanism to a 3D self-attention mechanism, which enables the modified model to process spatial and temporal information simultaneously. SwinVI consists of a U-net implemented with 3D Patch Merge and a CNN-equipped upsampling module, which provides an end-to-end learning framework. This structural design enables SwinVI to focus fully on background textures and moving objects and to learn robust and more representative token vectors, thereby significantly improving the quality of video inpainting efficiently. We experimentally compare SwinVI with multiple methods on two challenging benchmarks. Experimental results demonstrate that the proposed SwinVI outperforms the state-of-the-art methods in RMSE, SSIM, and PSNR.
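The abstract's idea of extending windowed self-attention from 2D to 3D can be illustrated with a minimal sketch. It assumes non-overlapping spatio-temporal windows in the style of the Swin Transformer; the function names, the token grid, and the window size are hypothetical and are not taken from the paper, and shifted windows are omitted. The sketch groups token coordinates into 3D windows and counts query-key pairs to show why restricting attention to windows may reduce cost compared with global attention.

```python
from itertools import product


def partition_3d_windows(dims, window):
    """Group token coordinates (t, h, w) into non-overlapping 3D windows."""
    (T, H, W), (wt, wh, ww) = dims, window
    if T % wt or H % wh or W % ww:
        raise ValueError("each dimension must be divisible by its window size")
    windows = []
    # Iterate over the origin (first corner) of every window.
    for t0, h0, w0 in product(range(0, T, wt), range(0, H, wh), range(0, W, ww)):
        windows.append([
            (t, h, w)
            for t, h, w in product(
                range(t0, t0 + wt), range(h0, h0 + wh), range(w0, w0 + ww)
            )
        ])
    return windows


def attention_pairs(dims, window):
    """Count query-key pairs for global vs. windowed self-attention."""
    T, H, W = dims
    n_tokens = T * H * W
    windows = partition_3d_windows(dims, window)
    global_pairs = n_tokens ** 2                     # every token attends to all tokens
    windowed_pairs = sum(len(w) ** 2 for w in windows)  # attention stays inside a window
    return global_pairs, windowed_pairs


# Hypothetical sizes: 4 frames of 8x8 tokens, windows of 2x4x4 tokens.
dims, window = (4, 8, 8), (2, 4, 4)
windows = partition_3d_windows(dims, window)
global_pairs, windowed_pairs = attention_pairs(dims, window)
print(len(windows), len(windows[0]))   # 8 32
print(global_pairs, windowed_pairs)    # 65536 8192
```

Because each window spans several frames, tokens from different time steps fall into the same attention group, which is one way to process spatial and temporal information together.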
Pages: 8