SwinTExCo: Exemplar-based video colorization using Swin Transformer

Authors
Tran, Duong Thanh [1]
Nguyen, Nguyen Doan Hieu [1]
Pham, Trung Thanh [1]
Tran, Phuong-Nam [2]
Vu, Thuy-Duong Thi [1]
Nguyen, Cuong Tuan [3]
Dang-Ngoc, Hanh [4]
Dang, Duc Ngoc Minh [1]
Affiliations
[1] FPT Univ, Long Thanh My Ward, Dept Comp Fundamental, AiTA Lab, D1 St,Saigon Hi Tech Pk, Ho Chi Minh City 71216, Vietnam
[2] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin 446701, South Korea
[3] Vietnamese German Univ, Thoi Hoa Ward, Fac Engn, Ring Rd 4,Quarter 4, Ben Cat 75000, Binh Duong, Vietnam
[4] Ho Chi Minh City Univ Technol HCMUT, Fac Elect & Elect Engn, VNU HCM, 268 Ly Thuong Kiet,Dist 10, Ho Chi Minh City 72506, Vietnam
Keywords
Computer vision; Image colorization; Video colorization; Exemplar-based; Vision transformer; Swin transformer; IMAGE;
DOI
10.1016/j.eswa.2024.125437
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video colorization is a compelling domain within the field of Computer Vision. The traditional approach in this field relies on Convolutional Neural Networks (CNNs) to extract features from each video frame and employs a recurrent network to learn information between video frames. While they have demonstrated considerable success in colorization, most traditional CNNs suffer from a limited receptive field, capturing only local information within a fixed-size window. Consequently, they struggle to directly capture long-range dependencies or pixel relationships that span large areas of an image or video frame. To address this limitation, recent work in the field has leveraged Vision Transformers (ViTs) and their variants to enhance performance. This article introduces Swin Transformer Exemplar-based Video Colorization (SwinTExCo), an end-to-end model for video colorization that uses the Swin Transformer architecture as its backbone. The experimental results demonstrate that the proposed method outperforms many other state-of-the-art methods in both quantitative and qualitative metrics. The achievements of this research have significant implications for the restoration of documentary and historical video, contributing to the broader goal of preserving cultural heritage and facilitating a deeper understanding of historical events through enhanced audiovisual materials.
Pages: 14