STD-Net: Spatio-Temporal Decomposition Network for Video Demoireing With Sparse Transformers

Cited by: 0
Authors
Niu, Yuzhen [1 ,2 ]
Xu, Rui [1 ,2 ]
Lin, Zhihua [3 ]
Liu, Wenxi [1 ,2 ]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informa, Fuzhou 350108, Peoples R China
[2] Minist Educ, Engn Res Ctr Bigdata Intelligence, Fuzhou 350108, Peoples R China
[3] Res Inst Alipay Informat Technol Co Ltd, Hangzhou 310000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image restoration; video demoireing; video restoration; spatio-temporal network; sparse transformer; QUALITY ASSESSMENT; IMAGE;
DOI
10.1109/TCSVT.2024.3386604
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
The problem of video demoireing is a new challenge in video restoration. Unlike image demoireing, which involves removing static and uniform patterns, video demoireing requires tackling dynamic and varied moire patterns while preserving video details, colors, and temporal consistency. It is particularly challenging to model moire patterns for videos with camera or object motion, where separating moire from the original video content across frames is extremely difficult. Nonetheless, we observe that the spatial distribution of moire patterns is often sparse on each frame, and their long-range temporal correlation is not significant. To fully leverage this observation, a sparsity-constrained spatial self-attention scheme is proposed to concentrate on removing sparse moire efficiently from each frame without being distracted by dynamic video content. The frame-wise spatial features are then correlated and aggregated via a local temporal cross-frame-attention module to produce temporally consistent, high-quality, moire-free videos. These decoupled spatial and temporal transformers constitute the Spatio-Temporal Decomposition Network, dubbed STD-Net. For evaluation, we present a large-scale video demoireing benchmark featuring various real-life scenes, camera motions, and object motions. We demonstrate that our proposed model effectively and efficiently achieves superior performance on video demoireing and single-image demoireing tasks. The proposed dataset is released at https://github.com/FZU-N/LVDM.
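The abstract outlines a decoupled design: sparsity-constrained self-attention applied within each frame, followed by a local cross-frame attention over a short temporal window. Since no code accompanies this record, the PyTorch sketch below is only a hedged illustration of that decomposition under stated assumptions (a top-k sparsification heuristic, a 3-frame window, and generic module names such as STDBlockSketch); it is not the authors' STD-Net implementation.

# Minimal sketch of the decoupled spatio-temporal idea described above.
# Module names, the top-k sparsification, and tensor shapes are assumptions,
# not the authors' STD-Net code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseSpatialSelfAttention(nn.Module):
    """Per-frame self-attention that keeps only the top-k scores per query,
    a simple stand-in for a sparsity-constrained spatial attention."""

    def __init__(self, dim, heads=4, topk=16):
        super().__init__()
        self.heads, self.topk = heads, topk
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                                  # x: (B*T, N, C) tokens of one frame
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)               # each: (B, heads, N, C/heads)
        attn = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
        k_keep = min(self.topk, N)
        thresh = attn.topk(k_keep, dim=-1).values[..., -1:]  # k-th largest score per query
        attn = attn.masked_fill(attn < thresh, float("-inf")).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


class LocalTemporalCrossAttention(nn.Module):
    """Each frame's tokens attend to tokens from a small window of
    neighbouring frames, aggregating features across time."""

    def __init__(self, dim, window=3, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, num_heads=heads, batch_first=True)

    def forward(self, x):                                  # x: (B, T, N, C)
        B, T, N, C = x.shape
        pad = self.window // 2
        xp = F.pad(x, (0, 0, 0, 0, pad, pad))              # zero-pad along the temporal axis
        out = []
        for t in range(T):
            q = x[:, t]                                    # queries: frame t, (B, N, C)
            kv = xp[:, t:t + self.window].reshape(B, self.window * N, C)
            o, _ = self.attn(q, kv, kv)                    # cross-frame aggregation
            out.append(o)
        return torch.stack(out, dim=1)                     # (B, T, N, C)


class STDBlockSketch(nn.Module):
    """Toy decomposition block: sparse spatial attention per frame,
    followed by local temporal cross-frame aggregation."""

    def __init__(self, dim):
        super().__init__()
        self.spatial = SparseSpatialSelfAttention(dim)
        self.temporal = LocalTemporalCrossAttention(dim)

    def forward(self, x):                                  # x: (B, T, N, C)
        B, T, N, C = x.shape
        s = self.spatial(x.reshape(B * T, N, C)).reshape(B, T, N, C)
        return self.temporal(s)


# Shape check on random features: batch 1, 4 frames, 8x8 = 64 tokens, dim 64.
feats = torch.randn(1, 4, 64, 64)
print(STDBlockSketch(64)(feats).shape)                     # torch.Size([1, 4, 64, 64])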
Pages: 8562-8575
Number of pages: 14
Related Papers
50 records in total
  • [31] Video object segmentation using spatio-temporal deep network
    Ramaswamy, Akshaya
    Gubbi, Jayavardhana
    Balamuralidhar, P.
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [32] SPATIO-TEMPORAL MOTION AGGREGATION NETWORK FOR VIDEO ACTION DETECTION
    Zhang, Hongcheng
    Zhao, Xu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2180 - 2184
  • [33] A novel spatio-temporal memory network for video anomaly detection
    Li H.
    Chen M.
    Multimedia Tools and Applications, 2025, 84 (8) : 4603 - 4624
  • [34] Unsupervised Video Prediction Network with Spatio-temporal Deep Features
    Jin, Beibei
    Zhou, Rong
    Zhang, Zhisheng
    Dai, Min
    PROCEEDINGS OF THE 2018 25TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND MACHINE VISION IN PRACTICE (M2VIP), 2018, : 19 - 24
  • [35] Spatio-Temporal Fusion Network for Video Super-Resolution
    Li, Huabin
    Zhang, Pingjian
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [36] Spatio-temporal Prompting Network for Robust Video Feature Extraction
    Sun, Guanxiong
    Wang, Chi
    Zhang, Zhaoyu
    Deng, Jiankang
    Zafeiriou, Stefanos
    Hua, Yang
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 13541 - 13551
  • [37] MULTISCALE SPATIO-TEMPORAL NETWORK FOR AERIAL VIDEO EVENT RECOGNITION
    Yang, Feng
    Zhang, Jian
    Zhao, Yue
    Qin, Anyong
    Gao, Chenqiang
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 7835 - 7838
  • [38] Dynamic Spatio-Temporal Modular Network for Video Question Answering
    Qian, Zi
    Wang, Xin
    Duan, Xuguang
    Chen, Hong
    Zhu, Wenwu
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4466 - 4477
  • [39] Sparse Representation With Spatio-Temporal Online Dictionary Learning for Promising Video Coding
    Dai, Wenrui
    Shen, Yangmei
    Tang, Xin
    Zou, Junni
    Xiong, Hongkai
    Chen, Chang Wen
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (10) : 4580 - 4595
  • [40] MODELING SPARSE SPATIO-TEMPORAL REPRESENTATIONS FOR NO-REFERENCE VIDEO QUALITY ASSESSMENT
    Shabeer, Muhammed P.
    Bhati, Saurabhchand
    Channappayya, Sumohana S.
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 1220 - 1224