STFE-VC: Spatio-temporal feature enhancement for learned video compression

Authors
Wang, Yiming [1]
Huang, Qian [1,3]
Tang, Bin [1]
Li, Xin [1]
Li, Xing [2]
Affiliations
[1] Hohai Univ, Coll Comp Sci & Software Engn, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Forestry Univ, Coll Informat Sci & Technol, Nanjing, Peoples R China
[3] Changzhou Univ, Jiangsu Engn Res Ctr Digital Twinning Technol, Key Equipment Petrochem Proc, Changzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Spatio-temporal feature enhancement; Learned video compression; Spatio-temporal motion enhancement; In-loop filtering enhancement
DOI
10.1016/j.eswa.2025.126682
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid growth of video data, limited bandwidth and hardware resource constraints demand more efficient video compression. Current learned video compression methods have shown promising performance. However, these methods mainly rely on optical flow networks to perform temporal prediction, which may suffer from inaccurate motion estimation and introduce extra artifacts into reconstructed frames. In this paper, we propose a spatio-temporal feature enhancement method for learned video compression to better model inter-frame motion patterns and reduce compression artifacts. Specifically, we introduce a spatio-temporal motion enhancement module that further extracts the feature representation of the original motion vector to enhance its spatial and temporal components. Then, we introduce an in-loop filtering enhancement module that employs cascaded residual blocks to progressively enhance feature textures and provide higher-quality temporal-domain reference signals for subsequent reconstruction. More importantly, our proposed method can be integrated into the widely used residual coding and contextual coding schemes. Comprehensive experiments demonstrate that our integrated methods are superior to previous learned methods on the JCTVC, UVG and MCL-JCV benchmark datasets. In addition, our integrated methods also outperform the latest generalized video coding standard (H.266/VVC) by a large margin in terms of the MS-SSIM metric.
Pages: 13