Multi-Stage Spatial and Frequency Feature Fusion using Transformer in CNN-Based In-Loop Filter for VVC

Cited by: 4
Authors
Kathariya, Birendra [1]
Li, Zhu [1]
Wang, Hongtao [2]
Coban, Mohammad [2]
Affiliations
[1] Univ Missouri, Kansas City, MO 64110 USA
[2] Qualcomm Technol Inc, San Diego, CA USA
Keywords
Versatile Video Coding (VVC); In-Loop Filter; Discrete Cosine Transform (DCT); Convolutional Neural Network; Transformer
DOI
10.1109/PCS56426.2022.10017998
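Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract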
Versatile Video Coding (VVC)/H.266 is the successor to High Efficiency Video Coding (HEVC)/H.265 and Advanced Video Coding (AVC)/H.264, with significant technical and coding improvements. Nonetheless, it follows the same conventional block-based hybrid video coding scheme as its predecessors. As a result, the reconstructed picture contains compression artifacts. VVC includes in-loop filters by default to correct these artifacts, but the handcrafted filters offer suboptimal performance. In this work, we design a novel convolutional neural network (CNN) to replace the built-in in-loop filter of VVC. The proposed CNN-based in-loop filter uses a modified Spectral-wise Multi-Head Self-Attention (S-MSA) layer from the Multi-stage Spectral-wise Transformer (MST++) at multiple stages to fuse spatial and frequency-decomposed features, extracted from the pixel-domain input and from its discrete cosine transform (DCT), respectively. We name the proposed network MSTFNet, where the first three letters stand for MST++ and F stands for fusion. Because of the multi-stage feature fusion operation, the proposed CNN acts as a powerful learned in-loop filter that significantly outperforms previous methods. Our experimental results show that the proposed method achieves up to 10.31% average Bjontegaard Delta (BD) bitrate savings under the all-intra (AI) configuration for the luma (Y) component.
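The fusion idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' MSTFNet: the block size, the function names (`dct_matrix`, `block_dct_channels`, `space_to_depth`, `channel_attention_fuse`), and the use of a space-to-depth rearrangement as a stand-in for learned spatial features are all assumptions made for illustration, and the learned projections, multiple heads, and multiple stages of the real network are omitted. It shows only the two ingredients the abstract names: a frequency decomposition obtained by a block DCT, and attention computed across channels (a C x C map) rather than across spatial positions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)    # scale the DC row so the matrix is orthonormal
    return m

def block_dct_channels(img, b=4):
    """Apply a b x b DCT to each block of img (H, W).

    Returns (b*b, H//b, W//b): one channel per frequency coefficient.
    """
    h, w = img.shape
    d = dct_matrix(b)
    blocks = img.reshape(h // b, b, w // b, b).transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T  # separable 2-D DCT of every block
    return coeffs.reshape(h // b, w // b, b * b).transpose(2, 0, 1)

def space_to_depth(img, b=4):
    """Hypothetical stand-in for spatial features: pixels of each block as channels."""
    h, w = img.shape
    blocks = img.reshape(h // b, b, w // b, b).transpose(1, 3, 0, 2)
    return blocks.reshape(b * b, h // b, w // b)

def channel_attention_fuse(spatial, freq):
    """Fuse two (C, N) feature matrices with channel-wise attention.

    Queries come from the spatial branch, keys and values from the
    frequency branch; the attention map is C x C (assumed simplification,
    with no learned weights).
    """
    q = spatial / (np.linalg.norm(spatial, axis=1, keepdims=True) + 1e-8)
    k = freq / (np.linalg.norm(freq, axis=1, keepdims=True) + 1e-8)
    logits = q @ k.T                                # (C, C) channel similarity
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)   # row-wise softmax
    return spatial + attn @ freq                    # residual fusion

# Usage on a random 8 x 8 "reconstructed" luma patch.
rng = np.random.default_rng(0)
patch = rng.random((8, 8))
s = space_to_depth(patch).reshape(16, -1)
f = block_dct_channels(patch).reshape(16, -1)
fused = channel_attention_fuse(s, f)
```

Because the attention map is C x C, its cost grows with the number of channels rather than with the number of pixels, which is the usual motivation for channel-wise attention on high-resolution inputs.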
Pages: 373-377
Page count: 5
Related papers
50 in total
  • [1] LIGHTWEIGHT CNN-BASED IN-LOOP FILTER FOR VVC INTRA CODING
    Zhang, Hao
    Jung, Cheolkon
    Liu, Yang
    Li, Ming
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1635 - 1639
  • [2] Multi-stage Locally and Long-range Correlated Feature Fusion for Learned In-loop Filter in VVC
    Kathariya, Birendra
    Li, Zhu
    Wang, Hongtao
    Van der Auwera, Geert
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022,
  • [3] Low Complexity In-Loop Filter for VVC Based on Convolution and Transformer
    Feng, Zhen
    Jung, Cheolkon
    Zhang, Hao
    Liu, Yang
    Li, Ming
    IEEE ACCESS, 2024, 12 : 120316 - 120325
  • [4] A CNN-Based In-Loop Filter with CU Classification for HEVC
    Dai, Yuanying
    Liu, Dong
    Zha, Zheng-Jun
    Wu, Feng
    2018 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP), 2018,
  • [5] Swin Transformer-based In-Loop Filter for VVC Intra Coding
    Tong, Ouyang
    Chen, Xin
    Wang, Huairui
    Zhu, Han
    Chen, Zhenzhong
    2024 PICTURE CODING SYMPOSIUM, PCS 2024, 2024,
  • [6] RTNN: A Neural Network-Based In-Loop Filter in VVC Using Resblock and Transformer
    Zhang, Hao
    Liu, Yunfeng
    Jung, Cheolkon
    Liu, Yang
    Li, Ming
    IEEE ACCESS, 2024, 12 : 104599 - 104610
  • [7] A nonlocal HEVC in-loop filter using CNN-based compression noise estimation
    Weiheng Sun
    Xiaohai He
    Honggang Chen
    Shuhua Xiong
    Yifei Xu
    Applied Intelligence, 2022, 52 : 17810 - 17828
  • [8] A nonlocal HEVC in-loop filter using CNN-based compression noise estimation
    Sun, Weiheng
    He, Xiaohai
    Chen, Honggang
    Xiong, Shuhua
    Xu, Yifei
    APPLIED INTELLIGENCE, 2022, 52 (15) : 17810 - 17828
  • [9] Joint Pixel and Frequency Feature Learning and Fusion via Channel-Wise Transformer for High-Efficiency Learned In-Loop Filter in VVC
    Kathariya, Birendra
    Li, Zhu
    Van der Auwera, Geert
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 4070 - 4083
  • [10] Quality-aware CNN-based in-loop filter for Video Coding
    Chen, Wei
    Xiu, Xiaoyu
    Wang, Xianglin
    Chen, Yi-Wen
    Jhu, Hong-Jheng
    Kuo, Che-Wei
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XLIV, 2021, 11842