Low Complexity In-Loop Filter for VVC Based on Convolution and Transformer

Cited: 0
Authors
Feng, Zhen [1 ]
Jung, Cheolkon [1 ]
Zhang, Hao [1 ]
Liu, Yang [2 ]
Li, Ming [2 ]
Affiliations
[1] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
[2] Guangdong OPPO Mobile Telecommun Corp Ltd, Dongguan 523860, Peoples R China
Source
IEEE ACCESS, 2024, Vol. 12
Funding
National Natural Science Foundation of China
Keywords
Transformers; Convolutional neural networks; Artificial neural networks; Feature extraction; Training; Image coding; Video coding; Versatile video coding; compression artifacts; in-loop filter; convolutional neural network; transformer; VIDEO; CNN;
DOI
10.1109/ACCESS.2024.3438988
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
The Joint Video Experts Team (JVET) has explored neural network-based video coding (NNVC) and is working to introduce NNVC into Versatile Video Coding (VVC). Within NNVC, the NN-based in-loop filter is the most active area and is close to software deployment. Recent NN-based in-loop filters have begun adopting Transformers to capture contextual information, but this causes a considerable increase in complexity, to about 1000 kMAC/pixel. In this paper, we propose a low-complexity NN-based in-loop filter for VVC based on convolution and Transformer, named ConvTransNet. ConvTransNet adopts a pyramid structure in feature extraction to capture both global contextual information and local details at multiple scales. Moreover, ConvTransNet combines a convolutional neural network (CNN) and a Transformer in the in-loop filter: the CNN captures local features and reduces compression artifacts in an image, while the Transformer captures long-range spatial dependencies and enhances global structures. Thus, ConvTransNet enables the NN-based in-loop filter to reduce compression artifacts and enhance visual quality. In ConvTransNet, we use grouped convolutions in the CNN and depthwise convolutions in the Transformer to reduce network complexity. Therefore, ConvTransNet successfully captures both local spatial structure and global contextual information in an image and achieves outstanding performance in terms of BD-rate and complexity. Experimental results show that the proposed NN-based in-loop filter based on ConvTransNet achieves average {6.58%, 23.02%, 23.04%} and {8.18%, 22.67%, 22.00%} BD-rate reductions for the {Y, U, V} channels over the VTM_11.0-NNVC_2.0 anchor under the AI and RA configurations, respectively.
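The complexity savings from grouped and depthwise convolutions mentioned in the abstract can be illustrated with a short parameter-count sketch. This is a generic illustration, not code from the paper; the channel counts and group sizes below are hypothetical:

```python
def conv2d_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a k x k 2D convolution with `groups` groups (bias ignored).

    Per-pixel MACs equal this weight count, so fewer weights also means
    fewer kMAC/pixel, the complexity metric used for NNVC filters.
    """
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k

# Standard 3x3 convolution, 64 -> 64 channels (hypothetical sizes)
standard = conv2d_params(64, 64, 3)              # 36864 weights
# Grouped convolution with 4 groups (as in the CNN branch): 4x fewer weights
grouped = conv2d_params(64, 64, 3, groups=4)     # 9216 weights
# Depthwise convolution (groups == channels, as in the Transformer branch)
depthwise = conv2d_params(64, 64, 3, groups=64)  # 576 weights
```

The same arithmetic explains why replacing dense convolutions with grouped and depthwise variants can keep a Transformer-augmented filter far below the ~1000 kMAC/pixel of earlier designs.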
Pages: 120316-120325 (10 pages)
Related Papers (50 in total)
  • [1] Swin Transformer-based In-Loop Filter for VVC Intra Coding
    Tong, Ouyang
    Chen, Xin
    Wang, Huairui
    Zhu, Han
    Chen, Zhenzhong
    2024 PICTURE CODING SYMPOSIUM, PCS 2024, 2024,
  • [2] RTNN: A Neural Network-Based In-Loop Filter in VVC Using Resblock and Transformer
    Zhang, Hao
    Liu, Yunfeng
    Jung, Cheolkon
    Liu, Yang
    Li, Ming
    IEEE ACCESS, 2024, 12: 104599-104610
  • [3] One-for-All: An Efficient Variable Convolution Neural Network for In-Loop Filter of VVC
    Huang, Zhijie
    Sun, Jun
    Guo, Xiaopeng
    Shang, Mingyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04): 2342-2355
  • [4] Adaptive Deep Reinforcement Learning-Based In-Loop Filter for VVC
    Huang, Zhijie
    Sun, Jun
    Guo, Xiaopeng
    Shang, Mingyu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30: 5439-5451
  • [5] CONVOLUTIONAL NEURAL NETWORK BASED IN-LOOP FILTER FOR VVC INTRA CODING
    Li, Yue
    Zhang, Li
    Zhang, Kai
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021: 2104-2108
  • [6] LIGHTWEIGHT CNN-BASED IN-LOOP FILTER FOR VVC INTRA CODING
    Zhang, Hao
    Jung, Cheolkon
    Liu, Yang
    Li, Ming
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2023: 1635-1639
  • [7] VVC In-Loop Filters
    Karczewicz, Marta
    Hu, Nan
    Taquet, Jonathan
    Chen, Ching-Yeh
    Misra, Kiran
    Andersson, Kenneth
    Yin, Peng
    Lu, Taoran
    Francois, Edouard
    Chen, Jie
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (10): 3907-3925
  • [8] Multi-Stage Spatial and Frequency Feature Fusion using Transformer in CNN-Based In-Loop Filter for VVC
    Kathariya, Birendra
    Li, Zhu
    Wang, Hongtao
    Coban, Mohammad
    2022 PICTURE CODING SYMPOSIUM (PCS), 2022: 373-377
  • [9] DQT-CALF: Content adaptive neural network based In-Loop filter in VVC using dual query transformer
    Liu, Yunfeng
    Jung, Cheolkon
    NEUROCOMPUTING, 2025, 637
  • [10] MULTI-GRADIENT CONVOLUTIONAL NEURAL NETWORK BASED IN-LOOP FILTER FOR VVC
    Huang, Zhijie
    Li, Yunchang
    Sun, Jun
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,