Low Complexity In-Loop Filter for VVC Based on Convolution and Transformer

Authors
Feng, Zhen [1]
Jung, Cheolkon [1]
Zhang, Hao [1]
Liu, Yang [2]
Li, Ming [2]
Affiliations
[1] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
[2] Guangdong OPPO Mobile Telecommun Corp Ltd, Dongguan 523860, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Convolutional neural networks; Artificial neural networks; Feature extraction; Training; Image coding; Video coding; Versatile video coding; compression artifacts; in-loop filter; convolutional neural network; transformer; VIDEO; CNN;
DOI
10.1109/ACCESS.2024.3438988
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The Joint Video Experts Team (JVET) has been exploring neural network-based video coding (NNVC) and is working to introduce it into Versatile Video Coding (VVC). Within NNVC, the NN-based in-loop filter is the most active area and is very close to software deployment. Recent NN-based in-loop filters have started adopting Transformers to capture context information, but this raises complexity considerably, to about 1000 kMAC/pixel. In this paper, we propose ConvTransNet, a low-complexity NN-based in-loop filter for VVC built on convolution and Transformer. ConvTransNet adopts a pyramid structure in feature extraction to capture both global contextual information and local details at multiple scales. It also combines a convolutional neural network (CNN) with a Transformer: the CNN captures local features and reduces compression artifacts in an image, while the Transformer captures long-range spatial dependencies and enhances global structures, so the filter both reduces compression artifacts and enhances visual quality. To reduce network complexity, ConvTransNet uses grouped convolutions in the CNN and depthwise convolutions in the Transformer. As a result, ConvTransNet captures both local spatial structure and global contextual information in an image and performs well in terms of both BD-rate and complexity. Experimental results show that the proposed NN-based in-loop filter based on ConvTransNet achieves average BD-rate reductions of {6.58%, 23.02%, 23.04%} and {8.18%, 22.67%, 22.00%} for the {Y, U, V} channels over the VTM_11.0-NNVC_2.0 anchor under the all-intra (AI) and random-access (RA) configurations, respectively.
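The abstract attributes the complexity reduction to grouped and depthwise convolutions. As a rough illustration of why these help, the sketch below counts multiply-accumulate operations (MACs) per output pixel for a single stride-1 convolution layer. The channel width of 64, the 3x3 kernel, and the group count of 4 are hypothetical values chosen for the example and are not taken from the paper; the count is per output pixel of that layer, not the per-picture-pixel kMAC/pixel figure reported for a whole network.

```python
def conv_macs_per_pixel(c_in, c_out, kernel, groups=1):
    """MACs per output pixel for a stride-1 2D convolution (bias ignored).

    Each output channel only sees c_in / groups input channels,
    so the cost scales as (c_in / groups) * c_out * kernel^2.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel


channels = 64  # hypothetical layer width, not from the paper

standard = conv_macs_per_pixel(channels, channels, 3)                    # groups=1
grouped = conv_macs_per_pixel(channels, channels, 3, groups=4)           # assumed 4 groups
depthwise = conv_macs_per_pixel(channels, channels, 3, groups=channels)  # one group per channel

print(f"standard : {standard} MACs/pixel")
print(f"grouped  : {grouped} MACs/pixel")
print(f"depthwise: {depthwise} MACs/pixel")
```

With G groups the per-layer cost drops by a factor of G, and a depthwise convolution is the limiting case where G equals the channel count. Depthwise layers do not mix channels, so in practice they are usually paired with a 1x1 pointwise convolution; whether ConvTransNet does so is not stated in the abstract.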
Pages: 120316 - 120325
Page count: 10