Multi-feature fusion for efficient inter prediction in versatile video coding

Cited by: 1
Authors
Wei, Xiaojie [1]
Zeng, Hongji [1]
Fang, Ying [1]
Lin, Liqun [1]
Chen, Weiling [1]
Xu, Yiwen [1]
Affiliation
[1] Fuzhou Univ, Fuzhou Coll Town, Fujian Key Lab Intelligent Proc & Wireless Transmi, 2 North Wulong River Ave, Fuzhou, Fujian, Peoples R China
Keywords
Versatile video coding; Complexity optimization; Block partition; CNN; Multi-feature fusion; CU PARTITION; OPTIMIZATION; DECISION;
DOI
10.1007/s11554-024-01564-z
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Versatile Video Coding (VVC) introduces advanced coding tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this improvement comes at the cost of higher coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning to reduce the complexity of QTMT partitioning for VVC inter prediction. First, the framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Next, a novel rate-distortion-complexity loss function is designed to balance computational complexity and compression performance. Adjusting the distribution of rate-distortion-complexity costs in this loss function changes the prediction bias of the network and sets constraints on different block partition sizes, which makes the complexity adjustable. Compared to the anchor VTM-13.0, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to a range of 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while preserving coding performance, surpassing both traditional and deep learning-based methods.
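The abstract does not give the form of the rate-distortion-complexity loss, so the following is only an illustrative sketch of how a cost-weighted loss can bias a classifier that predicts the minimum CU partition size. It is not the authors' loss function. The class ordering (a higher index means a smaller minimum size, hence a deeper partition search), the linear penalties, and the names `build_cost_matrix`, `rd_weight`, and `complexity_weight` are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_cost_matrix(num_classes, rd_weight, complexity_weight):
    """Hypothetical cost table: cost[true_class][predicted_class].

    Assumption: class index grows as the minimum CU size shrinks.
    Predicting too shallow (pred < true) skips needed partitions, so it is
    charged a rate-distortion penalty. Predicting too deep (pred > true)
    wastes search time, so it is charged a complexity penalty.
    """
    cost = []
    for true_class in range(num_classes):
        row = []
        for pred_class in range(num_classes):
            gap = pred_class - true_class
            if gap < 0:
                row.append(rd_weight * (-gap))
            else:
                row.append(complexity_weight * gap)
        cost.append(row)
    return cost


def expected_cost_loss(logits, true_class, cost_matrix):
    """Expected cost of the predicted distribution for one sample."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, cost_matrix[true_class]))


# Usage: three size classes, RD errors weighted twice as heavily as wasted time.
cost_matrix = build_cost_matrix(3, rd_weight=1.0, complexity_weight=0.5)
uniform_loss = expected_cost_loss([0.0, 0.0, 0.0], 0, cost_matrix)
confident_loss = expected_cost_loss([10.0, 0.0, 0.0], 0, cost_matrix)
```

Under these assumptions, raising `complexity_weight` relative to `rd_weight` pushes the predictions toward larger minimum sizes (less search, more BDBR loss), and lowering it does the opposite, which is one simple way a single loss can expose an adjustable complexity trade-off.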
Pages: 14