Multi-feature fusion for efficient inter prediction in versatile video coding

Cited by: 1

Authors:

Wei, Xiaojie [1]

Zeng, Hongji [1]

Fang, Ying [1]

Lin, Liqun [1]

Chen, Weiling [1]

Xu, Yiwen [1]

Affiliations:

[1] Fuzhou Univ, Fuzhou Coll Town, Fujian Key Lab Intelligent Proc & Wireless Transmi, 2 North Wulong River Ave, Fuzhou, Fujian, Peoples R China

Source:

JOURNAL OF REAL-TIME IMAGE PROCESSING | 2024, Vol. 21, Issue 06

Keywords:

Versatile video coding; Complexity optimization; Block partition; CNN; Multi-feature fusion; CU PARTITION; OPTIMIZATION; DECISION;

DOI:

10.1007/s11554-024-01564-z

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Versatile Video Coding (VVC) introduces various advanced coding techniques and tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this improvement comes at the cost of increased coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning techniques to reduce the complexity of QTMT partitioning for VVC inter prediction. First, the proposed framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Next, a novel rate-distortion-complexity loss function is designed to balance computational complexity against compression performance. Through this loss function, the distribution of rate-distortion-complexity costs can be adjusted. This adjustment shifts the prediction bias of the network and sets constraints on different block partition sizes, which facilitates complexity adjustment. Compared to the anchor VTM-13.0, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to a range of 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while ensuring coding performance, surpassing both traditional methods and deep learning-based methods.
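
The abstract does not give the form of the rate-distortion-complexity loss, so the following is only a minimal sketch of one way such a trade-off could be expressed. It assumes the network outputs one logit per candidate minimum CU size, and that each candidate carries a rate-distortion penalty and a complexity figure. The function names, the weight `lam`, and the additive cost `rd + lam * complexity` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rdc_expected_cost(logits, rd_cost, complexity, lam):
    """Expected cost under the predicted class distribution.

    Assumed (hypothetical) per-class cost: rd_cost[k] + lam * complexity[k].
    A larger lam penalizes complexity more heavily.
    """
    probs = softmax(logits)
    return sum(p * (rd + lam * c)
               for p, rd, c in zip(probs, rd_cost, complexity))


def cheapest_class(rd_cost, complexity, lam):
    """Index of the class with the lowest assumed per-class cost."""
    costs = [rd + lam * c for rd, c in zip(rd_cost, complexity)]
    return min(range(len(costs)), key=costs.__getitem__)


# Illustrative numbers only: class 0 allows the finest partitioning
# (best RD, slowest), class 2 the coarsest (worst RD, fastest).
rd_cost = [0.0, 0.2, 0.9]
complexity = [1.0, 0.5, 0.2]

for lam in (0.0, 1.0, 5.0):
    print(lam, cheapest_class(rd_cost, complexity, lam))
```

In this sketch, raising `lam` moves the lowest-cost class toward coarser partitioning, which illustrates how reweighting the costs could bias a network trained on them toward lower encoding complexity.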

Pages: 14
