Efficient video quality assessment with deeper spatiotemporal feature extraction and integration

Cited by: 5
Authors
Liu, Yinhao [1 ]
Zhou, Xiaofei [2 ]
Yin, Haibing [1 ]
Wang, Hongkui [1 ]
Yan, Chenggang [2 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou, Peoples R China
Keywords
video quality assessment; no-reference/blind; user-generated content; deep learning; deeper temporal correlation; reverse hierarchy theory;
DOI
10.1117/1.JEI.30.6.063034
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract
The challenge of video quality assessment (VQA) modeling for user-generated content (UGC), i.e., UGC-VQA, is to accurately extract discriminative features and to precisely quantify inter-feature interactions by following the behavioral patterns of human eye-brain visual perception. To address this issue, we propose the Deeper Spatial-Temporal Scoring Network (DSTS-Net) for precise VQA. Concretely, we first deploy a multiscale feature extraction module to characterize content-aware features, accounting for the nonlinear reverse hierarchy theory of the video perception process, which is not fully considered in reported UGC-VQA models. Hierarchical handcrafted and semantic features are considered simultaneously using content-adaptive weighting. Second, we develop a feature integration structure, the deeper gated recurrent unit (DGRU), to imitate the inter-feature interactions of visual perception, including its feedforward and feedback processes. Third, a dual DGRU structure further accounts for inter-frame interactions of the hierarchical features, imitating the nonlinearity of perception as closely as possible. Finally, improved pooling is achieved with a local adaptive smoothing module that accounts for temporal hysteresis. Holistic validation on four challenging public UGC-VQA datasets shows performance comparable to state-of-the-art no-reference VQA methods; in particular, our method accurately predicts the quality of low-quality videos with weak temporal correlation. To promote reproducible research and public evaluation, an implementation of our method is available online: https://github.com/liu0527aa/DSTS-Net . (C) 2021 SPIE and IS&T
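The "temporal hysteresis" the abstract refers to is the well-documented observation that viewers react sharply to quality drops but recover slowly afterwards, so plain averaging of per-frame scores overestimates quality. A minimal NumPy sketch of the general idea (the function name, parameters, and exact weighting below are illustrative assumptions, not the paper's actual local adaptive smoothing module):

```python
import numpy as np

def hysteresis_pool(scores, tau=5, alpha=0.5, beta=4.0):
    """Hysteresis-aware pooling of per-frame quality scores.

    Each frame's pooled score mixes a "memory" term (the worst score in
    the recent past, modeling slow recovery from quality drops) with a
    "current" term (a softmax-weighted average of upcoming scores that
    emphasizes low-quality frames). The video score is the mean over time.
    """
    scores = np.asarray(scores, dtype=float)
    T = len(scores)
    pooled = np.empty(T)
    for t in range(T):
        # Memory term: minimum over the previous tau frames (inclusive).
        memory = scores[max(0, t - tau):t + 1].min()
        # Current term: softmax over the next tau frames; the negative
        # temperature beta puts more weight on low scores.
        future = scores[t:min(T, t + tau + 1)]
        w = np.exp(-beta * future)
        w /= w.sum()
        current = float(w @ future)
        pooled[t] = alpha * memory + (1 - alpha) * current
    return pooled.mean()
```

On a sequence with a brief quality dip, this pooled score falls below the plain frame average, reflecting the lingering perceptual impact of the dip.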
Pages: 21