Efficient video quality assessment with deeper spatiotemporal feature extraction and integration

Cited by: 5
Authors
Liu, Yinhao [1 ]
Zhou, Xiaofei [2 ]
Yin, Haibing [1 ]
Wang, Hongkui [1 ]
Yan, Chenggang [2 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou, Peoples R China
Keywords
video quality assessment; no-reference/blind; user-generated content; deep learning; deeper temporal correlation; reverse hierarchy theory;
DOI
10.1117/1.JEI.30.6.063034
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract
The challenge of video quality assessment (VQA) modeling for user-generated content (UGC), i.e., UGC-VQA, is to accurately extract discriminative features and to precisely quantify inter-feature interactions by following the behavioral patterns of human eye-brain visual perception. To address this issue, we propose the Deeper Spatial-Temporal Scoring Network (DSTS-Net) for precise VQA. Concretely, we first deploy a multiscale feature extraction module to characterize content-aware features, accounting for the nonlinear reverse hierarchy theory of the video perception process, which is not fully considered in reported UGC-VQA models. Hierarchical handcrafted and semantic features are considered simultaneously using content-adaptive weighting. Second, we develop a feature integration structure, the deeper gated recurrent unit (DGRU), to imitate the inter-feature interactions of visual perception, including its feedforward and feedback processes. Third, a dual DGRU structure further accounts for inter-frame interactions of the hierarchical features, imitating the nonlinearity of perception as closely as possible. Finally, improved pooling is achieved with a local adaptive smoothing module that accounts for temporal hysteresis. Holistic validation on four challenging public UGC-VQA datasets shows performance comparable to state-of-the-art no-reference VQA methods; in particular, our method accurately predicts the quality of low-quality videos with weak temporal correlation. To promote reproducible research and public evaluation, an implementation of our method is available online: https://github.com/liu0527aa/DSTS-Net . (C) 2021 SPIE and IS&T
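The "temporal hysteresis" the abstract refers to is the well-documented observation that viewers react sharply to quality drops but recover slowly afterwards, so plain averaging of per-frame scores overestimates quality. A minimal NumPy sketch of the general idea (the function name, parameters, and exact weighting below are illustrative assumptions, not the paper's actual local adaptive smoothing module):

```python
import numpy as np

def hysteresis_pool(scores, tau=5, alpha=0.5, beta=4.0):
    """Hysteresis-aware pooling of per-frame quality scores.

    Each frame's pooled score mixes a "memory" term (the worst score in
    the recent past, modeling slow recovery from quality drops) with a
    "current" term (a softmax-weighted average of upcoming scores that
    emphasizes low-quality frames). The video score is the mean over time.
    """
    scores = np.asarray(scores, dtype=float)
    T = len(scores)
    pooled = np.empty(T)
    for t in range(T):
        # Memory term: minimum over the previous tau frames (inclusive).
        memory = scores[max(0, t - tau):t + 1].min()
        # Current term: softmax over the next tau frames; the negative
        # temperature beta puts more weight on low scores.
        future = scores[t:min(T, t + tau + 1)]
        w = np.exp(-beta * future)
        w /= w.sum()
        current = float(w @ future)
        pooled[t] = alpha * memory + (1 - alpha) * current
    return pooled.mean()
```

On a sequence with a brief quality dip, this pooled score falls below the plain frame average, reflecting the lingering perceptual impact of the dip.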
Pages: 21