MT-VQA: A Multi-task Approach for Quality Assessment of Short-form Videos

Citations: 0
Authors
Wen, Shijie [1]
Qiao, Minglang [1]
Jiang, Lai [1]
Xu, Mai [1]
Deng, Xin [1]
Li, Shengxi [1]
Affiliations
[1] Beihang University, Beijing, People's Republic of China
Funding
Beijing Natural Science Foundation
Keywords
Short-form video; video quality assessment; human attention; saliency; prediction
DOI
10.1145/3689093.3689181
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Short-form video, a mainstream media form on video platforms, has undergone explosive growth in recent years. A vast number of short-form videos are produced, processed, and distributed to users each day, inevitably leading to quality degradation. Accurate video quality assessment (VQA) is therefore critical for monitoring and optimizing the viewing experience of users. However, existing short-form VQA approaches neglect human attention patterns during video viewing, and the advancement of short-form VQA is obstructed by the absence of large-scale datasets. To tackle these challenges, we first construct a large-scale short-form VQA dataset called SVQA. The SVQA dataset comprises diverse distortion types, covering the typical quality degradations that arise during the photography, encoding, and editing of short-form videos. For each short-form video in SVQA, we collect both a quality score and eye-tracking annotations. Based on our dataset, we propose a two-branch multi-task VQA approach, MT-VQA, which performs both VQA and video saliency prediction (VSP) for short-form videos. We further propose a saliency fusion module that guides the VQA branch to focus on quality distortions within visually attractive regions. Extensive experiments show that our multi-task approach achieves superior performance in both VQA and VSP tasks.
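The abstract does not specify how the saliency fusion module or the multi-task objective is built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes per-patch quality scores and a saliency map on the same grid, pools quality with saliency as weights so that distortions in attended regions dominate, and combines the two task losses with a fixed weight. The names `saliency_weighted_pool`, `multi_task_loss`, and `vsp_weight` are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def saliency_weighted_pool(quality: Grid, saliency: Grid, eps: float = 1e-8) -> float:
    """Average per-patch quality scores, weighting each patch by its saliency.

    Illustrative assumption only: the real fusion module is not described
    in the abstract.
    """
    same_rows = len(quality) == len(saliency)
    if not same_rows or any(len(q) != len(s) for q, s in zip(quality, saliency)):
        raise ValueError("quality and saliency grids must have the same shape")

    total_saliency = sum(s for row in saliency for s in row)
    if total_saliency < eps:
        # No salient region at all: fall back to a plain average.
        values = [q for row in quality for q in row]
        return sum(values) / len(values)

    weighted = sum(
        q * s
        for q_row, s_row in zip(quality, saliency)
        for q, s in zip(q_row, s_row)
    )
    return weighted / total_saliency


def multi_task_loss(vqa_loss: float, vsp_loss: float, vsp_weight: float = 0.5) -> float:
    """Combine the VQA and VSP losses with a fixed, hypothetical weight."""
    return vqa_loss + vsp_weight * vsp_loss


if __name__ == "__main__":
    quality = [[1.0, 5.0], [5.0, 5.0]]   # one badly distorted patch
    saliency = [[1.0, 0.0], [0.0, 0.0]]  # viewers look only at that patch
    print(saliency_weighted_pool(quality, saliency))
    print(multi_task_loss(1.0, 2.0))
```

In this toy case the pooled score is driven entirely by the distorted patch that viewers attend to, whereas a uniform saliency map would reduce the pooling to a plain average.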
Pages: 30-38
Number of pages: 9
Related papers
50 in total
  • [21] A MULTI-TASK ARCHITECTURE FOR REMOTE SENSING BY JOINT SCENE CLASSIFICATION AND IMAGE QUALITY ASSESSMENT
    Zhang, Cong
    Wang, Qi
    Li, Xuelong
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 10051 - 10054
  • [22] InQSS: a speech intelligibility and quality assessment model using a multi-task learning network
    Chen, Yu-Wen
    Tsao, Yu
    INTERSPEECH 2022, 2022, : 3088 - 3092
  • [23] MTQ-Caps: A Multi-task Capsule Network for Blind Image Quality Assessment
    Wei, Yijie
    Wang, Bincheng
    Liang, Fangfang
    Liu, Bo
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431 : 296 - 308
  • [24] Multi-task Fundus Image Quality Assessment via Transfer Learning and Landmarks Detection
    Shen, Yaxin
    Fang, Ruogu
    Sheng, Bin
    Dai, Ling
    Li, Huating
    Qin, Jing
    Wu, Qiang
    Jia, Weiping
    MACHINE LEARNING IN MEDICAL IMAGING: 9TH INTERNATIONAL WORKSHOP, MLMI 2018, 2018, 11046 : 28 - 36
  • [25] Automatic tongue image quality assessment using a multi-task deep learning model
    Xian, Huimin
    Xie, Yanyan
    Yang, Zizhu
    Zhang, Linzi
    Li, Shangxuan
    Shang, Hongcai
    Zhou, Wu
    Zhang, Honglai
    FRONTIERS IN PHYSIOLOGY, 2022, 13
  • [26] No-Reference Image Quality Assessment Based on Multi-Task Generative Adversarial Network
    Ma, Yao
    Cai, Xibiao
    Sun, Fuming
    Hao, Shijie
    IEEE ACCESS, 2019, 7 : 146893 - 146902
  • [27] Remote Sensing for Water Quality: A Multi-Task, Metadata-Driven Hypernetwork Approach
    Graffeuille, Olivier
    Koh, Yun Sing
    Wicker, Jorg
    Lehmann, Moritz
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 7287 - 7295
  • [28] Viewport-Based CNN: A Multi-Task Approach for Assessing 360° Video Quality
    Xu, Mai
    Jiang, Lai
    Li, Chen
    Wang, Zulin
    Tao, Xiaoming
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (04) : 2198 - 2215
  • [29] An alternative approach to short-form self-report assessment of competitive anxiety: A research note.
    Thomas, O
    Hanton, S
    Jones, G
    INTERNATIONAL JOURNAL OF SPORT PSYCHOLOGY, 2002, 33 (03) : 325 - 336
  • [30] SPFusion: A multi-task semantic perception infrared and visible light fusion method with quality assessment
    Liang, Zhenyang
    Yu, Mingxin
    Sun, Yichen
    Dong, Mingli
    DISPLAYS, 2025, 87