ROLE OF AUDIO IN VIDEO SUMMARIZATION

Authors
Shoer, Ibrahim [1]
Kopru, Berkay [1]
Erzin, Engin [1]
Affiliation
[1] Koc Univ, Coll Engn, Multimedia Vis & Graph Grp, KUIS AI Lab, Istanbul, Turkiye
Keywords
Audio-visual video summarization; canonical correlation analysis
DOI
10.1109/ICASSPW59220.2023.10192578
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Video summarization attracts attention as a means of efficient video representation, retrieval, and browsing that eases the problems posed by surging video volume and traffic. Although video summarization mostly relies on the visual channel for compaction, recent literature has shown the benefits of audio-visual modeling. The information coming from the audio channel can be a result of audio-visual correlation in the video content. In this study, we propose a new audio-visual video summarization framework that integrates four ways of audio-visual information fusion with GRU-based and attention-based networks. Furthermore, we investigate a new explainability methodology using audio-visual canonical correlation analysis (CCA) to better understand and explain the role of audio in the video summarization task. Experimental evaluations on the TVSum dataset show F1 score and Kendall-tau score improvements for audio-visual video summarization. In addition, splitting the video content of the TVSum and COGNIMUSE datasets into positively and negatively correlated videos based on audio-visual CCA yields a strong performance improvement on the positively correlated videos for both audio-only and audio-visual video summarization.
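The abstract reports Kendall-tau scores without stating how they are computed. As a rough illustration only, the sketch below computes the tau-a rank correlation between predicted frame-importance scores and reference scores. The function name `kendall_tau_a` and the sample values are hypothetical, and the paper may use a different variant (for example tau-b with tie correction) or a different averaging protocol across annotators.

```python
from itertools import combinations


def kendall_tau_a(predicted, reference):
    """Kendall tau-a between two equal-length score sequences.

    Assumption: tied pairs count as neither concordant nor discordant,
    and no tie correction is applied (this is tau-a, not tau-b).
    """
    if len(predicted) != len(reference):
        raise ValueError("score sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two scores are required")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product says whether both rankings order
        # the pair (i, j) the same way.
        sign = (predicted[i] - predicted[j]) * (reference[i] - reference[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical importance scores for four frames.
predicted_scores = [0.1, 0.7, 0.4, 0.9]
reference_scores = [0.2, 0.5, 0.6, 0.8]
tau = kendall_tau_a(predicted_scores, reference_scores)
```

A value of 1 means the predicted ranking of frames matches the reference ranking exactly, and a value of -1 means it is fully reversed.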
Pages: 5