ROLE OF AUDIO IN VIDEO SUMMARIZATION

Authors
Shoer, Ibrahim [1 ]
Kopru, Berkay [1 ]
Erzin, Engin [1 ]
Affiliations
[1] Koc Univ, Coll Engn, Multimedia Vis & Graph Grp, KUIS AI Lab, Istanbul, Turkiye
Keywords
Audio-visual video summarization; canonical correlation analysis
DOI
10.1109/ICASSPW59220.2023.10192578
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Video summarization attracts attention for efficient video representation, retrieval, and browsing to ease volume and traffic surge problems. Although video summarization mostly uses the visual channel for compaction, the benefits of audio-visual modeling appeared in recent literature. The information coming from the audio channel can be a result of audio-visual correlation in the video content. In this study, we propose a new audio-visual video summarization framework integrating four ways of audio-visual information fusion with GRU-based and attention-based networks. Furthermore, we investigate a new explainability methodology using audio-visual canonical correlation analysis (CCA) to better understand and explain the role of audio in the video summarization task. Experimental evaluations on the TVSum dataset attain F1 score and Kendall-tau score improvements for the audio-visual video summarization. Furthermore, splitting video content on TVSum and COGNIMUSE datasets based on audio-visual CCA as positively and negatively correlated videos yields a strong performance improvement over the positively correlated videos for audio-only and audio-visual video summarization.
Pages: 5