FRAMEWORK FOR EVALUATION OF SOUND EVENT DETECTION IN WEB VIDEOS

Cited by: 0
Authors
Badlani, Rohan [2]
Shah, Ankit [1]
Elizalde, Benjamin [1]
Kumar, Anurag [1]
Raj, Bhiksha [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] BITS Pilani, Dept Comp Sci, Hyderabad, Telangana, India
Keywords
Sound Event Detection; Convolutional Neural Network; Large-Scale Audio Event Detection; Video Content Analysis
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The largest source of sound events is web videos. Most videos lack sound event labels at the segment level; however, a significant number of them do respond to text queries, through matches that search engines find in their metadata. In this paper, we explore the extent to which a search query can be used as the true label for detecting sound events in videos. We present a framework for large-scale sound event recognition on web videos. The framework crawls videos using search queries corresponding to 78 sound event labels drawn from three datasets. The datasets are used to train three classifiers, and we obtain predictions on 3.7 million web video segments. We evaluate performance using the search query as the true label and compare it with human labeling. Both types of ground truth exhibited close performance, to within 10%, and a similar performance trend as the number of evaluated segments increased. Hence, our experiments show the potential of using the search query as a preliminary true label for sound event recognition in web videos.
Pages: 3096 - 3100
Number of pages: 5
Related papers
50 in total
  • [21] A safety-oriented framework for sound event detection in driving scenarios
    Castorena, Carlos
    Cobos, Maximo
    Lopez-Ballester, Jesus
    Ferri, Francesc J.
    APPLIED ACOUSTICS, 2024, 215
  • [22] Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019
    Politis, Archontis
    Mesaros, Annamaria
    Adavanne, Sharath
    Heittola, Toni
    Virtanen, Tuomas
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 684 - 698
  • [23] Detection of goal event in soccer videos
    Kim, HG
    Roeber, S
    Samour, A
    Sikora, T
    STORAGE AND RETRIEVAL METHODS AND APPLICATIONS FOR MULTIMEDIA 2005, 2005, 5682 : 317 - 325
  • [24] Event detection in surveillance videos: a review
    Abdolamir Karbalaie
    Farhad Abtahi
    Mårten Sjöström
    Multimedia Tools and Applications, 2022, 81 : 35463 - 35501
  • [25] Event Detection of Broadcast Baseball Videos
    Hung, Mao-Hsiung
    Hsieh, Chaur-Heh
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2008, 18 (12) : 1713 - 1726
  • [26] Event detection in surveillance videos: a review
    Karbalaie, Abdolamir
    Abtahi, Farhad
    Sjostrom, Marten
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 35463 - 35501
  • [27] DECK: Discovering Event Composition Knowledge from Web Images for Zero-Shot Event Detection and Recounting in Videos
    Gan, Chuang
    Sun, Chen
    Nevatia, Ram
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4032 - 4038
  • [28] Sangati - A Social Event Web approach to Index Videos
    Anilkumar, Anjana
    Sreenivasan, Anusha
    Sahay, Animesh
    Gurumurthy, Dilip
    Nirupama, M. P.
    Kalambur, Subramaniam
    Sitaram, Dinkar
    Jain, Ramesh
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 1574 - 1580
  • [29] Novel sound event and sound activity detection framework based on intrinsic mode functions and deep learning
    Vahid Hajihashemi
    Abdorreza Alavigharahbagh
    J. J. M. Machado
    João Manuel R. S. Tavares
    Multimedia Tools and Applications, 2025, 84 (14) : 13515 - 13543
  • [30] Sound Event Detection: A tutorial
    Mesaros, Annamaria
    Heittola, Toni
    Virtanen, Tuomas
    Plumbley, Mark D.
    IEEE SIGNAL PROCESSING MAGAZINE, 2021, 38 (05) : 67 - 83