FRAMEWORK FOR EVALUATION OF SOUND EVENT DETECTION IN WEB VIDEOS

Cited by: 0
Authors
Badlani, Rohan [2 ]
Shah, Ankit [1 ]
Elizalde, Benjamin [1 ]
Kumar, Anurag [1 ]
Raj, Bhiksha [1 ]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] BITS Pilani, Dept Comp Sci, Hyderabad, Telangana, India
Keywords
Sound Event Detection; Convolutional Neural Network; Large-Scale Audio Event Detection; Video Content Analysis
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The largest source of sound events is web videos. Most videos lack sound event labels at the segment level; however, a significant number of them do respond to text queries, through matches that search engines find using metadata. In this paper, we explore the extent to which a search query can be used as the true label for detection of sound events in videos. We present a framework for large-scale sound event recognition on web videos. The framework crawls videos using search queries corresponding to 78 sound event labels drawn from three datasets. The datasets are used to train three classifiers, and we obtain predictions on 3.7 million web video segments. We evaluate performance using the search query as the true label and compare it with human labeling. The two types of ground truth yield close performance, to within 10%, and a similar performance trend as the number of evaluated segments increases. Hence, our experiments show the potential of using the search query as a preliminary true label for sound event recognition in web videos.
Pages: 3096-3100
Number of pages: 5