DURATION ROBUST WEAKLY SUPERVISED SOUND EVENT DETECTION

Authors
Dinkel, Heinrich [1]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, SpeechLab, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
Keywords
weakly supervised sound event detection; convolutional neural networks; recurrent neural networks; semi-supervised duration estimation
DOI
10.1109/icassp40776.2020.9053459
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Task 4 of the DCASE2018 challenge demonstrated that substantially more research is needed before sound event detection can be applied in real-world settings. An analysis of the challenge results shows that most successful models are biased towards predicting long (e.g., over 5 s) clips. This work investigates the performance impact of fixed-size window median filter post-processing and advocates double thresholding as a more robust and predictable post-processing method. Further, four temporal subsampling methods within the CRNN framework are proposed: mean-max, α-mean-max, Lp-norm and convolutional. We show that, for this task, subsampling the temporal resolution with a neural network improves the F1 score as well as robustness towards short, sporadic sound events. Our best single model achieves 30.1% F1 on the evaluation set and the best fusion model 32.5%, while being robust to event length variations.
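The abstract names double thresholding as the preferred post-processing step but does not spell it out. The following is a minimal sketch of one common hysteresis-style formulation, not the authors' implementation: the function name `double_threshold` and the threshold values `high=0.75` and `low=0.2` are assumptions chosen for illustration only.

```python
import numpy as np

def double_threshold(probs, high=0.75, low=0.2):
    """Hysteresis-style double thresholding over frame probabilities.

    A contiguous run of frames with probability > `low` is kept as an
    event only if at least one frame in that run also exceeds `high`.
    Threshold defaults are illustrative assumptions, not from the paper.
    Returns a list of (onset_frame, offset_frame_exclusive) tuples.
    """
    probs = np.asarray(probs, dtype=float)
    above_low = probs > low
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], above_low, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    events = []
    # Edges alternate: rising (onset), falling (exclusive offset).
    for onset, offset in zip(edges[0::2], edges[1::2]):
        if probs[onset:offset].max() > high:
            events.append((int(onset), int(offset)))
    return events

frame_probs = [0.1, 0.3, 0.8, 0.4, 0.1, 0.3, 0.3, 0.1, 0.9]
events = double_threshold(frame_probs)
```

In this toy input, the run over frames 1 to 3 is kept because it contains a confident frame, the weak run over frames 5 to 6 is dropped, and the single-frame spike at frame 8 survives. A fixed-size median filter applied to the same sequence could suppress such a short event, which is the kind of duration bias the abstract describes.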
Pages: 311-315
Number of pages: 5