DURATION ROBUST WEAKLY SUPERVISED SOUND EVENT DETECTION

Cited by: 0
Authors
Dinkel, Heinrich [1]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, SpeechLab, Dept Comp Sci & Engn, MoE Key Lab Artificial Intelligence, Shanghai, Peoples R China
Keywords
weakly supervised sound event detection; convolutional neural networks; recurrent neural networks; semi-supervised duration estimation
DOI
10.1109/icassp40776.2020.9053459
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Task 4 of the DCASE2018 challenge demonstrated that substantially more research is needed before sound event detection can be applied in real-world settings. An analysis of the challenge results shows that the most successful models are biased towards predicting long (e.g., over 5 s) clips. This work investigates the performance impact of median-filter post-processing with a fixed window size and advocates double thresholding as a more robust and predictable post-processing method. Further, four temporal subsampling methods within the CRNN framework are proposed: mean-max, α-mean-max, Lp-norm and convolutional. We show that, for this task, subsampling the temporal resolution with a neural network improves the F1 score as well as robustness to short, sporadic sound events. Our best single model achieves 30.1% F1 on the evaluation set and the best fusion model 32.5%, while being robust to event length variations.
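As an illustration of the double thresholding mentioned in the abstract, the following is a minimal sketch of the general hysteresis idea, not the authors' implementation. It assumes per-frame event probabilities for a single class; the function name `double_threshold` and the threshold values `hi=0.75` and `lo=0.2` are hypothetical choices made for this example. A contiguous run of frames above the low threshold is kept only if at least one of its frames also exceeds the high threshold, so the detected event length follows the model output rather than a fixed filter window.

```python
def double_threshold(probs, hi=0.75, lo=0.2):
    """Return a 0/1 decision per frame using double (hysteresis) thresholding.

    probs: per-frame probabilities for one event class (assumed input format).
    hi, lo: illustrative threshold values, not taken from the paper.
    """
    n = len(probs)
    active = [0] * n
    i = 0
    while i < n:
        if probs[i] > lo:
            # Find the end of the contiguous run of frames above the low threshold.
            j = i
            while j < n and probs[j] > lo:
                j += 1
            # Keep the run only if some frame in it exceeds the high threshold.
            if any(p > hi for p in probs[i:j]):
                for k in range(i, j):
                    active[k] = 1
            i = j
        else:
            i += 1
    return active


frame_probs = [0.1, 0.3, 0.8, 0.4, 0.1, 0.3, 0.25, 0.1, 0.9]
decisions = double_threshold(frame_probs)
print(decisions)  # [0, 1, 1, 1, 0, 0, 0, 0, 1]
```

In this example the run at frames 5 and 6 is discarded because it never exceeds the high threshold, while the single-frame peak at frame 8 survives; a median filter with a wide fixed window would typically suppress such a short event.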
Pages: 311-315
Page count: 5