Hierarchical Temporal Attention and Competent Teacher Network for Sound Event Detection

Authors
Zhang, Yihang [1]
Liang, Yun [1,2]
Weng, Shitong [1,3]
Lin, Hai [1]
Chen, Liping [1]
Zheng, Shenlong [1,4]
Affiliations
[1] South China Agr Univ, Guangzhou, Peoples R China
[2] Guangzhou Key Lab Intelligent Agr, Guangzhou, Peoples R China
[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[4] Jinan Univ, Guangzhou, Guangdong, Peoples R China
Keywords
sound event detection; hierarchical temporal attention; competent-teacher framework
DOI
10.1109/ICME57554.2024.10688275
Abstract
Sound event detection (SED) identifies occurrences of specific auditory signals, recognizing both the class of each sound event and its temporal location. While the Convolutional Recurrent Neural Network with a mean-teacher framework shows impressive SED performance, its effectiveness is limited by a small receptive field, which leads to inadequate use of global temporal information and imprecise localization of event boundaries. Moreover, existing detectors overlook the intricate interplay between temporal and frequency information, which compromises detection accuracy. In addition, interactions between the student and teacher models are overlooked, so the teacher can convey inaccurate knowledge to the student. To address these challenges, this paper proposes a novel robust detector named HTA-CTD, which incorporates Hierarchical Temporal Attention (HTA) and a Competent Teacher Network (CTN). HTA introduces an adaptive temporal-frequency feature extraction method, while CTN minimizes reliance on strong labels. Experiments on challenging benchmarks show that HTA-CTD outperforms the state-of-the-art detector.
Pages: 6