Sound event localization and detection using element-wise attention gate and asymmetric convolutional recurrent neural networks

Authors
Yan, Lean [1]
Guo, Min [1]
Li, Zhiqiang [1]
Affiliation
[1] Shaanxi Normal Univ, Sch Comp Sci, Minist Educ, Key Lab Modern Teaching Technol, Xian 710119, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sound event localization and detection; asymmetric convolution; context gating; squeeze excitation; element-wise attention gate
DOI
10.3233/AIC-220125
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In sound event localization and detection (SELD), standard square convolution kernels have limited representational ability, and recurrent neural networks usually ignore the differing importance of the elements within an input vector. This paper proposes an element-wise attention gate asymmetric convolutional recurrent neural network (EleAttG-ACRNN) to improve SELD performance. First, a convolutional neural network with context gating and an asymmetric squeeze-excitation residual structure is constructed: asymmetric convolution strengthens the square convolution kernel, squeeze-excitation improves the modeling of interdependence between channels, and context gating weights important features and suppresses irrelevant ones. Next, to improve the expressiveness of the model, the element-wise attention gate is integrated into a bidirectional gated recurrent network, which highlights the importance of different elements within an input vector and further learns temporal context information. Evaluation on the TAU Spatial Sound Events 2019-Ambisonic dataset shows the effectiveness of the proposed method: compared with a CRNN method, it improves SELD performance by up to 0.05 in error rate, 1.7% in F-score, 0.7 degrees in DOA error, and 4.5% in frame recall.
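The abstract gives no equations, so the following is only a minimal NumPy sketch of the two gating ideas it names, assuming their commonly used forms: an element-wise attention gate that computes one weight per input element from the current input and the previous hidden state, and context gating that rescales each feature by a sigmoid gate. The function names, weight names (`W_xa`, `W_ha`, `b_a`), and dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def context_gating(x, W, b):
    """Context gating (assumed form): scale each feature of x by a gate in (0, 1)."""
    return sigmoid(x @ W + b) * x

def ele_att_gate(x_t, h_prev, W_xa, W_ha, b_a):
    """Element-wise attention gate (assumed form).

    Computes one attention weight per element of x_t from the current input
    and the previous hidden state, then modulates x_t with it. The modulated
    input would be fed to the recurrent unit in place of x_t.
    """
    a_t = sigmoid(x_t @ W_xa + h_prev @ W_ha + b_a)  # same shape as x_t
    return a_t * x_t, a_t

# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
x_t = rng.standard_normal(d_in)
h_prev = np.zeros(d_hid)
W_xa = rng.standard_normal((d_in, d_in)) * 0.1
W_ha = rng.standard_normal((d_hid, d_in)) * 0.1
b_a = np.zeros(d_in)

x_mod, a_t = ele_att_gate(x_t, h_prev, W_xa, W_ha, b_a)
x_cg = context_gating(x_t, rng.standard_normal((d_in, d_in)) * 0.1, np.zeros(d_in))
```

Because every gate value lies strictly between 0 and 1, each element of the gated vector keeps its sign and shrinks in magnitude, which is how such gates weight important elements relative to unimportant ones.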
Pages: 147-157
Page count: 11