IFAN: An Icosahedral Feature Attention Network for Sound Source Localization

Cited by: 8
Authors
Zhu, Xin-Cheng [1]
Zhang, Hong [1]
Feng, Hui-Tao [1]
Zhao, Deng-Huang [1]
Zhang, Xiao-Jun [1]
Tao, Zhi [1]
Affiliations
[1] Soochow Univ, Sch Optoelect Sci & Engn, Suzhou 215006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural networks; Convolution; Array signal processing; Reverberation; Spectrogram; Location awareness; Feature extraction; Feature attention; icosahedral convolutional neural network (CNN); microphone arrays; sound source localization (SSL); steered response power with phase transform (SRP-PHAT); EVENT LOCALIZATION; ARRIVAL ESTIMATION; DOA ESTIMATION; DIFFERENCE;
DOI
10.1109/TIM.2023.3348907
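Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract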
Deep-learning-based sound source localization (SSL) techniques currently rely mainly on traditional signal processing methods to generate input features. However, the suitability of these features varies significantly across acoustic environments. To overcome this limitation, this study proposes a new single SSL model, the icosahedral feature attention network (IFAN). IFAN takes as network inputs not only steered response power with phase transform (SRP-PHAT) but also a newly developed steered response power with least mean square (SRP-LMS). It encodes spatial position information into its convolution kernels by introducing icosahedral convolutions. In addition, it uses the sigmoid function to adaptively learn optimal feature weights based on the input acoustic environment, capturing the spatial distribution information of the sound source. In single-source SSL and tracking scenarios, the proposed method outperforms other state-of-the-art models on the localization and tracking (LOCATA) challenge data corpus. Moreover, it is capable of learning complementary information even in acoustic simulations covering a wide range of reverberation conditions. IFAN can thus improve robustness and performance across different environments.
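The abstract's idea of weighting two input features with sigmoid-derived weights can be illustrated with a minimal sketch. This is not the IFAN architecture: the function names, the per-cell gate logits, and the convex-blend formula are all assumptions made for illustration, and the numbers are toy values rather than data from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_features(srp_phat, srp_lms, gate_logits):
    """Blend two power maps element-wise with sigmoid gates.

    gate_logits are hypothetical learned parameters (one per map cell).
    A gate near 1 favors srp_phat; a gate near 0 favors srp_lms.
    """
    if not (len(srp_phat) == len(srp_lms) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, b, z in zip(srp_phat, srp_lms, gate_logits):
        g = sigmoid(z)
        fused.append(g * a + (1.0 - g) * b)
    return fused


# Toy values for illustration only (not data from the paper).
phat_map = [0.2, 0.9, 0.1]
lms_map = [0.4, 0.5, 0.3]
logits = [0.0, 10.0, -10.0]
fused_map = fuse_features(phat_map, lms_map, logits)
```

With a logit of 0 the two features are averaged; strongly positive or negative logits select one feature almost exclusively. In a trained network the logits would presumably be produced from the input itself rather than fixed by hand.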
Pages: 1-13
Number of pages: 13