IFAN: An Icosahedral Feature Attention Network for Sound Source Localization

Cited by: 8
Authors
Zhu, Xin-Cheng [1 ]
Zhang, Hong [1 ]
Feng, Hui-Tao [1 ]
Zhao, Deng-Huang [1 ]
Zhang, Xiao-Jun [1 ]
Tao, Zhi [1 ]
Affiliation
[1] Soochow Univ, Sch Optoelect Sci & Engn, Suzhou 215006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural networks; Convolution; Array signal processing; Reverberation; Spectrogram; Location awareness; Feature extraction; Feature attention; icosahedral convolutional neural network (CNN); microphone arrays; sound source localization (SSL); steered response power with phase transform (SRP-PHAT); EVENT LOCALIZATION; ARRIVAL ESTIMATION; DOA ESTIMATION; DIFFERENCE;
DOI
10.1109/TIM.2023.3348907
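Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code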
0808 ; 0809 ;
Abstract
Deep-learning-based sound source localization (SSL) techniques currently rely mainly on traditional signal processing methods to generate input features. However, the effectiveness of these features varies significantly across acoustic environments. To overcome this limitation, this study proposes a new single SSL model, the icosahedral feature attention network (IFAN). IFAN takes as network inputs not only the steered response power with phase transform (SRP-PHAT) but also a newly developed steered response power with least mean square (SRP-LMS). It encodes spatial position information into the convolution kernels by introducing icosahedral convolutions. In addition, it uses the sigmoid function to adaptively learn optimal feature weights based on the input acoustic environment, capturing the spatial distribution information of the sound source. In single-source SSL and tracking scenarios on the localization and tracking (LOCATA) challenge data corpus, the proposed method outperforms other state-of-the-art models. Moreover, it is capable of learning complementary information even in acoustic simulations involving a wide range of reverberations. The proposed IFAN can thus enhance robustness and performance in different environments.
Pages: 1-13
Page count: 13