Source Localization Using Distributed Microphones in Reverberant Environments Based on Deep Learning and Ray Space Transform

Cited by: 19
Authors
Comanducci, Luca [1 ]
Borra, Federico [1 ]
Bestagini, Paolo [1 ]
Antonacci, Fabio [1 ]
Tubaro, Stefano [1 ]
Sarti, Augusto [1 ]
Affiliations
[1] Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), I-20133 Milan, Italy
Keywords
Transforms; Training; Arrays; Microphone arrays; Reverberation; Acoustic source localization; Deep learning; Generalized cross correlation; Ray space transform (RST); Sound source localization; Time; Networks; Noisy
DOI
10.1109/TASLP.2020.3011256
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this article, we present a methodology for source localization in reverberant environments from Generalized Cross Correlations (GCCs) computed between spatially distributed individual microphones. Reverberation tends to negatively affect localization based on Time Differences of Arrival (TDOAs), which become inaccurate due to spurious peaks in the GCC. We therefore adopt a data-driven approach based on a convolutional neural network which, using the GCCs as input, estimates the source location in two steps. It first computes the Ray Space Transform (RST) from multiple arrays. The RST is a convenient representation of the acoustic rays impinging on the array in a parametric space, called Ray Space. Rays produced by a source appear in the RST as patterns whose position is uniquely related to the source location. The second step estimates the source location through a nonlinear fit, which finds the coordinates that best approximate the RST pattern obtained in the first step. It is worth noting that training can be accomplished on simulated data only, thus relaxing the need to actually deploy microphone arrays in the acoustic scene. The localization accuracy of the proposed technique is similar to that of SRP-PHAT; however, our method demonstrates increased robustness across different distributed-microphone configurations. Moreover, the use of the RST as an intermediate representation allows the network to generalize to data unseen during training.
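The network input described in the abstract consists of GCC features computed between pairs of distributed microphones, typically with phase-transform (PHAT) weighting to sharpen correlation peaks under reverberation. For orientation only, the sketch below shows a standard NumPy computation of GCC-PHAT and the resulting TDOA estimate between two microphone signals; it is a minimal illustration of this feature-extraction stage under stated assumptions, not the authors' implementation, and the function name, signature, and the small regularization constant are assumptions introduced here.

    import numpy as np

    def gcc_phat(sig, ref, fs, max_tau=None, interp=1):
        # Cross-power spectrum with PHAT weighting: keep only the phase,
        # which sharpens the correlation peak in reverberant conditions.
        n = len(sig) + len(ref)                    # zero-pad to avoid circular wrap-around
        SIG = np.fft.rfft(sig, n=n)
        REF = np.fft.rfft(ref, n=n)
        R = SIG * np.conj(REF)
        R /= np.abs(R) + 1e-12                     # small constant avoids division by zero
        cc = np.fft.irfft(R, n=n * interp)         # zero-padded IFFT = band-limited interpolation
        max_shift = n * interp // 2
        if max_tau is not None:
            max_shift = min(int(interp * fs * max_tau), max_shift)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # lags -max_shift ... +max_shift
        tau = (np.argmax(np.abs(cc)) - max_shift) / float(interp * fs)
        return tau, cc                             # tau > 0 when sig lags ref

    if __name__ == "__main__":
        fs = 16000
        rng = np.random.default_rng(0)
        src = rng.standard_normal(fs)              # 1 s of white noise as a toy source signal
        delay = 12                                 # ground-truth inter-microphone delay (samples)
        mic_ref = src
        mic_sig = np.concatenate((np.zeros(delay), src[:-delay]))  # delayed copy at the second mic
        tau, _ = gcc_phat(mic_sig, mic_ref, fs)
        print(round(tau * fs))                     # ~12 samples, matching the imposed delay

In the paper's pipeline, such GCC vectors (one per microphone pair) feed the convolutional network, which maps them to an RST-domain image from which the source coordinates are then recovered by nonlinear fitting.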
Pages: 2238-2251
Page count: 14
Related papers (50 total)
  • [21] SOUND SOURCE LOCALIZATION IN A REVERBERANT ROOM USING HARMONIC BASED MUSIC
    Birnie, Lachlan
    Abhayapala, Thushara D.
    Chen, Hanchi
    Samarasinghe, Prasanga N.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 651 - 655
  • [22] Fuzzy based Algorithm for Acoustic Source Localization using Array of Microphones
    Faraji, Mohammad Mahdi
    Shouraki, Saeed Bagheri
    Iranmehr, Ensieh
    2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 2102 - 2105
  • [23] Noise source localization using deep learning
    Zhou, Jie
    Mi, Binbin
    Xia, Jianghai
    Zhang, Hao
    Liu, Ya
    Chen, Xinhua
    Guan, Bo
    Hong, Yu
    Ma, Yulong
    GEOPHYSICAL JOURNAL INTERNATIONAL, 2024, 238 (01) : 513 - 536
  • [24] Diffuseness Estimation-Based SSTP Detection for Multiple Sound Source Localization in Reverberant Environments
    Zhang, Yu
    Jia, Maoshen
    Gao, Shang
    Wang, Jing
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (08) : 4713 - 4739
  • [26] Binaural Sound Source Localization in Noisy Reverberant Environments Based on Equalization-Cancellation Theory
    Thanh-Duc Chau
    Li, Junfeng
    Akagi, Masato
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2014, E97A (10) : 2011 - 2020
  • [27] Sound Source Localization Based on GCC-PHAT With Diffuseness Mask in Noisy and Reverberant Environments
    Lee, Ran
    Kang, Min-Seok
    Kim, Bo-Hyun
    Park, Kang-Ho
    Lee, Sung Q.
    Park, Hyung-Min
    IEEE ACCESS, 2020, 8 : 7373 - 7382
  • [28] Deep Learning Based Multi-Channel Speaker Recognition in Noisy and Reverberant Environments
    Taherian, Hassan
    Wang, Zhong-Qiu
Wang, DeLiang
    INTERSPEECH 2019, 2019, : 4070 - 4074
  • [29] Recovering speech intelligibility with deep learning and multiple microphones in noisy-reverberant situations for people using cochlear implants
    Gaultier, Clement
    Goehring, Tobias
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2024, 155 (06): : 3833 - 3847