Source Localization Using Distributed Microphones in Reverberant Environments Based on Deep Learning and Ray Space Transform

Cited by: 19
Authors
Comanducci, Luca [1]
Borra, Federico [1]
Bestagini, Paolo [1]
Antonacci, Fabio [1]
Tubaro, Stefano [1]
Sarti, Augusto [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, I-20133 Milan, Italy
Keywords
Transforms; Training; Arrays; Microphone arrays; Reverberation; Acoustic source localization; deep learning; generalized cross correlation; ray space transform (RST); SOUND SOURCE LOCALIZATION; TIME; NETWORKS; NOISY;
DOI
10.1109/TASLP.2020.3011256
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this article, we present a methodology for source localization in reverberant environments from Generalized Cross Correlations (GCCs) computed between spatially distributed individual microphones. Reverberation tends to degrade localization based on Time Differences of Arrival (TDOAs), which become inaccurate because of spurious peaks in the GCC. We therefore adopt a data-driven approach based on a convolutional neural network, which, using the GCCs as input, estimates the source location in two steps. It first computes the Ray Space Transform (RST) from multiple arrays. The RST is a convenient representation of the acoustic rays impinging on the array in a parametric space, called Ray Space. Rays produced by a source appear in the RST as patterns whose position is uniquely related to the source location. The second step estimates the source location through a nonlinear fitting, which finds the coordinates that best approximate the RST pattern obtained in the first step. It is worth noting that training can be accomplished on simulated data only, thus relaxing the need to actually deploy microphone arrays in the acoustic scene. The localization accuracy of the proposed technique is similar to that of SRP-PHAT; however, our method demonstrates increased robustness to different configurations of the distributed microphones. Moreover, the use of the RST as an intermediate representation makes it possible for the network to generalize to data unseen during training.
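As an illustration of the input feature mentioned in the abstract, the following is a minimal sketch of a GCC between two microphone signals and the TDOA read from its peak. The PHAT weighting, the function name `gcc_phat`, and all parameter values are assumptions made for this example; the abstract does not state which weighting or settings the authors use, and the network and RST stages are not shown.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Hypothetical helper: GCC with PHAT weighting (an assumed choice).

    Returns (tau, cc), where tau is the estimated delay of y relative
    to x in seconds (positive if y lags x) and cc is the GCC over the
    lags -max_shift..+max_shift.
    """
    n = len(x) + len(y)                 # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    R = Y * np.conj(X)                  # cross-power spectrum
    R /= np.abs(R) + 1e-12              # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index max_shift corresponds to zero lag
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc

# Toy usage: y is x delayed by `delay` samples (anechoic, noise-free case)
fs = 16000
delay = 5
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(delay), x[:-delay]))
tau, cc = gcc_phat(x, y, fs, max_tau=0.001)
print("estimated delay in samples:", round(tau * fs))
```

In a reverberant room the GCC would typically show additional peaks from reflections, which is the failure mode of peak-picking TDOA estimation that the abstract describes and that motivates feeding the whole GCC to a network instead.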
Pages: 2238 - 2251
Number of pages: 14
Related papers
50 in total
  • [31] Sub-sampling-based 2D localization of an impulsive acoustic source in reverberant environments
    Omer, Muhammad
    Quadeer, Ahmed A.
    Sharawi, Mohammad S.
    Al-Naffouri, Tareq Y.
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2014,
  • [32] Adaptive Time Delay Estimation Using Filter Length Constraints for Source Localization in Reverberant Acoustic Environments
    Salvati, Daniele
    Canazza, Sergio
    IEEE SIGNAL PROCESSING LETTERS, 2013, 20 (05) : 507 - 510
  • [33] Sub-sampling-based 2D localization of an impulsive acoustic source in reverberant environments
    Muhammad Omer
    Ahmed A Quadeer
    Mohammad S Sharawi
    Tareq Y Al-Naffouri
    EURASIP Journal on Advances in Signal Processing, 2014
  • [34] Sound source localization using deep learning models
    Yalta N.
    Nakadai K.
    Ogata T.
    2017, Fuji Technology Press (29) : 37 - 48
  • [35] Reconstruction of the Virtual Microphone Signal Based on the Distributed Ray Space Transform
    Pezzoli, Mirco
    Borra, Federico
    Antonacci, Fabio
    Sarti, Augusto
    Tubaro, Stefano
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1537 - 1541
  • [36] Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization
    Fakhry, Mahmoud
    Svaizer, Piergiorgio
    Omologo, Maurizio
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (07) : 1462 - 1476
  • [37] Two-Stage Monaural Source Separation in Reverberant Room Environments Using Deep Neural Networks
    Sun, Yang
    Wang, Wenwu
    Chambers, Jonathon
    Naqvi, Syed Mohsen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (01) : 125 - 139
  • [38] Semi-Supervised Multiple Source Localization Using Relative Harmonic Coefficients Under Noisy and Reverberant Environments
    Hu, Yonggang
    Samarasinghe, Prasanga N.
    Gannot, Sharon
    Abhayapala, Thushara D.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 (28) : 3108 - 3123
  • [39] HeterPS: Distributed deep learning with reinforcement learning based scheduling in heterogeneous environments
    Liu, Ji
    Wu, Zhihua
    Feng, Danlei
    Zhang, Minxu
    Wu, Xinxuan
    Yao, Xuefeng
    Yu, Dianhai
    Ma, Yanjun
    Zhao, Feng
    Dou, Dejing
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 148 : 106 - 117
  • [40] Multiple source localization using learning-based sparse estimation in deep ocean
    Liu, Yining
    Niu, Haiqiang
    Yang, Sisi
    Li, Zhenglin
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 150 (05) : 3773 - 3786