Source Localization Using Distributed Microphones in Reverberant Environments Based on Deep Learning and Ray Space Transform

Cited by: 19
Authors
Comanducci, Luca [1]
Borra, Federico [1]
Bestagini, Paolo [1]
Antonacci, Fabio [1]
Tubaro, Stefano [1]
Sarti, Augusto [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, I-20133 Milan, Italy
Keywords
Transforms; Training; Arrays; Microphone arrays; Reverberation; Acoustic source localization; deep learning; generalized cross correlation; ray space transform (RST); SOUND SOURCE LOCALIZATION; TIME; NETWORKS; NOISY
DOI
10.1109/TASLP.2020.3011256
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this article we present a methodology for source localization in reverberant environments from Generalized Cross Correlations (GCCs) computed between spatially distributed individual microphones. Reverberation tends to degrade localization based on Time Differences of Arrival (TDOAs), which become inaccurate because of spurious peaks in the GCC. We therefore adopt a data-driven approach based on a convolutional neural network, which, using the GCCs as input, estimates the source location in two steps. The first step computes the Ray Space Transform (RST) from multiple arrays. The RST is a convenient representation of the acoustic rays impinging on the array in a parametric space, called Ray Space. Rays produced by a source appear in the RST as patterns whose position is uniquely related to the source location. The second step estimates the source location through a nonlinear fitting, which finds the coordinates that best approximate the RST pattern obtained in the first step. It is worth noting that training can be accomplished on simulated data only, thus relaxing the need to actually deploy microphone arrays in the acoustic scene. The localization accuracy of the proposed techniques is similar to that of SRP-PHAT; however, our method demonstrates increased robustness to different distributed microphone configurations. Moreover, the use of the RST as an intermediate representation makes it possible for the network to generalize to data unseen during training.
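As an illustration of the network input described in the abstract, the following is a minimal sketch of a GCC between two microphone signals and the TDOA read from its peak. It is not the authors' implementation: the PHAT weighting, the function name `gcc_phat`, and the parameters `fs` and `max_tau` are assumptions chosen for the example, since the abstract does not state which GCC weighting is used.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Return (tdoa_seconds, gcc) for two microphone signals.

    Hypothetical helper for illustration only. PHAT weighting is an
    assumption; the abstract does not specify the GCC variant.
    A positive TDOA means x is delayed with respect to y.
    """
    n = len(x) + len(y)                  # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)        # back to the lag domain

    max_shift = n // 2
    if max_tau is not None:              # optionally limit the searched lags
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that negative lags come first and lag 0 sits in the middle.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs, cc
```

In an anechoic setting the GCC has a single dominant peak at the true delay. In a reverberant room, reflections add the spurious peaks mentioned in the abstract, which is why the peak-picking step above becomes unreliable there.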
Pages: 2238-2251
Number of pages: 14
Related papers
50 in total
  • [41] Sound Source Localization Using Deep Learning for Human-Robot Interaction Under Intelligent Robot Environments
    Jo, Hong-Min
    Kim, Tae-Wan
    Kwak, Keun-Chang
    ELECTRONICS, 2025, 14 (05):
  • [42] Fast and Accurate MEG Source Localization using Deep Learning
    Wang, Hanchen
    Feng, Shihang
    Zhang, Qian
    Kim, Young Jin
    Savukov, Igor
    Yang, Lan
    Lin, Youzuo
    MEDICAL IMAGING 2024: PHYSICS OF MEDICAL IMAGING, PT 1, 2024, 12925
  • [43] SSLIDE: SOUND SOURCE LOCALIZATION FOR INDOORS BASED ON DEEP LEARNING
    Wu, Yifan
    Ayyalasomayajula, Roshan
    Bianco, Michael J.
    Bharadia, Dinesh
    Gerstoft, Peter
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4680 - 4684
  • [44] A deep reinforcement learning based searching method for source localization
    Zhao, Yong
    Chen, Bin
    Wang, XiangHan
    Zhu, Zhengqiu
    Wang, Yiduo
    Cheng, Guangquan
    Wang, Rui
    Wang, Rongxiao
    He, Ming
    Liu, Yu
    INFORMATION SCIENCES, 2022, 588 : 67 - 81
  • [45] A deep reinforcement learning based searching method for source localization
    College of Systems Engineering, National University of Defense Technology, 109 Deya Road, Kaifu District, Changsha City
    Hunan Province, China
    [Unknown]
    [Unknown]
    Inf Sci, 2022, (67-81): : 67 - 81
  • [46] Distributed source DOA estimation based on deep learning networks
    Tian, Quan
    Cai, Ruiyan
    Qiu, Gongrun
    Luo, Yang
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (10) : 7395 - 7403
  • [47] Blind source separation using spatially distributed microphones based on microphone-location dependent source activities
    Kinoshita, Keisuke
    Souden, Mehrez
    Nakatani, Tomohiro
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 822 - 826
  • [48] Source Localization Using Time Reversal in Urban Environments: A Ray Tracing Approach
    Bibb, Darcy A.
    Yun, Zhengqing
    Iskander, Magdy F.
    2014 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM (APSURSI), 2014, : 945 - 946
  • [49] Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning
    Padhy, Ram Prasad
    Ahmad, Shahzad
    Verma, Sachin
    Bakshi, Sambit
    Sa, Pankaj Kumar
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9423 - 9428
  • [50] Sound Source DOA Estimation and Localization in Noisy Reverberant Environments Using Least-Squares Support Vector Machines
    Chen, Huawei
    Ser, Wee
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2011, 63 (03): : 287 - 300