Source Localization Using Distributed Microphones in Reverberant Environments Based on Deep Learning and Ray Space Transform

Cited by: 19

Authors:

Comanducci, Luca [1]

Borra, Federico [1]

Bestagini, Paolo [1]

Antonacci, Fabio [1]

Tubaro, Stefano [1]

Sarti, Augusto [1]

Affiliations:

[1] Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, I-20133 Milan, Italy

Source:

IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2020 / Vol. 28

Keywords:

Transforms; Training; Arrays; Microphone arrays; Reverberation; Acoustic source localization; deep learning; generalized cross correlation; ray space transform (RST); SOUND SOURCE LOCALIZATION; TIME; NETWORKS; NOISY;

DOI:

10.1109/TASLP.2020.3011256

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403

Abstract:

In this article we present a methodology for source localization in reverberant environments from Generalized Cross Correlations (GCCs) computed between spatially distributed individual microphones. Reverberation degrades localization based on Time Differences of Arrival (TDOAs), which become inaccurate because of spurious peaks in the GCC. We therefore adopt a data-driven approach based on a convolutional neural network which, using the GCCs as input, estimates the source location in two steps. The first step computes the Ray Space Transform (RST) from multiple arrays. The RST is a convenient representation of the acoustic rays impinging on the array in a parametric space called the Ray Space. Rays produced by a source appear in the RST as patterns whose position is uniquely related to the source location. The second step estimates the source location through a nonlinear fitting that finds the coordinates best approximating the RST pattern obtained in the first step. Training can be accomplished on simulated data only, which relaxes the need to actually deploy microphone arrays in the acoustic scene. The localization accuracy of the proposed techniques is similar to that of SRP-PHAT; however, our method is more robust to different configurations of the distributed microphones. Moreover, the use of the RST as an intermediate representation makes it possible for the network to generalize to data unseen during training.
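
The abstract takes GCCs between microphone pairs as the network input. As a rough illustration of what such an input looks like, the sketch below computes a GCC with PHAT weighting for one microphone pair and reads a TDOA off its peak. It is a minimal example under stated assumptions, not the authors' implementation: the function name `gcc_phat`, its parameters, the PHAT weighting choice, and the synthetic test signal are all hypothetical and chosen here only for illustration.

```python
import numpy as np


def gcc_phat(x, y, fs, max_tau=None):
    """Hypothetical helper: GCC-PHAT between signals x and y.

    Returns (tdoa_seconds, cc), where cc holds the correlation for
    lags -max_shift..+max_shift. A positive TDOA means x lags y.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically plausible delays (at least 1 sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs, cc


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate in Hz
    rng = np.random.default_rng(0)
    s = rng.standard_normal(1024)
    delay = 5  # samples by which the first microphone lags the second
    x = np.concatenate((np.zeros(delay), s[:-delay]))
    y = s
    tdoa, _ = gcc_phat(x, y, fs)
    print("estimated delay in samples:", round(tdoa * fs))
```

In a reverberant room the correlation would typically show several competing peaks rather than one clean maximum, which is the failure mode of peak-picking TDOA estimation that the abstract describes.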

Pages: 2238-2251

Number of pages: 14
