Sound source localization using deep learning models

Cited by: 84
Authors
Yalta N. [1]
Nakadai K. [2]
Ogata T. [1, 3]
Affiliations
[1] Intermedia Art and Science Department, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo
[2] Honda Research Institute Japan Co., Ltd, Tokyo Institute of Technology, 8-1 Honcho, Wako, 351-0188, Saitama
[3] Faculty of Science and Engineering, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo
Keywords
Deep learning; Deep residual networks; Sound source localization
DOI
10.20965/jrm.2017.p0037
Abstract
This study proposes the use of a deep neural network to localize a sound source using an array of microphones in a reverberant environment. During the last few years, applications based on deep neural networks have performed various tasks such as image classification or speech recognition to levels that exceed even human capabilities. In our study, we employ deep residual networks, which have recently shown remarkable performance in image classification tasks even when the training period is shorter than that of other models. Deep residual networks are used to process audio input similar to multiple signal classification (MUSIC) methods. We show that with end-to-end training and generic preprocessing, the performance of deep residual networks not only surpasses the block level accuracy of linear models on nearly clean environments but also shows robustness to challenging conditions by exploiting the time delay on power information. © 2017, Fuji Technology Press. All rights reserved.
Pages: 37-48
Number of pages: 11
Related papers
50 in total
  • [21] Sound source localization for auditory perception of a humanoid robot using deep neural networks
    Boztas, G.
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (09): : 6801 - 6811
  • [22] UNSUPERVISED ADAPTATION OF DEEP NEURAL NETWORKS FOR SOUND SOURCE LOCALIZATION USING ENTROPY MINIMIZATION
    Takeda, Ryu
    Komatani, Kazunori
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2217 - 2221
  • [24] Binaural source localization using deep learning and head rotation information
    Garcia-Barrios, Guillermo
    Krause, Daniel Aleksander
    Politis, Archontis
    Mesaros, Annamaria
    Gutierrez-Arriola, Juana M.
    Fraile, Ruben
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 36 - 40
  • [25] Acoustic emission source localization in complex pipe structure using multi-task deep learning models
    Zhang, Tonghao
    Xu, Chenxi
    Ozevin, Didem
    ADVANCES IN STRUCTURAL ENGINEERING, 2025, 28 (01) : 23 - 37
  • [26] An Indoor Sound Source Localization Dataset for Machine Learning
    Wu, Tao
    Jiang, Yong
    Li, Nan
    Feng, Tao
    PROCEEDINGS OF 2018 THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE (CSAI 2018) / 2018 THE 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND MULTIMEDIA TECHNOLOGY (ICIMT 2018), 2018, : 28 - 32
  • [27] Sound Source Localization using Stochastic Computing
    Schober, Peter
    Estiri, Seyedeh Newsha
    Aygun, Sercan
    TaheriNejad, Nima
    Najafi, M. Hassan
    2022 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2022,
  • [28] Using sound source localization in a home environment
    Bian, XH
    Abowd, GD
    Rehg, JM
    PERVASIVE COMPUTING, PROCEEDINGS, 2005, 3468 : 19 - 36
  • [29] Indoor Sound Source Localization and Number Estimation Using Infinite Gaussian Mixture Models
    Sun, Longji
    Cheng, Qi
    CONFERENCE RECORD OF THE 2014 FORTY-EIGHTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2014, : 1189 - 1193
  • [30] Sound event localization and detection based on deep learning
    ZHAO Dada
    DING Kai
    QI Xiaogang
    CHEN Yu
    FENG Hailin
    Journal of Systems Engineering and Electronics, 2024, 35 (02) : 294 - 301