DeepEar: Sound Localization With Binaural Microphones

Cited by: 7
Authors
Yang, Qiang [1]
Zheng, Yuanqing [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
Keywords
Binaural localization; multi-source localization; earable computing; NEURAL-NETWORKS; HEAD MOVEMENTS; NOISE; DIFFERENCE; FEATURES; SEARCH
DOI
10.1109/TMC.2022.3222821
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The binaural microphone, a pair of microphones mounted in artificial human-shaped ears, is widely used in hearing aids and spatial audio recording to improve sound quality. Many applications of such devices, such as binaural sound enhancement, depend on finding the direction of a voice. However, sound localization with two microphones remains challenging, especially in multi-source scenarios. Most previous work used microphone arrays to handle multi-source localization, but extra microphones are hard to deploy where space is constrained (e.g., hearing aids). Inspired by the fact that humans have evolved to locate multiple sound sources with only two ears, we propose DeepEar, a binaural microphone-based sound localization system. To this end, we design a multisector-based neural network that locates multiple sound sources simultaneously, where each sector is a discretized region of space covering a range of angles of arrival. DeepEar fuses explicit hand-crafted features and implicit latent sound representations to facilitate sound localization. More importantly, the trained DeepEar model can adapt to new environments with a minimal amount of extra training data. The experimental results show that DeepEar outperforms the state-of-the-art binaural deep learning approach by a large margin in terms of sound detection accuracy and azimuth estimation error.
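The sector idea in the abstract can be illustrated with a small sketch. The abstract does not give the number of sectors, their width, or the network's output layout, so everything below is an assumption made for illustration: eight equal 45-degree sectors over the full azimuth circle, a per-sector presence flag, and a per-sector angle offset. The function names `encode_sectors` and `decode_sectors` are hypothetical and are not taken from the paper.

```python
def encode_sectors(azimuths_deg, num_sectors=8):
    """Map source azimuths (degrees) to per-sector targets.

    Assumed layout (not from the paper): sector i covers
    [i * width, (i + 1) * width), where width = 360 / num_sectors.
    Returns (present, offset): present[i] is 1 if a source falls in
    sector i, and offset[i] is that source's angle inside the sector.
    """
    width = 360.0 / num_sectors
    present = [0] * num_sectors
    offset = [None] * num_sectors
    for az in azimuths_deg:
        az = az % 360.0                      # wrap into [0, 360)
        idx = int(az // width) % num_sectors
        present[idx] = 1
        offset[idx] = az - idx * width       # position within the sector
    return present, offset


def decode_sectors(present, offset, num_sectors=8):
    """Recover azimuths from per-sector targets (inverse of encode_sectors)."""
    width = 360.0 / num_sectors
    return [i * width + offset[i] for i in range(num_sectors) if present[i]]


if __name__ == "__main__":
    present, offset = encode_sectors([30.0, 200.0])
    print(present)                           # one flag per sector
    print(decode_sectors(present, offset))   # azimuths recovered from the targets
```

In this simplified encoding, two sources that fall in the same sector overwrite each other, so the sector width bounds how closely spaced sources can be told apart. How DeepEar itself handles that case is not stated in the abstract.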
Pages: 359-375
Page count: 17
Related papers
50 items in total
  • [41] BINAURAL DISPARITY CUES AVAILABLE TO THE BARN OWL FOR SOUND LOCALIZATION
    MOISEFF, A
    JOURNAL OF COMPARATIVE PHYSIOLOGY A-SENSORY NEURAL AND BEHAVIORAL PHYSIOLOGY, 1989, 164 (05): : 629 - 636
  • [42] A Learning-Based Approach to Robust Binaural Sound Localization
    Youssef, Karim
    Argentieri, Sylvain
    Zarader, Jean-Luc
    2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2013, : 2927 - 2932
  • [43] Robotic Binaural Localization and Separation of Multiple Simultaneous Sound Sources
    Keyrouz, Fakheredine
    2017 11TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2017, : 188 - 195
  • [44] ROBUST FULL-SPHERE BINAURAL SOUND SOURCE LOCALIZATION
    Hammond, Benjamin R.
    Jackson, Philip J. B.
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 86 - 90
  • [45] Binaural ambiguity amplifies visual bias in sound source localization
    Zhou, Yi
    Balderas, Leslie
    Venskytis, Emily Jo
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 144 (06): : 3118 - 3123
  • [46] An artificial neural network for sound localization using binaural cues
    Datum, MS
    Palmieri, F
    Moiseff, A
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (01): : 372 - 383
  • [47] Binaural Sound Source Distance Estimation and Localization for a Moving Listener
    Krause, Daniel Aleksander
    Garcia-Barrios, Guillermo
    Politis, Archontis
    Mesaros, Annamaria
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 996 - 1011
  • [48] Binaural Sound Source Localization Based on Convolutional Neural Network
    Zhou, Lin
    Ma, Kangyu
    Wang, Lijie
    Chen, Ying
    Tang, Yibin
    CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 60 (02): : 545 - 557
  • [49] Bionic Binaural Sound Localization Circuit Design Based On Memristor
    Song, Guohui
    Wang, Xiaoping
    Chen, Zhanfei
    39TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION, YAC 2024, 2024, : 608 - 613
  • [50] BINAURAL SOUND ANALYSIS AND SPATIAL LOCALIZATION FOR THE VISUALLY IMPAIRED PEOPLE
    Balan, Oana
    Moldoveanu, Alin
    Moldoveanu, Florica
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCES ON INTERFACES AND HUMAN COMPUTER INTERACTION 2015, GAME AND ENTERTAINMENT TECHNOLOGIES 2015 AND COMPUTER GRAPHICS, VISUALIZATION, COMPUTER VISION AND IMAGE PROCESSING 2015, 2015, : 331 - 335