Localizing concurrent sound sources with binaural microphones: A simulation study

Authors
Orr, Jakeh [1]
Ebel, William [1]
Gai, Yan [1]
Affiliations
[1] St Louis Univ, Sch Sci & Engn, St Louis, MO 63105 USA
Funding
National Science Foundation (United States)
Keywords
Sound localization; HRTF; ITD; ILD; Robotics; Sparseness; Reverberations; SOURCE LOCALIZATION; HEAD; IDENTIFICATION; FREQUENCY; PINNAE
DOI
10.1016/j.heares.2023.108884
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
The human auditory system can localize multiple sound sources using time, intensity, and frequency cues in the sound received by the two ears. Being able to spatially segregate the sources helps perception in a challenging condition when multiple sounds coexist. This study used model simulations to explore an algorithm for localizing multiple sources in azimuth with binaural (i.e., two) microphones. The algorithm relies on the "sparseness" property of daily signals in the time-frequency domain, and sound coming from different locations carrying unique spatial features will form clusters. Based on an interaural normalization procedure, the model generated spiral patterns for sound sources in the frontal hemifield. The model itself was created using broadband noise for better accuracy, because speech typically has sporadic energy at high frequencies. The model at an arbitrary frequency can be used to predict locations of speech and music that occurred alone or concurrently, and a classification algorithm was applied to measure the localization error. Under anechoic conditions, averaged errors in azimuth increased from 4.5 degrees to 19 degrees with RMS errors ranging from 6.4 degrees to 26.7 degrees as model frequency increased from 300 to 3000 Hz. The low-frequency model performance using short speech sound was notably better than the generalized cross-correlation model. Two types of room reverberations were then introduced to simulate difficult listening conditions. Model performance under reverberation was more resilient at low frequencies than at high frequencies. Overall, our study presented a spiral model for rapidly predicting horizontal locations of concurrent sound that is suitable for real-world scenarios.
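For orientation, the cross-correlation baseline named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' spiral model, and it is not necessarily the exact generalized cross-correlation variant they compared against: it uses plain, unweighted cross-correlation (the simplest member of that family) and handles a single source only. The function names, the 0.18 m microphone spacing, the 0.8 ms lag limit, and the free-field sine law used to turn a time difference into an azimuth are all assumptions made for illustration.

```python
import math
import random


def estimate_itd(left, right, fs, max_itd_s=8e-4):
    """Return the lag (in seconds) that maximizes the cross-correlation.

    A positive value means the right channel lags the left one, i.e. the
    sound reached the left microphone first. max_itd_s is an assumed limit.
    """
    max_lag = int(round(max_itd_s * fs))
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        start = max(0, -lag)
        stop = min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


def itd_to_azimuth_deg(itd_s, mic_distance_m=0.18, speed_of_sound=343.0):
    """Map an ITD to azimuth with the free-field sine law (assumed model).

    With the sign convention above, a positive angle points toward the
    side of the left microphone.
    """
    s = itd_s * speed_of_sound / mic_distance_m
    s = max(-1.0, min(1.0, s))  # clamp so asin stays defined
    return math.degrees(math.asin(s))


# Synthetic check: broadband noise delayed by 12 samples in the right channel.
random.seed(0)
fs = 48000
delay = 12
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]
left = noise
right = [0.0] * delay + noise[:-delay]

itd = estimate_itd(left, right, fs)
azimuth = itd_to_azimuth_deg(itd)
print(f"ITD = {itd * 1e6:.0f} microseconds, azimuth = {azimuth:.1f} degrees")
```

A single correlation peak of this kind yields one direction per analysis window, which is the limitation that motivates clustering spatial cues across time-frequency bins when several sources are active at once.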
Pages: 16