Improved sound source localization in horizontal plane for binaural robot audition

Cited by: 24
Authors
Kim, Ui-Hyun [1]
Nakadai, Kazuhiro [2]
Okuno, Hiroshi G. [1]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Dept Intelligence Sci & Technol, Kyoto, Japan
[2] Honda Res Inst Japan Co Ltd, Wako, Saitama, Japan
Funding
Japan Society for the Promotion of Science (JSPS)
Keywords
Intelligent robot audition; Human-robot interaction; Sound source localization; Front-back disambiguation; FRONT-BACK CONFUSION; TIME-DELAY; RESOLUTION;
DOI
10.1007/s10489-014-0544-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with binaural robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a binaural robot platform: 1) diffraction of sound waves with multipath interference caused by the contours of the robot head, which affects localization accuracy, and 2) front-back ambiguity, which limits the localization range to half the horizontal space. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using two dummy heads equipped with small or large pinnae showed that localization errors were reduced by 8.91° (3.21° vs. 12.12°) on average with the new time delay factor compared with the conventional GCC-PHAT method and that the success rate for front-back disambiguation using the pinnae amplification effect was 29.76% (93.46% vs. 72.02%) better on average over the entire azimuth than with a conventional head related transfer function (HRTF)-based method.
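For orientation, the conventional GCC-PHAT baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' improved method: it uses the plain free-field delay model, and the function names, the microphone spacing of 0.18 m, and the speed of sound of 343 m/s are assumptions chosen for the example. Because the final step uses an arcsine, the estimate is confined to -90° to +90°, which is the front-back ambiguity the abstract describes.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref` with GCC-PHAT.

    A positive result means `sig` arrives later than `ref`.
    """
    n = len(sig) + len(ref)  # zero-pad so the correlation is not circular
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least one sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def free_field_azimuth(tau, mic_distance=0.18, speed_of_sound=343.0):
    """Map a delay to an azimuth in degrees under a free-field assumption.

    mic_distance and speed_of_sound are illustrative values, not taken
    from the paper. The result always lies in [-90, 90] degrees.
    """
    s = np.clip(speed_of_sound * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    left = rng.standard_normal(4096)
    # Simulate the right channel as the left channel delayed by 5 samples.
    right = np.concatenate((np.zeros(5), left[:-5]))
    tau = gcc_phat(right, left, fs, max_tau=0.18 / 343.0)
    print("delay in samples:", round(tau * fs))
    print("azimuth in degrees:", free_field_azimuth(tau))
```

The free-field mapping ignores diffraction around the head, which is the source of the first problem listed in the abstract; the paper's spherical-head time delay factor is not reproduced here.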
Pages: 63-74
Page count: 12
Related papers
50 in total
  • [1] Improved sound source localization in horizontal plane for binaural robot audition
    Ui-Hyun Kim
    Kazuhiro Nakadai
    Hiroshi G. Okuno
    Applied Intelligence, 2015, 42 : 63 - 74
  • [2] Improvement of Speaker Localization by Considering Multipath Interference of Sound Wave for Binaural Robot Audition
    Kim, Ui-Hyun
    Mizumoto, Takeshi
    Ogata, Tetsuya
    Okuno, Hiroshi G.
    2011 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, 2011, : 2910 - 2915
  • [3] Interactive Sound Source Localization using Robot Audition for Tablet Devices
    Nakamura, Keisuke
    Sinapayen, Lana
    Nakadai, Kazuhiro
    2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015, : 6137 - 6142
  • [4] BINAURAL SOUND LOCATION IN THE HORIZONTAL PLANE
    DUYFF, JW
    VANGEMERT, AGM
    SCHMIDT, PH
    ACTA PHYSIOLOGICA ET PHARMACOLOGICA NEERLANDICA, 1950, 1 (04): : 540 - 561
  • [5] Horizontal plane sound source localization and auditory enhancement
    Smith, Jan R.
    Lombard, Wesley R.
    Shaba, Moses N.
    WORK-A JOURNAL OF PREVENTION ASSESSMENT & REHABILITATION, 2012, 41 : 1994 - 2000
  • [6] A linear phase unwrapping method for binaural sound source localization on a robot
    Li, DF
    Levinson, SE
    2002 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS I-IV, PROCEEDINGS, 2002, : 19 - 23
  • [7] Binaural localization for a mobile sound source
    Kumon M.
    Uozumi S.
    Journal of Biomechanical Science and Engineering, 2011, 6 (01): : 26 - 39
  • [8] Enhanced Robot Speech Recognition Using Biomimetic Binaural Sound Source Localization
    Davila-Chacon, Jorge
    Liu, Jindong
    Wermter, Stefan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (01) : 138 - 150
  • [9] Applying scattering theory to robot audition system: Robust sound source localization and extraction
    Nakadai, K
    Matsuura, D
    Okuno, HG
    Kitano, H
    IROS 2003: PROCEEDINGS OF THE 2003 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2003, : 1147 - 1152
  • [10] SOUND SOURCE SEPARATION OF MOVING SPEAKERS FOR ROBOT AUDITION
    Nakadai, Kazuhiro
    Nakajima, Hirofumi
    Hasegawa, Yuji
    Tsujino, Hiroshi
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3685 - 3688