Speech-based Human-Robot Interaction Robust to Acoustic Reflections in Real Environment

Cited by: 0
Authors
Gomez, Randy [1 ]
Inoue, Koji
Nakamura, Keisuke [1 ]
Mizumoto, Takeshi [1 ]
Nakadai, Kazuhiro [1 ]
Institutions
[1] Honda Res Inst Japan Ltd Co, Wako, Saitama, Japan
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Acoustic reflection inside an enclosed environment is detrimental to human-robot interaction. Reflections may manifest as phantom sources emanating from unknown directions: a single speaker can falsely appear as multiple speakers to the robot audition system, impeding the robot's ability to associate a speech command with the actual speaker. Moreover, reflection smears the original speech signal through reverberation, which degrades speech recognition and understanding performance. Conventional robot audition schemes that rely purely on acoustic and spatial information are very sensitive to acoustic reflection, which ultimately leads to failure in human-robot interaction. We propose a method for human-robot interaction that is robust to the effects of acoustic reflection. First, visual information is utilized: a head-tracking scheme reinforces the acoustic information with the visual presence of a prospective user. Second, we employ a model-based sound event identification scheme to scrutinize whether the acoustic information is likely to be speech or non-speech. Using all of the gathered information, we create a simple rule construct to discriminate the original source (the actual speaker) from phantom sources (reflections). The sources identified as phantoms are then used to estimate the unwanted smearing, which is suppressed via speech enhancement. Experiments conducted in a human-robot interaction setting show that the proposed method outperforms the conventional method.
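The rule construct described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each localized source carries a direction of arrival, a flag indicating co-location with a visually tracked head, and a speech-likelihood score from the sound event identifier; all names and the threshold value are hypothetical.

```python
# Hypothetical sketch of the rule construct: a localized source is accepted
# as the actual speaker only if (a) it is reinforced by the visual presence
# of a tracked head and (b) it is identified as speech; all other sources
# are treated as phantoms (reflections) and routed to smearing estimation.
from dataclasses import dataclass

@dataclass
class Source:
    azimuth_deg: float        # direction of arrival from sound localization
    near_tracked_head: bool   # reinforced by visual head tracking
    speech_likelihood: float  # model-based sound event identification score

def classify_sources(sources, speech_threshold=0.5):
    """Split localized sources into actual speaker(s) and phantoms."""
    actual, phantoms = [], []
    for s in sources:
        if s.near_tracked_head and s.speech_likelihood >= speech_threshold:
            actual.append(s)
        else:
            phantoms.append(s)  # reflections: used to estimate smearing
    return actual, phantoms

sources = [
    Source(10.0, True, 0.92),    # real speaker in front of the robot
    Source(135.0, False, 0.88),  # wall reflection: speech-like, but no head
    Source(-60.0, False, 0.10),  # non-speech noise source
]
actual, phantoms = classify_sources(sources)
```

In this sketch the wall reflection is rejected despite being speech-like, because it has no accompanying visual presence; the phantom list would then feed the speech-enhancement stage that suppresses the reverberant smearing.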
Pages: 1367-1373
Number of pages: 7