Predicting Who Will Be the Next Speaker and When in Multi-party Meetings

Authors
Ishii, Ryo
Otsuka, Kazuhiro
Kumano, Shiro
Yamato, Junji
Source
NTT Technical Review | 2015, Vol. 13, No. 07
Keywords
Speech processing; Video conferencing
DOI
Not available
Abstract
An understanding of the mechanisms involved in face-to-face communication will contribute to designing advanced video conferencing and dialogue systems. Turn-taking, the situation where the speaker changes, is especially important in multi-party meetings. For smooth turn-taking, the participants need to predict who will start speaking next and to consider a strategy for achieving good timing to speak next. Our aim is to clarify the kinds of behavior that contribute to smooth turn-taking and to develop a model for predicting the next speaker and the start time of the next speaker’s utterance in multi-party meetings. We focus on gaze behavior and respiration near the end of the current speaker’s utterance. We empirically demonstrate that gaze behavior and respiration have a relation to the next speaker and the start timing of the next utterance in multi-party meetings. A prediction model based on the results reveals that gaze behavior and respiration contribute to predicting the next speaker and the timing of the next utterance. © 2015 Nippon Telegraph and Telephone Corp. All rights reserved.
Related papers
  • [1] Predicting Next Speaker Based on Head Movement in Multi-Party Meetings
    Ishii, Ryo
    Kumano, Shiro
    Otsuka, Kazuhiro
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 2319-2323
  • [2] Multimodal Fusion using Respiration and Gaze for Predicting Next Speaker in Multi-Party Meetings
    Ishii, Ryo
    Kumano, Shiro
    Otsuka, Kazuhiro
    ICMI'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2015: 99-106
  • [3] Predicting Next Speaker and Timing from Gaze Transition Patterns in Multi-Party Meetings
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Matsuda, Masafumi
    Yamato, Junji
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013: 79-86
  • [4] Analyzing Mouth-Opening Transition Pattern for Predicting Next Speaker in Multi-party Meetings
    Ishii, Ryo
    Kumano, Shiro
    Otsuka, Kazuhiro
    ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016: 209-216
  • [5] Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Higashinaka, Ryuichiro
    Tomita, Junji
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2019, 3 (04)
  • [6] Speaker diarization for multi-party meetings using acoustic fusion
    Anguera, X
    Wooters, C
    Hernando, J
    2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2005: 426-431
  • [7] Estimating Dominance in Multi-Party Meetings Using Speaker Diarization
    Hung, Hayley
    Huang, Yan
    Friedland, Gerald
    Gatica-Perez, Daniel
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 847-860
  • [8] A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
    Yu, Fan
    Du, Zhihao
    Zhang, Shiliang
    Lin, Yuxiao
    Xie, Lei
    INTERSPEECH 2022, 2022: 560-564
  • [9] Prediction of Who Will Be the Next Speaker and When Using Gaze Behavior in Multiparty Meetings
    Ishii, Ryo
    Otsuka, Kazuhiro
    Kumano, Shiro
    Yamato, Junji
    ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2016, 6 (01)
  • [10] On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings
    Kolar, Jachym
    Shriberg, Elizabeth
    Liu, Yang
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 2014-2017