Predicting Who Will Be the Next Speaker and When in Multi-party Meetings

Cited by: 0

Authors:

Ishii, Ryo

Otsuka, Kazuhiro

Kumano, Shiro

Yamato, Junji

Affiliations:

Source:

NTT Technical Review | 2015 / Vol. 13 / No. 07

Keywords:

Speech processing - Video conferencing;

DOI:

Not available

Chinese Library Classification (CLC) number:

Subject classification number:

Abstract:

An understanding of the mechanisms involved in face-to-face communication will contribute to designing advanced video conferencing and dialogue systems. Turn-taking, the situation where the speaker changes, is especially important in multi-party meetings. For smooth turn-taking, the participants need to predict who will start speaking next and to consider a strategy for achieving good timing to speak next. Our aim is to clarify the kinds of behavior that contribute to smooth turn-taking and to develop a model for predicting the next speaker and the start time of the next speaker’s utterance in multi-party meetings. We focus on gaze behavior and respiration near the end of the current speaker’s utterance. We empirically demonstrate that gaze behavior and respiration have a relation to the next speaker and the start timing of the next utterance in multi-party meetings. A prediction model based on the results reveals that gaze behavior and respiration contribute to predicting the next speaker and the timing of the next utterance. © 2015 Nippon Telegraph and Telephone Corp. All rights reserved.
