MULTIMODAL ADDRESSEE DETECTION IN MULTIPARTY DIALOGUE SYSTEMS

Cited by: 0

Authors:

Tsai, T. J. [1]

Stolcke, Andreas [2]

Slaney, Malcolm [2]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Microsoft Res, Mountain View, CA USA

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

addressee detection; dialog system; multimodality; multiparty; human-human-computer

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Addressee detection answers the question, "Are you talking to me?" When multiple users interact with a dialogue system, it is important to know when a user is speaking to the computer and when he or she is speaking to another person. We approach this problem from a multimodal perspective, using lexical, acoustic, visual, dialogue-state, and beamforming information. Using data from a multiparty dialogue system, we demonstrate the benefit of using multiple modalities over using a single modality. We also assess the relative importance of the various modalities in predicting the addressee. In our experiments, we find that acoustic features are by far the most important, that ASR and system-state information are useful, and that visual and beamforming features provide little additional benefit. Our study suggests that acoustic, lexical, and system-state information form an effective, economical combination of modalities for addressee detection.


Pages: 2314-2318

Number of pages: 5

