MULTIMODAL ADDRESSEE DETECTION IN MULTIPARTY DIALOGUE SYSTEMS

Cited by: 0
Authors
Tsai, T. J. [1]
Stolcke, Andreas [2]
Slaney, Malcolm [2]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Microsoft Res, Mountain View, CA USA
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
addressee detection; dialog system; multimodality; multiparty; human-human-computer
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Addressee detection answers the question, "Are you talking to me?" When multiple users interact with a dialogue system, it is important to know when a user is speaking to the computer and when he or she is speaking to another person. We approach this problem from a multimodal perspective, using lexical, acoustic, visual, dialogue state, and beamforming information. Using data from a multiparty dialogue system, we demonstrate the benefit of using multiple modalities over using a single modality. We also assess the relative importance of the various modalities in predicting the addressee. In our experiments, we find that acoustic features are by far the most important, that ASR and system state information are useful, and that visual and beamforming features provide little additional benefit. Our study suggests that acoustic, lexical, and system state information are an effective, economical combination of modalities to use in addressee detection.
Pages: 2314 - 2318
Page count: 5
Related papers
50 in total
  • [41] Evaluation and usability of multimodal spoken language dialogue systems
    Dybkjær, L
    Bernsen, NO
    Minker, W
    SPEECH COMMUNICATION, 2004, 43 (1-2) : 33 - 54
  • [42] Blending speech and visual input in multimodal dialogue systems
    Perakakis, Manolis
    Toutoudakis, Michail
    Potamianos, Alexandros
    2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, 2006, : 142 - +
  • [43] Transformer-Based Multimodal Infusion Dialogue Systems
    Liu, Bo
    He, Lejian
    Liu, Yafei
    Yu, Tianyao
    Xiang, Yuejia
    Zhu, Li
    Ruan, Weijian
    ELECTRONICS, 2022, 11 (20)
  • [44] Multimodal dialogue systems: A case study for interactive TV
    Ibrahim, A
    Johansson, P
    UNIVERSAL ACCESS: THEORETICAL PERSPECTIVES, PRACTICE, AND EXPERIENCE, 2003, 2615 : 209 - 218
  • [45] ACTIVE SPEAKER DETECTION IN HUMAN MACHINE MULTIPARTY DIALOGUE USING VISUAL PROSODY INFORMATION
    Haider, Fasih
    Campbell, Nick
    Luz, Saturnino
    2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 1207 - 1211
  • [46] A Multiparty Multimodal Architecture for Realtime Turntaking
    Thorisson, Kristinn R.
    Gislason, Olafur
    Jonsdottir, Gudny Ragna
    Thorisson, Hrafn Th.
    INTELLIGENT VIRTUAL AGENTS, IVA 2010, 2010, 6356 : 350 - 356
  • [47] DIALOGUE IN PUBLICISTIC TEXT: RELATIONSHIP BETWEEN SENDER, ADDRESSEE AND OVERADRESSEE
    Shalimova, Ekaterina V.
    THEORETICAL AND PRACTICAL ISSUES OF JOURNALISM, 2014, (04) : 81 - 87
  • [48] Multiparty interaction: a multimodal perspective on relevance
    Norris, S
    DISCOURSE STUDIES, 2006, 8 (03) : 401 - 421
  • [49] Improving Voice Activity Detection for Multimodal Movie Dialogue Corpus
    Kosaka, Tetsuo
    Suga, Ikumi
    Inoue, Masashi
    2018 IEEE 7TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE 2018), 2018, : 481 - 484
  • [50] Addressee Detection for Dialog Systems Using Temporal and Spectral Dimensions of Speaking Style
    Shriberg, Elizabeth
    Stolcke, Andreas
    Ravuri, Suman
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2558 - 2562