MULTIMODAL ADDRESSEE DETECTION IN MULTIPARTY DIALOGUE SYSTEMS

Cited by: 0
Authors
Tsai, T. J. [1 ]
Stolcke, Andreas [2 ]
Slaney, Malcolm [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Microsoft Res, Mountain View, CA USA
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015
Keywords
addressee detection; dialog system; multimodality; multiparty; human-human-computer;
DOI: not available
Chinese Library Classification: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract
Addressee detection answers the question, "Are you talking to me?" When multiple users interact with a dialogue system, it is important to know when a user is speaking to the computer and when he or she is speaking to another person. We approach this problem from a multimodal perspective, using lexical, acoustic, visual, dialog state, and beam-forming information. Using data from a multiparty dialogue system, we demonstrate the benefit of using multiple modalities over using a single modality. We also assess the relative importance of the various modalities in predicting the addressee. In our experiments, we find that acoustic features are by far the most important, that ASR and system-state information are useful, and that visual and beamforming features provide little additional benefit. Our study suggests that acoustic, lexical, and system state information are an effective, economical combination of modalities to use in addressee detection.
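The abstract describes late fusion of per-modality evidence, with acoustic cues found to be by far the most predictive. A minimal illustrative sketch of such a fusion step is below; the weights, feature scores, and function name are hypothetical, not the authors' implementation, and serve only to show how modality scores might be combined into a computer-directed decision.

```python
# Illustrative late-fusion addressee detector.
# NOTE: weights and inputs are hypothetical, not from the paper; the
# relative weight ordering only mirrors the paper's finding that
# acoustic features dominate, followed by lexical and system-state cues.

def fuse_scores(acoustic, lexical, system_state,
                weights=(0.6, 0.25, 0.15), threshold=0.5):
    """Combine per-modality confidence scores in [0, 1] into a single
    computer-directed score via a weighted sum (late fusion), and
    threshold it into a binary addressee decision."""
    w_a, w_l, w_s = weights
    score = w_a * acoustic + w_l * lexical + w_s * system_state
    return score, score >= threshold

# Example: energetic, command-like speech right after a system prompt
# scores high on all three modalities, so it is labeled computer-directed.
score, is_computer_directed = fuse_scores(acoustic=0.9, lexical=0.7,
                                          system_state=0.8)
```

In a real system each score would come from a trained per-modality classifier; the sketch only shows the fusion arithmetic.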
Pages: 2314-2318 (5 pages)