Streaming Intended Query Detection using E2E Modeling for Continued Conversation

Cited by: 1
Authors
Chang, Shuo-yiin [1]
Prakash, Guru [1]
Wu, Zelin [1]
Liang, Qiao [1]
Sainath, Tara N. [1]
Li, Bo [1]
Stambler, Adam [1]
Upadhyay, Shyam [1]
Faruqui, Manaal [1]
Strohman, Trevor [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
Source
Keywords
end-to-end models; continued conversation
DOI
10.21437/Interspeech.2022-569
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In voice-enabled applications, a predetermined hotword is usually used to activate a device so that it attends to the query. However, having to say a hotword before every query imposes a cognitive burden in continued conversations. To avoid repeating a hotword, we propose a streaming end-to-end (E2E) intended query detector that identifies utterances directed towards the device and filters out utterances that are not. The proposed approach incorporates the intended query detector into the E2E model, which already folds the different components of the speech recognition pipeline into one neural network. Modeling speech decoding and intended query detection jointly in the E2E model also allows an intended query to be declared quickly from early partial recognition results, which is important for reducing latency and keeping the system responsive. We demonstrate that the proposed E2E approach yields a 22% relative improvement in equal error rate (EER) for detection accuracy and a 600 ms latency improvement compared with an independent intended query detector. In our experiment, the proposed model detects whether the user is talking to the device with an 8.7% EER within a median latency of 1.4 seconds after the user starts speaking.
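The abstract reports its results as an equal error rate (EER) and a relative improvement in EER. As a reading aid only, the following is a minimal Python sketch of how these two quantities are commonly computed for a binary detector. It is not the authors' evaluation code. The function names, the convention that label 1 means "directed at the device", and all scores and rates in the usage lines are hypothetical illustrations and do not come from the paper.

```python
from typing import Sequence, Tuple


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Approximate the EER of a detector from its scores.

    Assumed convention (hypothetical, not from the paper): label 1 means the
    utterance is directed at the device, label 0 means it is not, and a
    higher score means "more likely directed at the device".
    Returns (eer, threshold) at the threshold where the false accept rate
    and the false reject rate are closest to each other.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    best = None  # (gap, eer, threshold)
    for t in sorted(set(scores)):
        # False accepts: non-directed utterances scored at or above t.
        far = sum(s >= t for s in negatives) / len(negatives)
        # False rejects: directed utterances scored below t.
        frr = sum(s < t for s in positives) / len(positives)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def relative_improvement(baseline: float, new: float) -> float:
    """Relative reduction of an error rate, e.g. 0.22 for a 22% improvement."""
    return (baseline - new) / baseline


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
eer, threshold = equal_error_rate(toy_scores, toy_labels)

# Hypothetical rates (not the paper's): going from 10.0% to 7.8% EER
# is a relative improvement of about 22%.
rel = relative_improvement(0.10, 0.078)
```

On the toy data the false accept and false reject rates meet at a threshold of 0.6. With real data the two rates rarely coincide exactly, which is why the sketch averages them at the closest threshold.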
Pages: 1826-1830
Page count: 5