Direct Posterior Confidence for Out-of-Vocabulary Spoken Term Detection

Cited by: 6
Authors
Wang, Dong [1 ]
King, Simon [2 ]
Frankel, Joe [2 ]
Vipperla, Ravichander [3 ]
Evans, Nicholas [3 ]
Troncy, Raphael [3 ]
Affiliations
[1] Nuance Commun, Aachen, Germany
[2] Univ Edinburgh, CSTR, Edinburgh EH8 9AB, Midlothian, Scotland
[3] EURECOM, Multimedia Dept, F-06904 Sophia Antipolis, France
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Speech recognition; spontaneous speech search; spoken term detection; DISCRIMINATIVE UTTERANCE VERIFICATION; SPEECH RECOGNITION; MINIMUM VERIFICATION; OOV QUERIES; WORD; PHONE; SYSTEM; ERROR; RETRIEVAL; SEARCH;
DOI
10.1145/2328967.2328969
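Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract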
Spoken term detection (STD) is a key technology for spoken information retrieval. Compared to conventional speech transcription and keyword spotting, STD is an open-vocabulary task and has to address out-of-vocabulary (OOV) terms. Approaches based on subword units, for example phones, are widely used to solve the OOV issue; however, performance on OOV terms is still substantially inferior to that on in-vocabulary (INV) terms. The performance degradation on OOV terms can be attributed to a multitude of factors. One particular factor we address in this article is the unreliable confidence estimation caused by weak acoustic and language modeling due to the absence of OOV terms in the training corpora. We propose a direct posterior confidence derived from a discriminative model, such as a multilayer perceptron (MLP). The new confidence considers a wide-range acoustic context, which is usually important for speech recognition and retrieval; moreover, it localizes on detected speech segments and therefore avoids the impact of long-span word context, which is usually unreliable for OOV term detection. In this article, we first develop an extensive discussion of the modeling weakness problem associated with OOV terms, and then propose our approach to address this problem based on direct posterior confidence. Our experiments, carried out on spontaneous and conversational multiparty meeting speech, demonstrate that the proposed technique provides a significant improvement in STD performance compared to conventional lattice-based confidence, in particular for OOV terms. Furthermore, the new confidence estimation approach is fused with other advanced techniques for OOV treatment, such as stochastic pronunciation modeling and discriminative confidence normalization. This leads to an integrated solution for OOV term detection that results in a large performance improvement.
Pages: 34
Related papers
50 in total
  • [21] Variable-Span Out-of-Vocabulary Named Entity Detection
    Chen, Wei
    Ananthakrishnan, Sankaranarayanan
    Prasad, Rohit
    Natarajan, Prem
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3728 - 3732
  • [22] Recurrent out-of-vocabulary word detection based on distribution of features
    Asami, Taichi
    Masumura, Ryo
    Aono, Yushi
    Shinoda, Koichi
    COMPUTER SPEECH AND LANGUAGE, 2019, 58 : 247 - 259
  • [23] Exploiting Out-of-Vocabulary Words for Out-of-Domain Detection in Dialog Systems
    Ryu, Seonghan
    Lee, Donghyeon
    Lee, Gary Geunbae
    Kim, Kyungduk
    Noh, Hyungjong
    2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 165 - +
  • [24] SUB-WORD MODELING OF OUT OF VOCABULARY WORDS IN SPOKEN TERM DETECTION
    Szoke, Igor
    Burget, Lukas
    Cernocky, Jan
    Fapso, Michal
    2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008, : 273 - 276
  • [25] Finding Recurrent Out-of-Vocabulary Words
    Qin, Long
    Rudnicky, Alexander
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2241 - 2245
  • [26] Improving out-of-vocabulary name resolution
    Palmer, DD
    Ostendorf, M
    COMPUTER SPEECH AND LANGUAGE, 2005, 19 (01): : 107 - 128
  • [27] ZERO RESOURCE GRAPH-BASED CONFIDENCE ESTIMATION FOR OPEN VOCABULARY SPOKEN TERM DETECTION
    Norouzian, Atta
    Rose, Richard
    Ghalehjegh, Sina Hamidi
    Jansen, Aren
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8292 - 8296
  • [28] Formalization of rules for the detection of plurals in Spanish in the case of out-of-vocabulary units
    Nazar, Rogelio
    Galdames, Amparo
    LINGUAMATICA, 2019, 11 (02): : 17 - 32
  • [29] USING SYNTACTIC AND CONFUSION NETWORK STRUCTURE FOR OUT-OF-VOCABULARY WORD DETECTION
    Marin, Alex
    Kwiatkowski, Tom
    Ostendorf, Mari
    Zettlemoyer, Luke
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 159 - 164
  • [30] OUT-OF-VOCABULARY WORD DETECTION IN A SPEECH-TO-SPEECH TRANSLATION SYSTEM
    Kuo, Hong-Kwang
    Kislal, Ellen Eide
    Mangu, Lidia
    Soltau, Hagen
    Beran, Tomas
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,