Direct Posterior Confidence for Out-of-Vocabulary Spoken Term Detection

Cited by: 6
Authors
Wang, Dong [1]
King, Simon [2]
Frankel, Joe [2]
Vipperla, Ravichander [3]
Evans, Nicholas [3]
Troncy, Raphael [3]
Affiliations
[1] Nuance Commun, Aachen, Germany
[2] Univ Edinburgh, CSTR, Edinburgh EH8 9AB, Midlothian, Scotland
[3] EURECOM, Multimedia Dept, F-06904 Sophia Antipolis, France
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Speech recognition; spontaneous speech search; spoken term detection; DISCRIMINATIVE UTTERANCE VERIFICATION; SPEECH RECOGNITION; MINIMUM VERIFICATION; OOV QUERIES; WORD; PHONE; SYSTEM; ERROR; RETRIEVAL; SEARCH;
DOI
10.1145/2328967.2328969
CLC number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Spoken term detection (STD) is a key technology for spoken information retrieval. Compared to conventional speech transcription and keyword spotting, STD is an open-vocabulary task and has to address out-of-vocabulary (OOV) terms. Approaches based on subword units, for example phones, are widely used to solve the OOV issue; however, performance on OOV terms is still substantially inferior to that on in-vocabulary (INV) terms. The performance degradation on OOV terms can be attributed to a multitude of factors. One particular factor we address in this article is the unreliable confidence estimation caused by weak acoustic and language modeling due to the absence of OOV terms in the training corpora. We propose a direct posterior confidence derived from a discriminative model, such as a multilayer perceptron (MLP). The new confidence considers a wide-range acoustic context, which is usually important for speech recognition and retrieval; moreover, it localizes on detected speech segments and therefore avoids the impact of long-span word context, which is usually unreliable for OOV term detection. In this article, we first develop an extensive discussion of the modeling weakness problem associated with OOV terms, and then propose our approach to address this problem based on direct posterior confidence. Our experiments, carried out on spontaneous and conversational multiparty meeting speech, demonstrate that the proposed technique provides a significant improvement in STD performance compared to conventional lattice-based confidence, in particular for OOV terms. Furthermore, the new confidence estimation approach is fused with other advanced techniques for OOV treatment, such as stochastic pronunciation modeling and discriminative confidence normalization. This leads to an integrated solution for OOV term detection that results in a large performance improvement.
Pages: 34