Direct Posterior Confidence for Out-of-Vocabulary Spoken Term Detection

Cited by: 6
Authors
Wang, Dong [1]
King, Simon [2]
Frankel, Joe [2]
Vipperla, Ravichander [3]
Evans, Nicholas [3]
Troncy, Raphael [3]
Affiliations
[1] Nuance Commun, Aachen, Germany
[2] Univ Edinburgh, CSTR, Edinburgh EH8 9AB, Midlothian, Scotland
[3] EURECOM, Multimedia Dept, F-06904 Sophia Antipolis, France
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Speech recognition; spontaneous speech search; spoken term detection; DISCRIMINATIVE UTTERANCE VERIFICATION; SPEECH RECOGNITION; MINIMUM VERIFICATION; OOV QUERIES; WORD; PHONE; SYSTEM; ERROR; RETRIEVAL; SEARCH;
DOI
10.1145/2328967.2328969
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Spoken term detection (STD) is a key technology for spoken information retrieval. Compared with conventional speech transcription and keyword spotting, STD is an open-vocabulary task and has to address out-of-vocabulary (OOV) terms. Approaches based on subword units, for example phones, are widely used to solve the OOV issue; however, performance on OOV terms is still substantially inferior to that on in-vocabulary (INV) terms. The performance degradation on OOV terms can be attributed to a multitude of factors. One particular factor we address in this article is the unreliable confidence estimation caused by weak acoustic and language modeling due to the absence of OOV terms in the training corpora. We propose a direct posterior confidence derived from a discriminative model, such as a multilayer perceptron (MLP). The new confidence considers a wide-range acoustic context, which is usually important for speech recognition and retrieval; moreover, it localizes on detected speech segments and therefore avoids the impact of long-span word context, which is usually unreliable for OOV term detection. In this article, we first develop an extensive discussion of the modeling weakness problem associated with OOV terms, and then propose our approach to address this problem based on direct posterior confidence. Our experiments, carried out on spontaneous and conversational multiparty meeting speech, demonstrate that the proposed technique provides a significant improvement in STD performance compared with conventional lattice-based confidence, in particular for OOV terms. Furthermore, the new confidence estimation approach is fused with other advanced techniques for OOV treatment, such as stochastic pronunciation modeling and discriminative confidence normalization. This leads to an integrated solution for OOV term detection that results in a large performance improvement.
Pages: 34
Related papers
50 in total
  • [31] Enhancing Out-of-Vocabulary Estimation with Subword Attention
    Patel, Raj
    Domeniconi, Carlotta
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3592 - 3601
  • [32] Lexicon Stratification for Translating Out-of-Vocabulary Words
    Tsvetkov, Yulia
    Dyer, Chris
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 125 - 131
  • [33] English Out-of-Vocabulary Lexical Evaluation Task
    Wang, Han
    Wang, Ye
    Zhang, Xinxiang
    Lu, Mi
    Choe, Yoonsuck
    Cao, Jingjing
    2019 IEEE 17TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2019, : 1468 - 1472
  • [34] EFFICIENT OUT-OF-VOCABULARY TERM DETECTION BY N-GRAM ARRAY INDICES WITH DISTANCE FROM A SYLLABLE LATTICE
    Iwami, Keisuke
    Fujii, Yasuhisa
    Yamamoto, Kazumasa
    Nakagawa, Seiichi
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5664 - 5667
  • [35] Chinese Word Segmentation and Out-Of-Vocabulary Words Detection Using Suffix Array
    Ji Wenyan
    Peng Tao
    Zuo Wanli
    He Fengling
    Zhu Huifeng
    WISM: 2009 INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, : 56 - 60
  • [36] Combined low level and high level features for Out-Of-Vocabulary Word detection
    Lecouteux, Benjamin
    Linares, Georges
    Favre, Benoit
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1199 - +
  • [37] Contextual Verification for Open Vocabulary Spoken Term Detection
    Schneider, Daniel
    Mertens, Timo
    Larson, Martha
    Koehler, Joachim
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 697 - 700
  • [38] An approach for efficient open vocabulary spoken term detection
    Norouzian, Atta
    Rose, Richard
    SPEECH COMMUNICATION, 2014, 57 : 50 - 62
  • [39] COPING WITH OUT-OF-VOCABULARY WORDS: OPEN VERSUS HUGE VOCABULARY ASR
    Gerosa, Matteo
    Federico, Marcello
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4313 - 4316
  • [40] Handling Out-Of-Vocabulary Problem in Hangeul Word Embeddings
    Kwon, Ohjoon
    Kim, Dohyun
    Lee, Soo-Ryeon
    Choi, Junyoung
    Lee, SangKeun
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 3213 - 3221