Can You Repeat That? Using Word Repetition to Improve Spoken Term Detection

Cited by: 0
Authors
Wintrode, Jonathan [1]
Khudanpur, Sanjeev [1]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
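Keywords
LANGUAGE MODEL
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract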
We aim to improve spoken term detection performance by incorporating contextual information beyond traditional N-gram language models. Rather than taking a broad view of topic context in spoken documents, we focus on the phenomenon of word repetition within single documents, because word co-occurrence statistics vary across corpora. We show that, given the detection of one instance of a term, we are more likely to find additional instances of that term in the same document. We leverage this burstiness of keywords by taking the most confident keyword hypothesis in each document and interpolating its score with those of lower-scoring hits. We then develop a principled approach to selecting interpolation weights using only the ASR training data. Using this re-weighting approach, we demonstrate consistent improvement in term detection performance across all five languages in the BABEL program.
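The re-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the linear interpolation below, the `alpha` weight, the function name `rescore_hits`, and the `(doc_id, term, score)` hit format are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def rescore_hits(hits, alpha=0.3):
    """Interpolate each hit's score with the best score for the same
    term in the same document (hypothetical formulation).

    hits  : list of (doc_id, term, score) tuples, score in [0, 1]
    alpha : assumed interpolation weight; 0 leaves scores unchanged
    """
    # Find the most confident hypothesis per (document, term) pair.
    best = defaultdict(float)
    for doc_id, term, score in hits:
        key = (doc_id, term)
        best[key] = max(best[key], score)

    # Pull each hit toward the top score for its document and term.
    # The top hit itself is unchanged, since it interpolates with itself.
    return [
        (doc_id, term, (1 - alpha) * score + alpha * best[(doc_id, term)])
        for doc_id, term, score in hits
    ]


example_hits = [
    ("doc1", "water", 0.9),  # confident hit
    ("doc1", "water", 0.2),  # weak hit in the same document
    ("doc2", "water", 0.4),  # only hit in another document
]
rescored = rescore_hits(example_hits, alpha=0.5)
```

In this sketch the weak hit in `doc1` is boosted because a confident hit for the same term exists in that document, while the lone hit in `doc2` keeps its score. The abstract states that the actual weights are selected using only the ASR training data; the fixed `alpha` here is a placeholder for that step.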
Pages: 1316-1325
Page count: 10