Predicting search term reliability for spoken term detection systems

Cited by: 2
Authors
Torbati, Amir [1]
Picone, Joseph [1]
Affiliations
[1] Temple Univ, Dept Elect & Comp Engn, 1947 North 12th St, Philadelphia, PA 19027 USA
Funding
U.S. National Science Foundation;
Keywords
Spoken term detection; Voice keyword search; Information retrieval;
DOI
10.1007/s10772-013-9197-1
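Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes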
0808 ; 0809 ;
Abstract
Spoken term detection is an extension of text-based searching that allows users to type keywords and search audio files containing recordings of spoken language. Performance depends on many external factors, such as the acoustic channel, language, pronunciation variations, and acoustic confusability of the search term. Unlike text-based searches, the likelihoods of false alarms and misses for specific search terms, which we refer to as reliability, play a significant role in the overall perception of the usability of the system. In this paper, we present a system that predicts the reliability of a search term based on its inherent confusability. Our approach integrates predictors of reliability that are based on both acoustic and phonetic features. These predictors are trained using an analysis of recognition errors produced by a state-of-the-art spoken term detection system operating on the Fisher Corpus. This work represents the first large-scale attempt to predict the success of a keyword search term from only its spelling. We explore the complex relationship between phonetic and acoustic properties of search terms. We show that a 76% correlation between the predicted error rate and the actual measured error rate can be achieved, and that the remaining confusability is due to other acoustic modeling issues that cannot be derived from a search term's spelling.
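The evaluation figure quoted in the abstract is a correlation between predicted and measured per-term error rates. As a rough illustration only, the sketch below computes a Pearson correlation over paired error rates. The abstract does not say which correlation measure or which data the authors used, so the choice of Pearson, the function name `pearson_correlation`, and the sample values are all assumptions made for this example.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical per-term error rates (illustrative values, not from the paper).
predicted_error = [0.10, 0.25, 0.40, 0.55]
measured_error = [0.15, 0.20, 0.50, 0.45]
print(pearson_correlation(predicted_error, measured_error))
```

A value near 1 would mean the predictor ranks and scales search terms much as the measured error rates do; a value near 0 would mean the prediction carries little information about a term's actual reliability.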
Pages: 1-9
Number of pages: 9
Related papers
50 in total
  • [31] Incorporating visual information for spoken term detection
    Kalantari, Shahram
    Dean, David
    Sridharan, Sridha
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 558 - 562
  • [32] Stochastic Pronunciation Modelling for Spoken Term Detection
    Wang, Dong
    King, Simon
    Frankel, Joe
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2091 - 2094
  • [33] An Empirical Study of Multilingual Spoken Term Detection
    Ma, Zejun
    Wang, Xiaorui
    Xu, Bo
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1932 - 1935
  • [34] Web Derived Pronunciations for Spoken Term Detection
    Can, Dogan
    Cooper, Erica
    Ghoshal, Arnab
    Jansche, Martin
    Khudanpur, Sanjeev
    Ramabhadran, Bhuvana
    Riley, Michael
    Saraclar, Murat
    Sethy, Abhinav
    Ulinski, Morgan
    White, Christopher
    PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2009, : 83 - 90
  • [35] Model-Based Unsupervised Spoken Term Detection with Spoken Queries
    Chan, Chun-an
    Lee, Lin-shan
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (07): : 1330 - 1342
  • [36] Speech Signal Based Broad Phoneme Classification and Search Space Reduction for Spoken Term Detection
    Deekshitha, G.
    Mary, Leena
    PROCEEDINGS OF TENCON 2018 - 2018 IEEE REGION 10 CONFERENCE, 2018, : 1601 - 1606
  • [37] FACILITATING OPEN VOCABULARY SPOKEN TERM DETECTION USING A MULTIPLE PASS HYBRID SEARCH ALGORITHM
    Norouzian, Atta
    Rose, Richard
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 5169 - 5172
  • [38] Introduction of False Detection Control Parameters in Spoken Term Detection
    Furuya, Yuto
    Natori, Satoshi
    Nishizaki, Hiromitsu
    Sekiguchi, Yoshihiro
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [39] APPLICATION OF OUT-OF-LANGUAGE DETECTION TO SPOKEN TERM DETECTION
    Motlicek, Petr
    Valente, Fabio
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5098 - 5101
  • [40] Using Conversational Word Bursts in Spoken Term Detection
    Chiu, Justin
    Rudnicky, Alexander
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2246 - 2250