Predicting search term reliability for spoken term detection systems

Cited by: 2
Authors
Torbati, Amir [1]
Picone, Joseph [1]
Affiliations
[1] Temple Univ, Dept Elect & Comp Engn, 1947 North 12th St, Philadelphia, PA 19027 USA
Funding
National Science Foundation (USA);
Keywords
Spoken term detection; Voice keyword search; Information retrieval;
DOI
10.1007/s10772-013-9197-1
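Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract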
Spoken term detection is an extension of text-based searching that allows users to type keywords and search audio files containing recordings of spoken language. Performance depends on many external factors, such as the acoustic channel, language, pronunciation variations, and acoustic confusability of the search term. Unlike text-based searches, the likelihoods of false alarms and misses for specific search terms, which we refer to as reliability, play a significant role in the overall perception of the usability of the system. In this paper, we present a system that predicts the reliability of a search term based on its inherent confusability. Our approach integrates predictors of reliability that are based on both acoustic and phonetic features. These predictors are trained using an analysis of recognition errors produced by a state-of-the-art spoken term detection system operating on the Fisher Corpus. This work represents the first large-scale attempt to predict the success of a keyword search term from only its spelling. We explore the complex relationship between phonetic and acoustic properties of search terms. We show that a 76% correlation between the predicted error rate and the actual measured error rate can be achieved, and that the remaining confusability is due to other acoustic modeling issues that cannot be derived from a search term's spelling.
Pages: 1-9
Page count: 9