Predicting search term reliability for spoken term detection systems

Cited by: 2
Authors
Torbati, Amir [1]
Picone, Joseph [1]
Affiliation
[1] Temple Univ, Dept Elect & Comp Engn, 1947 North 12th St, Philadelphia, PA 19027 USA
Funding
National Science Foundation (USA)
Keywords
Spoken term detection; Voice keyword search; Information retrieval
DOI
10.1007/s10772-013-9197-1
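Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract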
Spoken term detection is an extension of text-based searching that allows users to type keywords and search audio files containing recordings of spoken language. Performance depends on many external factors, such as the acoustic channel, language, pronunciation variations, and acoustic confusability of the search term. Unlike in text-based searches, the likelihoods of false alarms and misses for specific search terms, which we refer to as reliability, play a significant role in the overall perception of the usability of the system. In this paper, we present a system that predicts the reliability of a search term based on its inherent confusability. Our approach integrates predictors of reliability that are based on both acoustic and phonetic features. These predictors are trained using an analysis of recognition errors produced by a state-of-the-art spoken term detection system operating on the Fisher Corpus. This work represents the first large-scale attempt to predict the success of a keyword search term from only its spelling. We explore the complex relationship between phonetic and acoustic properties of search terms. We show that a 76% correlation between the predicted error rate and the actual measured error rate can be achieved, and that the remaining confusability is due to other acoustic modeling issues that cannot be derived from a search term's spelling.
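As a rough illustration of the evaluation figure quoted above, the sketch below computes a correlation between predicted and measured per-term error rates. It assumes the Pearson coefficient, which the abstract does not specify; the function name and the toy values are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson correlation between two equal-length sequences of numbers.

    Assumption: the abstract reports "correlation" without naming the
    coefficient; Pearson's r is used here purely for illustration.
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(var_p * var_m)


if __name__ == "__main__":
    # Hypothetical per-term error rates, for demonstration only.
    predicted_error = [0.10, 0.25, 0.40, 0.55, 0.70]
    measured_error = [0.15, 0.20, 0.45, 0.50, 0.80]
    print(f"r = {pearson_correlation(predicted_error, measured_error):.3f}")
```

A value near 1 would mean the predictor ranks and scales search terms much as the measured error rates do; the reported 76% corresponds to r = 0.76 under this reading.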
Pages: 1-9
Page count: 9