Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework

Authors
Vavruska, Jan [1]
Svec, Jan [1]
Ircing, Pavel [1]
Affiliations
[1] Univ W Bohemia, Dept Cybernet, Plzen 30614, Czech Republic
Source
TEXT, SPEECH, AND DIALOGUE, TSD 2013 | 2013 / Vol. 8082
Keywords
spoken term detection; finite state automata;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
The paper presents a technique for phonetic spoken term detection in large audio archives. It is designed within the framework of weighted finite-state transducers (WFSTs) and uses the relatively recent notion of factor automata, which we have enhanced with score normalization and with a technique for systematic query expansion. The expansion allows for phone deletions and substitutions and thereby compensates for frequent pronunciation imperfections and for systematic phoneme interchanges that occur during ASR decoding. The experiments presented in the paper show that the new WFST-based method outperforms the baseline system in both search performance and speed. Finally, the paper discusses the issues with the proposed techniques that need to be addressed before they can be applied in real-life tasks.
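The query expansion mentioned in the abstract can be illustrated with a small sketch. This is not the authors' WFST implementation: it enumerates variants of a phone sequence in plain Python instead of composing transducers, and the confusion table `SUBSTITUTIONS`, the flat `DELETION_COST`, and the function name `expand_query` are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical confusion table: phone -> {alternative phone: penalty}.
# In a real system such penalties would likely be estimated from ASR errors.
SUBSTITUTIONS = {
    "s": {"z": 1.0},
    "t": {"d": 1.0},
}
DELETION_COST = 2.0  # assumed flat penalty for dropping one phone


def expand_query(phones, max_cost=2.0):
    """Return {variant: cost} for every variant whose total cost <= max_cost."""
    # Per position, the choices are: keep the phone, substitute it, or delete it.
    options = []
    for p in phones:
        opts = [((p,), 0.0)]
        opts += [((q,), c) for q, c in SUBSTITUTIONS.get(p, {}).items()]
        opts.append(((), DELETION_COST))
        options.append(opts)

    variants = {}
    for combo in product(*options):
        cost = sum(c for _, c in combo)
        if cost > max_cost:
            continue
        variant = tuple(ph for out, _ in combo for ph in out)
        if not variant:
            continue  # skip the empty query
        # Keep only the cheapest way of producing each variant.
        if variant not in variants or cost < variants[variant]:
            variants[variant] = cost
    return variants


if __name__ == "__main__":
    for variant, cost in sorted(expand_query(["s", "a", "t"]).items(), key=lambda kv: kv[1]):
        print(" ".join(variant), cost)
```

Exhaustive enumeration grows exponentially with query length, which is one reason a transducer-based formulation, where the expansion is represented compactly and combined with the index by composition, is attractive for this task.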
Pages: 402 - 409
Number of pages: 8
Related papers
50 entries in total
  • [21] USING RHYTHMIC FEATURES FOR JAPANESE SPOKEN TERM DETECTION
    Kanda, Naoyuki
    Takeda, Ryu
    Obuchi, Yasunari
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 170 - 175
  • [22] EFFICIENT SPOKEN TERM DETECTION USING CONFUSION NETWORKS
    Mangu, Lidia
    Kingsbury, Brian
    Soltau, Hagen
    Kuo, Hong-Kwang
    Picheny, Michael
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [23] Spoken Term Detection Using Visual Spectrogram Matching
    Lazic, Nevena
    Aarabi, Parham
    ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008, : 637 - 642
  • [24] Selection of Best Match Keyword using Spoken Term Detection for Spoken Document Indexing
    Domoto, Kentaro
    Utsuro, Takehito
    Sawada, Naoki
    Nishizaki, Hiromitsu
    2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [25] INTEGRATING RECOGNITION AND RETRIEVAL WITH USER FEEDBACK: A NEW FRAMEWORK FOR SPOKEN TERM DETECTION
    Lee, Hung-yi
    Lee, Lin-shan
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5290 - 5293
  • [26] Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task
    Akiba, Tomoyosi
    Nishizaki, Hiromitsu
    Aikawa, Kiyoaki
    Kawahara, Tatsuya
    Matsui, Tomoko
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 3527 - 3534
  • [27] Emotion Classification of Spontaneous Speech Using Spoken Term Detection
    Nishizaki, Hiromitsu
    Watase, Kei
    2017 IEEE 6TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE), 2017,
  • [28] Evaluation of Fast Spoken Term Detection Using a Suffix Array
    Katsurada, Kouichi
    Sawada, Shinta
    Teshima, Shigeki
    Iribe, Yurie
    Nitta, Tsuneo
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 916 - 919
  • [29] Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection
    Madhavi, Maulik C.
    Patil, Hemant A.
    COMPUTER SPEECH AND LANGUAGE, 2019, 58 : 175 - 202
  • [30] Speaker Verification and Spoken Language Identification using a Generalized I-vector Framework with Phonetic Tokenizations and Tandem Features
    Li, Ming
    Liu, Wenbo
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1120 - 1124