Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework

Cited by: 0
Authors
Vavruska, Jan [1]
Svec, Jan [1]
Ircing, Pavel [1]
Affiliations
[1] Univ W Bohemia, Dept Cybernet, Plzen 30614, Czech Republic
Source
TEXT, SPEECH, AND DIALOGUE, TSD 2013 | 2013 / Volume 8082
Keywords
spoken term detection; finite state automata;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The paper presents a technique for phonetic spoken term detection in large audio archives. It is designed within the framework of weighted finite-state transducers (WFSTs) and uses the recently developed notion of factor automata, which we have enhanced with score normalization and a technique for systematic query expansion. The expansion allows for phone deletions and substitutions, and thereby compensates for frequent pronunciation imperfections and for systematic phoneme interchanges that occur during ASR decoding. The experiments presented in the paper show that the new WFST-based method outperforms the baseline system in both search performance and speed. Finally, the paper discusses the issues with the proposed techniques that must be addressed before they can be applied to real-life tasks.
Pages: 402-409
Page count: 8
Related papers
50 in total
  • [1] Cross database audio visual speech adaptation for phonetic spoken term detection
    Kalantari, Shahram
    Dean, David
    Sridharan, Sridha
    COMPUTER SPEECH AND LANGUAGE, 2017, 44 : 1 - 21
  • [2] SPOKEN TERM DETECTION USING FAST PHONETIC DECODING
    Wallace, Roy
    Vogt, Robbie
    Sridharan, Sridha
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4881 - 4884
  • [3] System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
    Josef Psutka
    Jan Švec
    Josef V Psutka
    Jan Vaněk
    Aleš Pražák
    Luboš Šmídl
    Pavel Ircing
    EURASIP Journal on Audio, Speech, and Music Processing, 2011
  • [4] System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
    Psutka, Josef
    Svec, Jan
    Psutka, Josef V.
    Vanek, Jan
    Prazak, Ales
    Smidl, Lubos
    Ircing, Pavel
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2011, : 1 - 11
  • [5] OPTIMISING FIGURE OF MERIT FOR PHONETIC SPOKEN TERM DETECTION
    Wallace, Roy
    Vogt, Robbie
    Baker, Brendan
    Sridharan, Sridha
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5298 - 5301
  • [6] Query-By-Example Spoken Term Detection Using Phonetic Posteriorgram Templates
    Hazen, Timothy J.
    Shen, Wade
    White, Christopher
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 421 - +
  • [7] Query-by-example spoken term detection based on phonetic posteriorgram
    Song, Beili
    Zhang, Wei-Qiang
    Cai, Meng
    Liu, Jia
    Johnson, Michael T.
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT AND COMPUTING TECHNOLOGY, 2015, 30 : 1255 - 1260
  • [8] Audio Mining: Unsupervised Spoken Term Detection over an Audio Database
    Kumar, Kishore R.
    Sarkar, Sandipan
    Rengaswamy, Pradeep
    Rao, K. Sreenivasa
    2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 514 - 518
  • [9] Discriminative Optimization of the Figure of Merit for Phonetic Spoken Term Detection
    Wallace, Roy
    Baker, Brendan
    Vogt, Robbie
    Sridharan, Sridha
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (06): : 1677 - 1687
  • [10] Using textual information from LVCSR transcripts for phonetic-based spoken term detection
    Dubois, Corentin
    Charlet, Delphine
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4961 - 4964