Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework

Cited by: 0

Authors:

Vavruska, Jan [1]

Svec, Jan [1]

Ircing, Pavel [1]

Affiliations:

[1] Univ W Bohemia, Dept Cybernet, Plzen 30614, Czech Republic

Source:

TEXT, SPEECH, AND DIALOGUE, TSD 2013 | 2013 / Vol. 8082

Keywords:

spoken term detection; finite state automata;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The paper presents a technique for phonetic spoken term detection in large audio archives. It is designed within the framework of weighted finite-state transducers (WFSTs) and uses the recently developed notion of factor automata, which we have enhanced with score normalization and a technique for systematic query expansion. The expansion allows for phone deletions and substitutions and thus compensates for frequent pronunciation imperfections and systematic phoneme interchanges that occur during ASR decoding. The experiments presented in the paper show that the new WFST-based method outperforms the baseline system in both search performance and speed. Finally, the paper discusses the issues with the proposed techniques that need to be addressed before they can be applied to real-life tasks.
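
The query expansion idea described in the abstract can be illustrated with a minimal Python sketch that enumerates variants of a phone sequence with at most one deletion or substitution. This is not the paper's WFST implementation: the function name `expand_query`, the confusion table, and the penalty values are hypothetical and chosen only for illustration. In a WFST setting, such edits would more likely be encoded as weighted arcs of an edit transducer than enumerated explicitly.

```python
from typing import Dict, List, Tuple

# Hypothetical confusion table (assumption): phone -> phones it may be replaced with.
CONFUSIONS: Dict[str, List[str]] = {"s": ["z"], "t": ["d"]}

# Assumed penalties; the record above does not give the actual weights.
DELETION_PENALTY = 1.0
SUBSTITUTION_PENALTY = 0.5


def expand_query(phones: List[str]) -> Dict[Tuple[str, ...], float]:
    """Map each variant of `phones` with at most one edit to its lowest penalty."""
    variants: Dict[Tuple[str, ...], float] = {tuple(phones): 0.0}

    def add(variant: Tuple[str, ...], penalty: float) -> None:
        # Skip empty variants and keep only the cheapest way to reach a variant.
        if variant and penalty < variants.get(variant, float("inf")):
            variants[variant] = penalty

    for i, phone in enumerate(phones):
        # Delete the phone at position i.
        add(tuple(phones[:i] + phones[i + 1:]), DELETION_PENALTY)
        # Substitute the phone at position i with each listed confusion.
        for alt in CONFUSIONS.get(phone, []):
            add(tuple(phones[:i] + [alt] + phones[i + 1:]), SUBSTITUTION_PENALTY)
    return variants


if __name__ == "__main__":
    for variant, penalty in sorted(expand_query(["t", "e", "s", "t"]).items()):
        print(" ".join(variant), penalty)
```

Each variant could then be searched in the index, with its penalty subtracted from the detection score; how the penalties interact with the score normalization mentioned in the abstract is not specified in this record.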


Pages: 402-409

Page count: 8

50 items in total

[1] Cross database audio visual speech adaptation for phonetic spoken term detection
Kalantari, Shahram
Dean, David
Sridharan, Sridha
COMPUTER SPEECH AND LANGUAGE, 2017, 44 : 1 - 21
[2] SPOKEN TERM DETECTION USING FAST PHONETIC DECODING
Wallace, Roy
Vogt, Robbie
Sridharan, Sridha
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 4881 - 4884
[3] System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
Josef Psutka
Jan Švec
Josef V Psutka
Jan Vaněk
Aleš Pražák
Luboš Šmídl
Pavel Ircing
EURASIP Journal on Audio, Speech, and Music Processing, 2011
[4] System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
Psutka, Josef
Svec, Jan
Psutka, Josef V.
Vanek, Jan
Prazak, Ales
Smidl, Lubos
Ircing, Pavel
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2011, : 1 - 11
[5] OPTIMISING FIGURE OF MERIT FOR PHONETIC SPOKEN TERM DETECTION
Wallace, Roy
Vogt, Robbie
Baker, Brendan
Sridharan, Sridha
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5298 - 5301
[6] Query-By-Example Spoken Term Detection Using Phonetic Posteriorgram Templates
Hazen, Timothy J.
Shen, Wade
White, Christopher
2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 421 - +
[7] Query-by-example spoken term detection based on phonetic posteriorgram
Song, Beili
Zhang, Wei-Qiang
Cai, Meng
Liu, Jia
Johnson, Michael T.
PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT AND COMPUTING TECHNOLOGY, 2015, 30 : 1255 - 1260
[8] Audio Mining: Unsupervised Spoken Term Detection over an Audio Database
Kumar, Kishore R.
Sarkar, Sandipan
Rengaswamy, Pradeep
Rao, K. Sreenivasa
2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 514 - 518
[9] Discriminative Optimization of the Figure of Merit for Phonetic Spoken Term Detection
Wallace, Roy
Baker, Brendan
Vogt, Robbie
Sridharan, Sridha
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (06): 1677 - 1687
[10] Using textual information from LVCSR transcripts for phonetic-based spoken term detection
Dubois, Corentin
Charlet, Delphine
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4961 - 4964
