Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework

Cited by: 0

Authors:

Vavruska, Jan [1]

Svec, Jan [1]

Ircing, Pavel [1]

Affiliations:

[1] Univ W Bohemia, Dept Cybernet, Plzen 30614, Czech Republic

Source:

TEXT, SPEECH, AND DIALOGUE, TSD 2013 | 2013, Vol. 8082

Keywords:

spoken term detection; finite state automata

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The paper presents a technique for phonetic spoken term detection in large audio archives. It is designed within the framework of weighted finite-state transducers (WFSTs) and uses the relatively recent notion of factor automata, which we have enhanced with score normalization and a technique for systematic query expansion. The expansion allows for phone deletions and substitutions and consequently compensates for frequent pronunciation imperfections and systematic phoneme interchanges occurring during the ASR decoding process. The experiments presented in the paper show that the new WFST-based method outperforms the baseline system in terms of both search performance and speed. Finally, the paper discusses the issues with the proposed techniques that need to be addressed before they can be applied in real-life tasks.

Pages: 402-409

Page count: 8

50 items in total

[21] USING RHYTHMIC FEATURES FOR JAPANESE SPOKEN TERM DETECTION
Kanda, Naoyuki
Takeda, Ryu
Obuchi, Yasunari
2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012: 170 - 175
[22] EFFICIENT SPOKEN TERM DETECTION USING CONFUSION NETWORKS
Mangu, Lidia
Kingsbury, Brian
Soltau, Hagen
Kuo, Hong-Kwang
Picheny, Michael
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[23] Spoken Term Detection Using Visual Spectrogram Matching
Lazic, Nevena
Aarabi, Parham
ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008: 637 - 642
[24] Selection of Best Match Keyword using Spoken Term Detection for Spoken Document Indexing
Domoto, Kentaro
Utsuro, Takehito
Sawada, Naoki
Nishizaki, Hiromitsu
2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
[25] INTEGRATING RECOGNITION AND RETRIEVAL WITH USER FEEDBACK: A NEW FRAMEWORK FOR SPOKEN TERM DETECTION
Lee, Hung-yi
Lee, Lin-shan
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010: 5290 - 5293
[26] Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task
Akiba, Tomoyosi
Nishizaki, Hiromitsu
Aikawa, Kiyoaki
Kawahara, Tatsuya
Matsui, Tomoko
LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012: 3527 - 3534
[27] Emotion Classification of Spontaneous Speech Using Spoken Term Detection
Nishizaki, Hiromitsu
Watase, Kei
2017 IEEE 6TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE), 2017,
[28] Evaluation of Fast Spoken Term Detection Using a Suffix Array
Katsurada, Kouichi
Sawada, Shinta
Teshima, Shigeki
Iribe, Yurie
Nitta, Tsuneo
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 916 - 919
[29] Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection
Madhavi, Maulik C.
Patil, Hemant A.
COMPUTER SPEECH AND LANGUAGE, 2019, 58 : 175 - 202
[30] Speaker Verification and Spoken Language Identification using a Generalized I-vector Framework with Phonetic Tokenizations and Tandem Features
Li, Ming
Liu, Wenbo
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014: 1120 - 1124