USING RHYTHMIC FEATURES FOR JAPANESE SPOKEN TERM DETECTION

Cited by: 0

Authors:

Kanda, Naoyuki [1]

Takeda, Ryu [1]

Obuchi, Yasunari [1]

Affiliations:

[1] Hitachi Ltd, Cent Res Lab, Kokubunji, Tokyo 1858601, Japan

Source:

2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012) | 2012

Keywords:

spoken term detection; spoken document retrieval; utterance verification; speech recognition;

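DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809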

Abstract:

A new rescoring method for spoken term detection (STD) is proposed. Phoneme-based close-matching techniques have been used because of their ability to detect out-of-vocabulary (OOV) queries. To improve the accuracy of phoneme-based techniques, rescoring techniques have been used to accurately re-rank the results from phoneme-based close-matching; however, conventional rescoring techniques based on an utterance verification model still produce many false detection results. To further improve the accuracy, in this study, several features representing the "naturalness" (or "abnormality") of duration of phonemes/syllables in detected candidates of a keyword are proposed. These features are incorporated into a conventional rescoring technique using logistic regression. Experimental results with a 604-hour Japanese speech corpus indicated that combining the rhythmic features achieved a further relative error reduction of 8.9% compared to a conventional rescoring technique.
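The abstract gives no formulas, so the following is only a minimal sketch of the general idea: a duration "abnormality" feature is combined with a conventional verification score through logistic regression to rescore detection candidates. Everything specific here is an assumption made for illustration and is not taken from the paper: the per-phoneme duration statistics, the mean absolute z-score feature, the toy training data, and all function names.

```python
import math

# Hypothetical per-phoneme duration statistics (mean, std) in seconds.
# These values are invented for illustration, not taken from the paper.
DURATION_STATS = {
    "k": (0.06, 0.02),
    "a": (0.09, 0.03),
    "n": (0.07, 0.02),
    "d": (0.05, 0.02),
}


def duration_abnormality(segments):
    """Mean absolute z-score of phoneme durations (assumed feature).

    segments is a list of (phoneme, duration_in_seconds) pairs.
    Larger values mean the durations look less natural.
    """
    z_scores = [
        abs(dur - DURATION_STATS[ph][0]) / DURATION_STATS[ph][1]
        for ph, dur in segments
    ]
    return sum(z_scores) / len(z_scores)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def rescore(w, features):
    """Estimated probability that a detected candidate is a correct detection."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], features)))


# Toy training set: [verification_score, duration_abnormality] -> 1 if correct.
X_TRAIN = [[0.9, 0.3], [0.8, 0.5], [0.7, 0.4],
           [0.85, 2.5], [0.6, 3.0], [0.4, 2.0], [0.3, 0.6]]
Y_TRAIN = [1, 1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    weights = train_logistic(X_TRAIN, Y_TRAIN)
    natural = [("k", 0.06), ("a", 0.10), ("n", 0.07), ("d", 0.05)]
    stretched = [("k", 0.20), ("a", 0.01), ("n", 0.25), ("d", 0.15)]
    for name, segs in (("natural", natural), ("stretched", stretched)):
        feats = [0.8, duration_abnormality(segs)]
        print(name, round(rescore(weights, feats), 3))
```

In this toy setup, two candidates with the same verification score are separated by their duration feature alone, which is the kind of re-ranking effect the abstract attributes to the rhythmic features.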

Pages: 170-175

Page count: 6
