USING RHYTHMIC FEATURES FOR JAPANESE SPOKEN TERM DETECTION

Cited by: 0

Authors:

Kanda, Naoyuki [1]

Takeda, Ryu [1]

Obuchi, Yasunari [1]

Affiliation:

[1] Hitachi Ltd, Cent Res Lab, Kokubunji, Tokyo 1858601, Japan

Source:

2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012) | 2012

Keywords:

spoken term detection; spoken document retrieval; utterance verification; speech recognition

DOI:

Not available

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

A new rescoring method for spoken term detection (STD) is proposed. Phoneme-based close-matching techniques have been used because of their ability to detect out-of-vocabulary (OOV) queries. To improve the accuracy of phoneme-based techniques, rescoring techniques have been used to accurately re-rank the results from phoneme-based close-matching; however, conventional rescoring techniques based on an utterance verification model still produce many false detection results. To further improve the accuracy, in this study, several features representing the "naturalness" (or "abnormality") of duration of phonemes/syllables in detected candidates of a keyword are proposed. These features are incorporated into a conventional rescoring technique using logistic regression. Experimental results with a 604-hour Japanese speech corpus indicated that combining the rhythmic features achieved a further relative error reduction of 8.9% compared to a conventional rescoring technique.
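
The rescoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the duration statistics, the single "abnormality" feature (a mean absolute z-score of phoneme durations), the toy training data, and every function name below are assumptions made for illustration only. The sketch combines a hypothetical utterance verification score with the duration feature in a logistic regression trained by plain gradient descent.

```python
import math

# Hypothetical per-phoneme duration statistics (mean, std) in seconds.
# Real values would be estimated from aligned training speech.
DURATION_STATS = {"k": (0.06, 0.02), "a": (0.08, 0.02), "i": (0.07, 0.02)}


def duration_abnormality(phonemes, durations):
    """Mean absolute z-score of phoneme durations (0.0 means perfectly typical)."""
    z_scores = []
    for phoneme, duration in zip(phonemes, durations):
        mean, std = DURATION_STATS[phoneme]
        z_scores.append(abs(duration - mean) / std)
    return sum(z_scores) / len(z_scores)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Probability that a detected candidate is a correct detection."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(samples, labels, lr=0.5, epochs=3000):
    """Fit logistic regression with batch gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(samples, labels):
            error = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy training set (invented): [verification score, duration abnormality].
# Label 1 marks a correct detection, label 0 a false detection.
TRAIN_X = [[0.9, 0.3], [0.8, 0.5], [0.7, 0.4], [0.8, 2.5], [0.4, 2.0], [0.3, 0.6]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic(TRAIN_X, TRAIN_Y)

# Two candidates with the same verification score but different rhythm.
natural = duration_abnormality(["k", "a", "i"], [0.06, 0.08, 0.07])
stretched = duration_abnormality(["k", "a", "i"], [0.06, 0.20, 0.07])
score_natural = predict(weights, bias, [0.8, natural])
score_stretched = predict(weights, bias, [0.8, stretched])
```

On this toy data the learned weight on the abnormality feature is negative, so a candidate with an unnaturally stretched phoneme is ranked below one with typical durations even when their verification scores match. The paper's actual features and training setup may differ from this sketch.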

Pages: 170-175

Number of pages: 6
