USING RHYTHMIC FEATURES FOR JAPANESE SPOKEN TERM DETECTION

Cited by: 0
Authors
Kanda, Naoyuki [1]
Takeda, Ryu [1]
Obuchi, Yasunari [1]
Affiliation
[1] Hitachi Ltd., Central Research Laboratory, Kokubunji, Tokyo 185-8601, Japan
Keywords
spoken term detection; spoken document retrieval; utterance verification; speech recognition
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
This paper proposes a new rescoring method for spoken term detection (STD). Phoneme-based close-matching techniques are widely used for STD because they can detect out-of-vocabulary (OOV) queries. Rescoring techniques are then applied to re-rank the close-matching results more accurately; however, conventional rescoring based on an utterance verification model still produces many false detections. To improve accuracy further, this study proposes several features that represent the "naturalness" (or "abnormality") of the phoneme and syllable durations in detected keyword candidates. These features are incorporated into a conventional rescoring technique using logistic regression. Experiments on a 604-hour Japanese speech corpus showed that adding the rhythmic features yielded a further relative error reduction of 8.9% over the conventional rescoring technique.
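The rescoring idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the actual rhythmic features or model weights, so everything here is an assumption made for illustration: the feature (mean absolute z-score of phoneme durations), the function names `duration_abnormality` and `rescore`, the duration statistics, and the weights are all hypothetical and are not the authors' method.

```python
import math

def duration_abnormality(durations, means, stds):
    """Hypothetical rhythmic feature: mean absolute z-score of the observed
    phoneme durations against per-phoneme reference statistics (seconds)."""
    zs = [abs(d - m) / s for d, m, s in zip(durations, means, stds)]
    return sum(zs) / len(zs)

def rescore(uv_score, abnormality, w_uv=1.0, w_dur=-1.0, bias=0.0):
    """Logistic-regression-style combination of an utterance verification
    score and the duration feature. The weights are illustrative
    assumptions, not trained values."""
    z = bias + w_uv * uv_score + w_dur * abnormality
    return 1.0 / (1.0 + math.exp(-z))

# Two candidates with the same utterance verification score (assumed values).
means = [0.08, 0.10, 0.09]   # assumed reference mean durations
stds = [0.02, 0.03, 0.02]    # assumed reference standard deviations
natural = duration_abnormality([0.08, 0.11, 0.09], means, stds)
stretched = duration_abnormality([0.20, 0.02, 0.25], means, stds)

# The candidate with more natural durations receives the higher score.
print(rescore(1.5, natural) > rescore(1.5, stretched))
```

In practice the weights would be estimated from labeled detection results rather than set by hand; the sketch only shows how a duration-based feature can shift the ranking of candidates that an utterance verification score alone cannot separate.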
Pages: 170-175
Page count: 6