Improving the accuracy of the speech synthesis based phonetic alignment using multiple acoustic features

Authors
Paulo, S [1]
Oliveira, LC [1]
Affiliations
[1] IST, INESC ID, Spoken Language Syst Lab, P-1000029 Lisbon, Portugal
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterances. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
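As a rough illustration of the synthesis-based alignment idea described in the abstract, the sketch below warps a synthetic feature sequence (whose segment boundaries are known from the synthesizer) onto a spoken one with plain dynamic time warping, then transfers the boundaries along the warping path. This is only an assumed, simplified setup: the function names `dtw_path` and `map_boundaries` are hypothetical, a single Euclidean frame distance stands in for whatever acoustic features are used, and the per-class-pair feature selection proposed in the paper is not implemented here.

```python
import numpy as np


def dtw_path(synth, spoken):
    """Return a minimum-cost DTW path as (synth_frame, spoken_frame) pairs.

    Both inputs are arrays of shape (frames, feature_dims).
    Hypothetical helper for illustration; not the authors' implementation.
    """
    n, m = len(synth), len(spoken)
    # Euclidean distance between every synthetic frame and every spoken frame.
    dist = np.linalg.norm(synth[:, None, :] - spoken[None, :, :], axis=2)

    # Accumulated cost with a one-cell border of infinities.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = dist[i - 1, j - 1] + best_prev

    # Backtrack from the last frame pair to the first.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def map_boundaries(path, synth_boundaries):
    """Map boundary frames in the synthetic signal to spoken-signal frames.

    Each boundary is mapped to the first spoken frame aligned with it.
    """
    first_match = {}
    for s, t in path:
        first_match.setdefault(s, t)
    return [first_match[b] for b in synth_boundaries]


# Toy one-dimensional "features": the spoken version is slower than the synthetic one.
synth = np.array([[0.0], [0.0], [1.0], [1.0]])
spoken = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [1.0]])
path = dtw_path(synth, spoken)
spoken_boundaries = map_boundaries(path, [2])  # boundary at synthetic frame 2
```

In this toy case the boundary known at synthetic frame 2 is carried over to the frame where the spoken signal changes value. With real speech, the quality of the transferred boundaries depends heavily on which features feed the frame distance, which is the issue the abstract raises.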
Pages: 31-39
Page count: 9