Improving the accuracy of the speech synthesis based phonetic alignment using multiple acoustic features

Cited by: 0

Authors:

Paulo, S [1]

Oliveira, LC [1]

Affiliations:

[1] IST, INESC ID, Spoken Language Syst Lab, P-1000029 Lisbon, Portugal

Source:

COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROCEEDINGS | 2003 / Vol. 2721

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterances. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
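The synthesis-based alignment idea in the abstract can be illustrated with a minimal sketch. It assumes that the synthetic and spoken signals are aligned frame by frame with dynamic time warping (DTW), which the abstract does not state. The names `dtw_path`, `map_boundary`, and `dist` are hypothetical, and the frames are toy one-dimensional values rather than real acoustic features. The `dist` argument marks the place where a different feature set or distance could be plugged in. The sketch does not reproduce the authors' feature selection method.

```python
from math import inf


def dtw_path(ref, target, dist):
    """Align two frame sequences with classic DTW.

    Returns (total_cost, path), where path is a list of (ref_index,
    target_index) pairs running from the first to the last frame.
    """
    n, m = len(ref), len(target)
    # cost[i][j]: cheapest cumulative distance aligning ref[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # diagonal step
                                 cost[i - 1][j],      # advance ref only
                                 cost[i][j - 1])      # advance target only

    # Backtrack from the last cell, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def map_boundary(path, ref_boundary):
    """Map a segment boundary (frame index in ref) onto the target sequence.

    Returns the first target frame aligned with a ref frame at or after
    the boundary.
    """
    for i, j in path:
        if i >= ref_boundary:
            return j
    return path[-1][1]


# Toy example: the "synthetic" signal changes segment at frame 2, and the
# "spoken" signal is a slower rendition of the same two segments.
synthetic = [0.0, 0.0, 1.0, 1.0]
spoken = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
total_cost, path = dtw_path(synthetic, spoken, lambda a, b: abs(a - b))
spoken_boundary = map_boundary(path, 2)
```

Because the segment boundaries of the synthetic signal are known from the synthesizer, projecting them through the warping path gives boundary estimates for the spoken utterance. How well this works depends on the frame distance, which is where the choice of acoustic features matters.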


Pages: 31-39

Number of pages: 9

