Improving the accuracy of the speech synthesis based phonetic alignment using multiple acoustic features

Cited by: 0
Authors
Paulo, S [1]
Oliveira, LC [1]
Affiliations
[1] IST, INESC ID, Spoken Language Syst Lab, P-1000029 Lisbon, Portugal
Source
COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROCEEDINGS | 2003 / Vol. 2721
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterance. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
Pages: 31-39
Page count: 9