A Hybrid Text-to-Speech Synthesis using Vowel and Non Vowel like regions

Cited by: 0
Authors
Adiga, Nagaraj [1]
Prasanna, S. R. Mahadeva [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati, India
Keywords
speech synthesis; unit selection; hybrid TTS; HTS; VLRs and NVLRs; EPOCH EXTRACTION; SELECTION; SYSTEM;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper presents a hybrid text-to-speech (TTS) synthesis approach that combines the advantages of hidden Markov model-based speech synthesis (HTS) and unit selection speech synthesis (USS). In the hybrid TTS, speech sound units are classified into vowel-like regions (VLRs) and non-vowel-like regions (NVLRs) for unit selection. VLRs here refer to vowel, diphthong, semivowel, and nasal sound units [1], which can be better modeled in the HMM framework, so their waveform units are taken from HTS. The remaining sound units, such as stop consonants, fricatives, and affricates, which are not modeled properly using HMMs [2], are classified as NVLRs, and for these phonetic classes natural sound units are picked from USS. The VLR and NVLR evidence is obtained from manual and automatic segmentation of the speech signal. Automatic detection is done by fusing source features obtained from the Hilbert envelope (HE) and the zero frequency filter (ZFF) of the speech signal. Speech synthesized by the manual and automatic hybrid TTS methods is compared with the HTS and USS voices using subjective and objective measures. Results show that the synthesis quality of hybrid TTS with manual segmentation is better than that of the HTS voice, whereas automatic segmentation yields slightly inferior quality.
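The routing rule described in the abstract (VLR units from HTS, NVLR units from USS) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Unit` type, the class-name strings, and the functions `region_of`, `choose_source`, and `plan` are hypothetical names introduced here for illustration, and the phone classes are assumed to be already known (for example, from a manual segmentation).

```python
from dataclasses import dataclass

# Phone classes the abstract assigns to vowel-like regions (VLRs).
VLR_CLASSES = {"vowel", "diphthong", "semivowel", "nasal"}
# Phone classes the abstract assigns to non-vowel-like regions (NVLRs).
NVLR_CLASSES = {"stop", "fricative", "affricate"}


@dataclass
class Unit:
    """Hypothetical container for one labeled speech sound unit."""
    phone: str        # phone label, e.g. "a" or "s"
    phone_class: str  # one of the class names listed above


def region_of(unit: Unit) -> str:
    """Map a unit's phone class to 'VLR' or 'NVLR'."""
    if unit.phone_class in VLR_CLASSES:
        return "VLR"
    if unit.phone_class in NVLR_CLASSES:
        return "NVLR"
    raise ValueError(f"unknown phone class: {unit.phone_class!r}")


def choose_source(unit: Unit) -> str:
    """VLR units come from the HTS voice, NVLR units from the USS voice."""
    return "HTS" if region_of(unit) == "VLR" else "USS"


def plan(units: list[Unit]) -> list[tuple[str, str, str]]:
    """Return (phone, region, source) for each unit, in order."""
    return [(u.phone, region_of(u), choose_source(u)) for u in units]


if __name__ == "__main__":
    # Illustrative input only; labels are assumptions, not data from the paper.
    demo = [Unit("s", "fricative"), Unit("a", "vowel"),
            Unit("n", "nasal"), Unit("t", "stop")]
    for phone, region, source in plan(demo):
        print(phone, region, source)
```

The sketch covers only the selection decision. Concatenating the chosen HTS and USS waveforms and detecting VLRs automatically from HE and ZFF evidence are outside its scope.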
Pages: 5
Related papers
50 in total
  • [11] High quality text-to-speech synthesis system with efficient duration models developed using coding schemes based on vowel production characteristics
    Reddy, V. Ramu
    Rao, K. Sreenivasa
    2013 13TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2013, : 7 - 12
  • [12] Towards a Vowel Formant Based Quality Metric for Text-to-Speech Systems: Measuring Monophthong Naturalness
    Albrecht, Sven
    Tamboli, Rewa
    Taubert, Stefan
    Eibl, Maximilian
    Diaeresis, Gunter
    Schmied, Josef
    2022 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND VIRTUAL ENVIRONMENTS FOR MEASUREMENT SYSTEMS AND APPLICATIONS (IEEE CIVEMSA 2022), 2022,
  • [13] Speech Synthesis of Emotions in a Sentence Using Vowel Features
    Makino, Rintaro
    Yoshitomi, Yasunari
    Asada, Taro
    Tabuse, Masayoshi
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 403 - 406
  • [14] Speech synthesis of emotions using vowel features of a speaker
    Boku, K.
    Asada, T.
    Yoshitomi, Y.
    Tabuse, M.
    PROCEEDINGS OF THE EIGHTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 18TH '13), 2013, : 176 - 179
  • [15] Speech Synthesis of Emotions in a Sentence using Vowel Features
    Makino, Rintaro
    Yoshitomi, Yasunari
    Asada, Taro
    Tabuse, Masayoshi
    JOURNAL OF ROBOTICS NETWORKING AND ARTIFICIAL LIFE, 2020, 7 (02): : 107 - 110
  • [16] Speech synthesis of emotions using vowel features of a speaker
    Boku, Kanu
    Asada, Taro
    Yoshitomi, Yasunari
    Tabuse, Masayoshi
    ARTIFICIAL LIFE AND ROBOTICS, 2014, 19 (01) : 27 - 32
  • [17] An efficient network for Farsi text to speech conversion using vowel state
    Rasekh, Ehsan
    Eshghi, Mohammad
    TENCON 2006 - 2006 IEEE REGION 10 CONFERENCE, VOLS 1-4, 2006, : 176 - +
  • [18] An efficient hardware architecture for detection of vowel-like regions in speech signal
    Srinivas, Nagapuri
    Pradhan, Gayadhar
    Kumar, Puli Kishore
    INTEGRATION-THE VLSI JOURNAL, 2018, 63 : 185 - 195
  • [19] TEXT-TO-SPEECH SYNTHESIS
    SPROAT, RW
    OLIVE, JP
    AT&T TECHNICAL JOURNAL, 1995, 74 (02): : 35 - 44
  • [20] Facial Expression Synthesis Using Vowel Recognition for Synthesized Speech
    Asada, Taro
    Adachi, Ruka
    Takada, Syuhei
    Yoshitomi, Yasunari
    Tabuse, Masayoshi
    PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL LIFE AND ROBOTICS (ICAROB2020), 2020, : 398 - 401