Study on the consistency analysis between the prosody and the spectrum for Mandarin speech

Cited by: 1
Authors
Yeh, Cheng-Yu [1]
Chen, Kuan-Lin [2]
Hwang, Shaw-Hwa [2]
Yan, Long-Jhe [2]
Affiliations
[1] Natl Chin Yi Univ Technol, Dept Elect Engn, Taichung 41170, Taiwan
[2] Natl Taipei Univ Technol, Dept Elect Engn, Taipei 10608, Taiwan
Keywords
INFORMATION; CONVERSION; ALGORITHM; SYSTEM;
DOI
10.1049/iet-spr.2012.0099
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
This work presents a consistency analysis between the prosody and the spectrum of Mandarin speech. Based on an inspection of the human pronunciation process, the consistency can be interpreted as a closely correlated warping curve between the spectrum and the prosody within a syllable. The analysis proceeds in three steps. First, a hidden Markov model (HMM) is used to decode the HMM-state sequence within a syllable and to divide it into three segments. Second, for a designated syllable, vector quantisation (VQ) with the Linde-Buzo-Gray algorithm is used to train a VQ codebook for each segment. Third, the prosodic vector of each segment is encoded as an index by the VQ codebooks, and the probability of each possible path is evaluated as a prerequisite for analysing the consistency. Experiments demonstrate that consistency is definitely obtained when the syllable occurs in exactly the same word. These results point to a research direction: the warping process between the spectrum and the prosody within a syllable must be considered in a text-to-speech system to improve speech quality.
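The second and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the HMM segmentation is assumed to have been done already, the two-dimensional prosodic vectors are invented toy values, and the function names (`lbg`, `path_probabilities`) and parameters (`eps`, `iters`) are hypothetical. The sketch trains one codebook per segment with a simple Linde-Buzo-Gray splitting procedure, encodes each segment as a codeword index, and estimates the probability of each index path by relative frequency.

```python
from collections import Counter

def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], vec)))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def lbg(vectors, size, eps=0.01, iters=20):
    """LBG codebook training by splitting; size is assumed to be a power of two."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split each codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign every vector to its nearest codeword, then update centroids.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(codebook, v)].append(v)
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def path_probabilities(tokens, codebooks):
    """Relative frequency of each index path across syllable tokens."""
    paths = [tuple(nearest(cb, seg) for cb, seg in zip(codebooks, tok))
             for tok in tokens]
    return {p: n / len(paths) for p, n in Counter(paths).items()}

# Toy data (assumption): four tokens of one syllable, each already divided
# into three segments, each segment described by a 2-D prosodic vector.
tokens = [
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[1.1, 0.9], [2.1, 1.9], [3.1, 2.9]],
    [[0.9, 1.1], [1.9, 2.1], [2.9, 3.1]],
    [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]],
]

# One two-codeword codebook per segment position.
codebooks = [lbg([tok[k] for tok in tokens], size=2) for k in range(3)]
probs = path_probabilities(tokens, codebooks)
print(probs)
```

In this toy setting, a path whose probability dominates the others would correspond to a consistent prosodic pattern for the syllable; how the paper turns path probabilities into a consistency measure is not stated in the abstract.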
Pages: 158-165
Page count: 8
Related papers
50 in total
  • [21] A New Model-based Prosody Coder for Mandarin Speech
    Chiang, Chen-Yu
    Hung, Yu-Ping
    Chen, Sin-Horng
    Wang, Yih-Ru
    2013 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2013), 2013, : 60 - 63
  • [22] Prosody-dependent Acoustic Modeling for Mandarin Speech Recognition
    Chiu, Tzu-Hsuan
    Chiang, Chen-Yu
    Liao, Yuan-Fu
    Yang, Jyh-Her
    Wang, Yih-Ru
    Chen, Sin-Horng
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II, 2012, : 139 - 142
  • [23] A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody
    Hsieh, Chiao-Hua
    Chiang, Chen-Yu
    Wang, Yih-Ru
    Yu, Hsiu-Min
    Chen, Sin-Horng
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 654 - 657
  • [24] Acquisition and Interpretation of Mandarin Speech Prosody by Native Speakers and Cantonese Learners
    Chen, Xi
    Chen, Si
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1800 - 1809
  • [25] Advanced Unsupervised Joint Prosody Labeling and Modeling for Mandarin Speech and Its Application to Prosody Generation for TTS
    Chiang, Chen-Yu
    Chen, Sin-Horng
    Wang, Yih-Ru
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 500 - 503
  • [26] A parametric prosody coding approach for Mandarin speech using a hierarchical prosodic model
    Chiang, Chen-Yu
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018,
  • [27] Prosody Conversion for Emotional Mandarin Speech Synthesis Using the Tone Nucleus Model
    Wen, Miaomiao
    Wang, Miaomiao
    Hirose, Keikichi
    Minematsu, Nobuaki
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2808 - +
  • [28] Investigating Acoustic Cues of Emotional Valence in Mandarin Speech Prosody - A Corpus Approach
    Li, Junlin
    Huang, Chu-Ren
    CHINESE LEXICAL SEMANTICS, CLSW 2023, PT II, 2024, 14515 : 316 - 330
  • [29] Prosody model in a Mandarin Text-to-Speech System based on a hierarchical approach
    Pan, NH
    Jen, WT
    Yu, SS
    Yu, MS
    Huang, SY
    Wu, MJ
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 448 - 451
  • [30] High-quality prosody generation in Mandarin text-to-speech system
    Guo, Qing
    Zhang, Jie
    Katae, Nobuyuki
    Yu, Hao
    Fujitsu Scientific and Technical Journal, 2010, 46 (01): : 40 - 46