Speech Analysis/Synthesis by Gaussian Mixture Approximation of the Speech Spectrum for Voice Conversion

Authors
Amini, Jamal [1]
Shahrebabaki, Abdoreza Sabzi [1]
Shokouhi, Navid [1]
Sheikhzadeh, Hamid [1]
Raahemifar, Kaamran [2]
Eslami, Mehdi [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
[2] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada
Keywords
Analysis/Synthesis; Feature Extraction; Voice Conversion; GMM; STRAIGHT; FREQUENCY;
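DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract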
Voice conversion typically employs spectral features to convert a source voice to a target voice. In this paper, we propose a simple method of fitting the STRAIGHT spectrum with Gaussian mixture (GM) models for speech analysis/synthesis and spectral modification. The mean values of the Gaussians are predetermined based on Mel-frequency spacing. The standard deviations are adjusted adaptively using the constant-Q principle and the spectrum amplitudes. Finally, the weights of the Gaussians are determined by sampling the log-spectrum at the Mel frequencies. The proposed analysis/synthesis method (MFLS-GM) is applied to speech analysis/synthesis and voice conversion. Subjective evaluations using MOS and ABX tests show that voice conversion with MFLS-GM outperforms systems that use MFCC features. The computational cost of the proposed analysis/synthesis method is also much lower than that of MFCC-based methods.
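The three steps named in the abstract (Mel-spaced means, constant-Q widths, weights sampled from the log-spectrum) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so the Mel formula, the quality factor `q`, the normalisation by the sum of the basis functions, and all function names are assumptions made for illustration. The amplitude-dependent adjustment of the standard deviations mentioned in the abstract is omitted because its form is not stated.

```python
import numpy as np

def hz_to_mel(f):
    # A common Mel formula; the paper's exact variant is not given (assumption).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_centers(n_gauss, f_max):
    # Gaussian means: n_gauss points equally spaced on the Mel scale,
    # strictly between 0 Hz and f_max.
    mels = np.linspace(0.0, hz_to_mel(f_max), n_gauss + 2)[1:-1]
    return mel_to_hz(mels)

def analyze(log_spec, freqs, centers):
    # Weights: the log-spectrum sampled at the Mel-spaced centres
    # (linear interpolation between frequency bins).
    return np.interp(centers, freqs, log_spec)

def synthesize(weights, centers, freqs, q=4.0):
    # Constant-Q widths: standard deviation proportional to centre frequency.
    # q is a hypothetical tuning parameter, not a value from the paper.
    sigmas = centers / q
    basis = np.exp(-0.5 * ((freqs[:, None] - centers[None, :]) / sigmas[None, :]) ** 2)
    # Normalising by the basis sum is an assumption; it makes equal weights
    # reproduce a flat log-spectrum.
    norm = np.maximum(basis.sum(axis=1), 1e-12)
    return (basis @ weights) / norm

if __name__ == "__main__":
    freqs = np.linspace(0.0, 8000.0, 513)            # bin frequencies in Hz
    # Toy smooth log-spectrum standing in for a STRAIGHT envelope.
    log_spec = -freqs / 4000.0 + 0.5 * np.cos(2.0 * np.pi * freqs / 3000.0)
    centers = mel_centers(40, 8000.0)
    weights = analyze(log_spec, freqs, centers)      # 40 parameters per frame
    approx = synthesize(weights, centers, freqs)
    print("parameters per frame:", weights.size)
    print("mean absolute error:", float(np.mean(np.abs(approx - log_spec))))
```

In this sketch the analysis step costs only one interpolation per frame, which is in the spirit of the low computational cost claimed in the abstract, although the paper's actual cost comparison cannot be reproduced from the abstract alone.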
Pages: 428-433
Number of pages: 6