Maximum Likelihood Voice Conversion Based on GMM with STRAIGHT Mixed Excitation

Cited by: 0
Authors
Ohtani, Yamato [1]
Toda, Tomoki [1]
Saruwatari, Hiroshi [1]
Shikano, Kiyohiro [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan
Keywords
Speech synthesis; Voice conversion; Gaussian mixture model; STRAIGHT; Mixed excitation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The performance of voice conversion has been considerably improved through statistical modeling of spectral sequences. However, the converted speech still contains traces of artificial sounds. To alleviate this, it is necessary to statistically model the excitation source sequence as well as the spectral sequence. In this paper, we introduce STRAIGHT mixed excitation into a voice conversion framework based on a Gaussian Mixture Model (GMM) of the joint probability density of source-speaker and target-speaker features. We convert both spectral and excitation source feature sequences based on Maximum Likelihood Estimation (MLE). Objective and subjective evaluation results demonstrate that the proposed excitation source conversion produces strong improvements in both the converted speech quality and the conversion accuracy for speaker individuality.
Pages: 2266-2269
Number of pages: 4
Related papers
50 in total
  • [1] Voice Conversion Based on STRAIGHT and UBM-GMM
    Gao Yingying
    Zhu Weibin
    PROCEEDINGS OF 2009 CONFERENCE ON COMMUNICATION FACULTY, 2009: 342-345
  • [2] Modified method for voice conversion based on GMM
    Shen, Yi
    Jian, Zhi-Hua
    Yang, Zhen
    Nanjing Youdian Daxue Xuebao (Ziran Kexue Ban)/Journal of Nanjing University of Posts and Telecommunications (Natural Science), 2007, 27 (05): 11-15
  • [3] Voice Conversion Based on Hybrid SVR and GMM
    Song, Peng
    Jin, Yun
    Zhao, Li
    Zou, Cairong
    ARCHIVES OF ACOUSTICS, 2012, 37 (02): 143-149
  • [4] Parameter Estimation for α-GMM Based on Maximum Likelihood Criterion
    Wu, Dalei
    NEURAL COMPUTATION, 2009, 21 (06): 1776-1795
  • [5] Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    Toda, Tomoki
    Black, Alan W.
    Tokuda, Keiichi
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (08): 2222-2235
  • [6] Improving the Performance of GMM Based Voice Conversion Method
    Song, Peng
    Zhao, Li
    PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008: 436-440
  • [7] A GMM based residual prediction method for voice conversion
    Xia, J
    Yin, JX
    ISPACS 2005: PROCEEDINGS OF THE 2005 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, 2005: 389-392
  • [8] Low-Delay Voice Conversion based on Maximum Likelihood Estimation of Spectral Parameter Trajectory
    Muramatsu, Takashi
    Ohtani, Yamato
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 1076-1079
  • [9] HMM-Based Maximum Likelihood Frame Alignment for Voice Conversion from a Nonparallel Corpus
    Lee, Ki-Seung
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (12): 3064-3067
  • [10] Voice Conversion Based on Improved GMM and Spectrum with Synchronous Prosody
    Zhang Bing
    Yu Yibiao
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008: 659-662