Maximum Likelihood Voice Conversion Based on GMM with STRAIGHT Mixed Excitation

Citations: 0
Authors
Ohtani, Yamato [1]
Toda, Tomoki [1]
Saruwatari, Hiroshi [1]
Shikano, Kiyohiro [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan
Keywords
Speech synthesis; Voice conversion; Gaussian mixture model; STRAIGHT; Mixed excitation;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The performance of voice conversion has been considerably improved through statistical modeling of spectral sequences. However, the converted speech still contains traces of artificial sounds. To alleviate this, it is necessary to statistically model a source sequence as well as a spectral sequence. In this paper, we introduce STRAIGHT mixed excitation to a framework of the voice conversion based on a Gaussian Mixture Model (GMM) on joint probability density of source and target features. We convert both spectral and source feature sequences based on Maximum Likelihood Estimation (MLE). Objective and subjective evaluation results demonstrate that the proposed source conversion produces strong improvements in both the converted speech quality and the conversion accuracy for speaker individuality.
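The abstract does not give the conversion equations. As a rough illustration of what a GMM on the joint probability density of source and target features can look like, the sketch below maps one source frame to the conditional mean of the target under a joint GMM. This is the simpler frame-wise conditional-expectation mapping, not the maximum-likelihood sequence conversion the paper proposes, and it does not model STRAIGHT mixed excitation. The function names and the toy parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at a single vector x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ solved) / norm)

def convert_frame(x, weights, means, covs, dx):
    """Return E[y | x] under a joint GMM over z = [x; y].

    weights: (M,), means: (M, dx + dy), covs: (M, dx + dy, dx + dy).
    dx is the dimension of the source part of z.
    """
    n_mix = len(weights)
    # Posterior of each mixture component given the source frame only.
    lik = np.array([
        weights[m] * gaussian_pdf(x, means[m, :dx], covs[m, :dx, :dx])
        for m in range(n_mix)
    ])
    post = lik / lik.sum()
    # Posterior-weighted sum of the per-component conditional means.
    y = np.zeros(means.shape[1] - dx)
    for m in range(n_mix):
        mu_x, mu_y = means[m, :dx], means[m, dx:]
        s_xx = covs[m, :dx, :dx]
        s_yx = covs[m, dx:, :dx]
        y += post[m] * (mu_y + s_yx @ np.linalg.solve(s_xx, x - mu_x))
    return y

# Toy joint GMM with one-dimensional source and target features
# (hypothetical values, not taken from the paper).
toy_weights = np.array([0.5, 0.5])
toy_means = np.array([[0.0, 1.0], [4.0, 6.0]])
toy_covs = np.array([
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, -0.5], [-0.5, 1.0]],
])
converted = convert_frame(np.array([2.0]), toy_weights, toy_means, toy_covs, dx=1)
```

In a real system the GMM parameters would be estimated from time-aligned source and target feature vectors, and the same kind of mapping would be applied to both the spectral and the excitation features.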
Pages: 2266-2269
Page count: 4
Related papers
50 entries in total
  • [31] Voice conversion based on joint pitch and spectral transformation with component - Group-GMM
    Ma, JC
    Liu, WJ
    PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 199 - 203
  • [32] AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE
    Zhou Ying; Zhang Linghua (College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China)
    Journal of Electronics (China), 2011, 28 (Z1) : 518 - 523
  • [33] Performance of new voice conversion systems based on GMM models and applied to Arabic language
    Guerid, A.
    Houacine, A.
    Andre-Obrecht, R.
    Lachambre, H.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (04) : 477 - 485
  • [34] Voice Conversion Using Bilinear Model Integrated with Joint GMM-based Classification
    Sun, Xinjian
    Zhang, Xiongwei
    Yang, Jibin
    Cao, Tieyong
    2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2013, : 1225 - 1228
  • [35] Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
    Tanaka, Kei
    Hara, Sunao
    Abe, Masanobu
    Minagi, Shogo
    2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [37] An improved spectral and prosodic transformation method in straight-based voice conversion
    Qin, L
    Chen, GP
    Ling, ZH
    Dai, LR
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 21 - 24
  • [38] Alleviating the Over-Smoothing Problem in GMM-Based Voice Conversion with Discriminative Training
    Hwang, Hsin-Te
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3061 - 3065
  • [39] Novel Inter Mixture Weighted GMM Posteriorgram for DNN and GAN-based Voice Conversion
    Shah, Nirmesh J.
    Sreeraj, R.
    Shah, Neil
    Patil, Hemant A.
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1776 - 1781
  • [40] Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    SPEECH COMMUNICATION, 2012, 54 (01) : 134 - 146