Maximum Likelihood Voice Conversion Based on GMM with STRAIGHT Mixed Excitation

Cited by: 0

Authors:

Ohtani, Yamato [1]

Toda, Tomoki [1]

Saruwatari, Hiroshi [1]

Shikano, Kiyohiro [1]

Affiliations:

[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara, Japan

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

Speech synthesis; Voice conversion; Gaussian mixture model; STRAIGHT; Mixed excitation;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The performance of voice conversion has been considerably improved through statistical modeling of spectral sequences. However, the converted speech still contains traces of artificial sounds. To alleviate this, it is necessary to statistically model a source sequence as well as a spectral sequence. In this paper, we introduce STRAIGHT mixed excitation to a framework of the voice conversion based on a Gaussian Mixture Model (GMM) on joint probability density of source and target features. We convert both spectral and source feature sequences based on Maximum Likelihood Estimation (MLE). Objective and subjective evaluation results demonstrate that the proposed source conversion produces strong improvements in both the converted speech quality and the conversion accuracy for speaker individuality.


Pages: 2266-2269

Number of pages: 4
