Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network
Cited by: 6
Authors:
Ho, Tuan Vu [1]
Akagi, Masato [1]
Affiliation:
[1] Japan Adv Inst Sci & Technol, Grad Sch Adv Sci & Technol, Nomi 9231292, Japan
This paper proposes a non-parallel cross-lingual voice conversion (CLVC) model that can mimic a voice while continuously controlling speaker individuality, based on the variational autoencoder (VAE) and the star generative adversarial network (StarGAN). Most studies on CLVC have focused only on mimicking a particular speaker's voice, without the ability to arbitrarily modify speaker individuality. In practice, the ability to generate speaker individuality may be more useful than just mimicking a voice. Therefore, the proposed model reliably extracts the speaker embedding from different languages using a VAE. An F0 injection method is also introduced into the model to enhance F0 modeling in the cross-lingual setting. To avoid the over-smoothing degradation of the conventional VAE, the adversarial training scheme of the StarGAN is adopted to improve the training objective function of the VAE in a CLVC task. Objective and subjective measurements confirm the effectiveness of the proposed model and the F0 injection method. Furthermore, speaker-similarity measurements on fictitious voices reveal a strong linear relationship between speaker individuality and the interpolated speaker embedding, which indicates that speaker individuality can be controlled with the proposed model.
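The interpolated speaker embedding mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, so plain linear interpolation between two embedding vectors is assumed here; the function name `interpolate_embedding`, the weight `alpha`, and the toy two-dimensional vectors are hypothetical and are not taken from the paper.

```python
from typing import List


def interpolate_embedding(emb_a: List[float], emb_b: List[float], alpha: float) -> List[float]:
    """Linearly interpolate between two speaker embeddings.

    Assumption (not from the paper): the fictitious voice is obtained as
    (1 - alpha) * emb_a + alpha * emb_b, with alpha in [0, 1].
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimensionality")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Element-wise weighted sum of the two embeddings.
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]


# Toy embeddings for two hypothetical speakers.
speaker_a = [0.0, 2.0]
speaker_b = [4.0, 6.0]

# alpha = 0 reproduces speaker A, alpha = 1 reproduces speaker B,
# and intermediate values give embeddings for fictitious voices in between.
midpoint = interpolate_embedding(speaker_a, speaker_b, 0.5)
```

Under this assumption, sweeping `alpha` from 0 to 1 moves the embedding along a straight line from one speaker to the other, which is the kind of control that a linear relationship between embedding and perceived individuality would make predictable.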