MODEL-MAPPING BASED VOICE CONVERSION SYSTEM: A Novel Approach to Improve Voice Similarity and Naturalness using Model-based Speech Synthesis Techniques

Cited by: 0
Authors
Li, Baojie [1]
Wu, Dalei [1]
Jiang, Hui [1]
Affiliation
[1] York Univ, Dept Comp Sci & Engn, 4700 Keele St, Toronto, ON M3J 1P3, Canada
Keywords
Voice conversion; HMM-based speech synthesis; GMM; Model mapping
DOI
Not available
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
In this paper we present a novel voice conversion application in which no knowledge of the source speakers is available; only sufficient utterances from a target speaker and from a number of other speakers are in hand. Our approach consists of two separate stages. At the training stage, we estimate a speaker-dependent (SD) Gaussian mixture model (GMM) for the target speaker, and we also estimate a speaker-independent (SI) GMM using data from a number of speakers other than the source speaker. A mapping between the SD and SI models is maintained during training for each phone label. At the conversion stage, we use the SI GMM to recognize each input frame and find the closest Gaussian component for it. Next, according to a mapping list, the counterpart Gaussian of the SD GMM is obtained and used to generate a parameter vector for that frame. Finally, all the generated vectors are concatenated to synthesize speech of the target speaker. With the proposed model-mapping approach, we can avoid the over-fitting problem by keeping the number of mixtures of the SI GMM at a fixed value, and at the same time improve voice quality in terms of similarity and naturalness by increasing the number of mixtures of the SD GMM. Experiments showed the effectiveness of this method.
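The conversion stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians, ignores mixture weights when choosing the closest SI component, represents the mapping list as a plain index array (one SD component per SI component), and uses the mean of the mapped SD Gaussian as the generated parameter vector. The names `log_gauss_diag`, `convert`, and `mapping` are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log-density of one frame x under each of K diagonal Gaussians."""
    diff = x - means  # shape (K, D)
    return -0.5 * np.sum(np.log(2.0 * np.pi * variances) + diff ** 2 / variances, axis=1)

def convert(frames, si_means, si_vars, sd_means, mapping):
    """Map each input frame to a target-speaker parameter vector.

    mapping[k] is the index of the SD Gaussian paired with SI Gaussian k
    (an assumed, simplified form of the abstract's mapping list).
    """
    out = np.empty((len(frames), sd_means.shape[1]))
    for t, x in enumerate(frames):
        # Closest SI component = highest log-density for this frame.
        k = int(np.argmax(log_gauss_diag(x, si_means, si_vars)))
        # Assumption: the generated vector is the mean of the SD counterpart.
        out[t] = sd_means[mapping[k]]
    return out

# Toy example: 2 SI components, 3 SD components, 2-dimensional features.
si_means = np.array([[0.0, 0.0], [10.0, 10.0]])
si_vars = np.ones((2, 2))
sd_means = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
mapping = np.array([2, 0])
frames = np.array([[0.1, -0.2], [9.0, 11.0]])
converted = convert(frames, si_means, si_vars, sd_means, mapping)
```

In this toy setup the first frame is closest to SI component 0 and the second to SI component 1, so the output rows are the means of SD components 2 and 0. Because the SI model is used only for recognition, the SD model can have more components than the SI model, as in the sketch.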
Pages: 442-446
Number of pages: 5