Incorporating Global Variance in the Training Phase of GMM-based Voice Conversion

Cited by: 0
Authors
Hwang, Hsin-Te [1, 3]
Tsao, Yu [2]
Wang, Hsin-Min [3]
Wang, Yih-Ru [1]
Chen, Sin-Horng [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
[3] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
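DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract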
Maximum likelihood-based trajectory mapping considering global variance (MLGV-based trajectory mapping) has been proposed to improve the quality of the converted speech in Gaussian mixture model-based voice conversion (GMM-based VC). Although it significantly improves the quality of the converted speech, it also increases the computational cost of the online conversion process: parameter generation in MLGV-based trajectory mapping has no closed-form solution, so an iterative process is generally required. To reduce the online computational cost, we propose incorporating GV in the training phase of GMM-based VC. The conversion process can then simply adopt ML-based trajectory mapping (without considering GV in the conversion phase), which has a closed-form solution. In this way, the quality of the converted speech is expected to improve without any increase in the online computational cost. Our experimental results demonstrate that the proposed method yields a significant improvement in the quality of the converted speech compared with the conventional GMM-based VC method. Moreover, compared with MLGV-based trajectory mapping, the proposed method provides comparable converted speech quality at a reduced computational cost in the conversion process.
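The abstract does not define GV. As a minimal illustration, assuming the usual definition from the GV literature (the per-dimension variance of a feature trajectory over the frames of one utterance), the sketch below computes GV with NumPy and shows how smoothing a trajectory shrinks it. The function name `global_variance`, the toy data, and the moving-average smoother are all hypothetical stand-ins; this is not the authors' training procedure.

```python
import numpy as np


def global_variance(trajectory: np.ndarray) -> np.ndarray:
    """Return the per-dimension variance over time of a (T, D) trajectory.

    Assumption: GV is taken per utterance, over all T frames.
    """
    return trajectory.var(axis=0)


# Tiny hand-checkable case: dimension 0 varies, dimension 1 is constant.
tiny = np.array([[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]])
gv_tiny = global_variance(tiny)  # dimension 0: 8/3, dimension 1: 0

# Toy "target" trajectory: 200 frames, 4 feature dimensions (made-up data).
rng = np.random.default_rng(0)
target = rng.normal(size=(200, 4))

# Stand-in for an over-smoothed converted trajectory: a 9-frame moving
# average applied to each dimension independently.
kernel = np.ones(9) / 9.0
smoothed = np.column_stack(
    [np.convolve(target[:, d], kernel, mode="same") for d in range(target.shape[1])]
)

gv_target = global_variance(target)
gv_smoothed = global_variance(smoothed)
print("GV of target:  ", gv_target)
print("GV of smoothed:", gv_smoothed)
```

In this toy setting the smoothed trajectory has a much smaller GV in every dimension, which is the kind of variance loss that GV-aware methods aim to counteract.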
Pages: 6
Related papers
50 in total
  • [11] Improving the Quality of Standard GMM-Based Voice Conversion Systems by Considering Physically Motivated Linear Transformations
    Zorila, Tudor-Catalin
    Erro, Daniel
    Hernaez, Inma
    ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, 2012, 328 : 30 - 39
  • [12] Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion
    Tanaka, Kei
    Hara, Sunao
    Abe, Masanobu
    Sato, Masaaki
    Minagi, Shogo
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3384 - 3388
  • [13] A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models
    Takamichi, Shinnosuke
    Toda, Tomoki
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (10): : 2490 - 2498
  • [14] Improving the computational performance of standard GMM-based voice conversion systems used in real-time applications
    Ben Othmane, Imen
    Di Martino, Joseph
    Ouni, Kais
    2018 INTERNATIONAL CONFERENCE ON ELECTRONICS, CONTROL, OPTIMIZATION AND COMPUTER SCIENCE (ICECOCS), 2018,
  • [15] NON-PARALLEL TRAINING FOR VOICE CONVERSION BASED ON FT-GMM
    Chen, Ling-Hui
    Ling, Zhen-Hua
    Dai, Li-Rong
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5116 - 5119
  • [16] Modified method for voice conversion based on GMM
    Shen, Yi
    Jian, Zhi-Hua
    Yang, Zhen
    Nanjing Youdian Daxue Xuebao (Ziran Kexue Ban)/Journal of Nanjing University of Posts and Telecommunications (Natural Science), 2007, 27 (05): : 11 - 15
  • [17] Voice Conversion Based on Hybrid SVR and GMM
    Song, Peng
    Jin, Yun
    Zhao, Li
    Zou, Cairong
    ARCHIVES OF ACOUSTICS, 2012, 37 (02) : 143 - 149
  • [18] Voice Conversion Based on State Space Model and Considering Global Variance
    Ahangar, Mohsen
    Ghorbandoost, Mostafa
    Sheikhzadeh, Hamid
    Raahemifar, Kaamran
    Shahrebabaki, Abdoreza Sabzi
    Amini, Jamal
    2013 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (IEEE ISSPIT 2013), 2013, : 416 - 421
  • [19] Improving the Performance of GMM Based Voice Conversion Method
    Song, Peng
    Zhao, Li
    PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008, : 436 - 440
  • [20] MODULAR GLOBAL VARIANCE ENHANCEMENT FOR VOICE CONVERSION SYSTEMS
    Benisty, H.
    Malah, D.
    Crammer, K.
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 370 - 374