Improving the performance of MGM-based voice conversion by preparing training data method

Authors
Zuo, GY [1]
Liu, WJ [1]
Ruan, XG [1]
Affiliation
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an approach that improves both the target speaker's individuality and the quality of converted speech by preparing the training data. In mixture Gaussian spectral mapping (MGM) based voice conversion, spectral feature representations are analyzed to obtain correct feature associations between source and target characteristics. A voiced/unvoiced (V/UV) decision scheme for time alignment is provided to select correct data for training the mixture Gaussian spectral mapping function while removing misaligned data. Experiments evaluate how spectral representation methods and V/UV decision strategies affect the MGM functions. When linear predictive cepstral coefficients (LPCC) are used for time alignment and V/UV decisions are adopted to remove bad data, the results show that the conversion function achieves better accuracy and that the proposed method effectively improves the overall performance of voice conversion.
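The data-preparation idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which alignment algorithm or V/UV detector the authors use, so this example assumes plain dynamic time warping (DTW) over per-frame feature vectors and takes the V/UV flags as given inputs. The function names `dtw_path` and `filter_by_vuv` and the toy one-dimensional features are hypothetical, not taken from the paper.

```python
import math

def dtw_path(src, tgt):
    """Align two sequences of feature vectors with basic DTW (assumed here;
    the abstract does not name the alignment algorithm)."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(src[i - 1], tgt[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last frame pair to the first.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (cost[i - 1][j], i - 1, j),          # advance source only
            (cost[i][j - 1], i, j - 1),          # advance target only
        ]
        _, i, j = min(steps)
    path.reverse()
    return path

def filter_by_vuv(path, src_voiced, tgt_voiced):
    """Keep only aligned frame pairs whose V/UV flags agree."""
    return [(i, j) for i, j in path if src_voiced[i] == tgt_voiced[j]]

# Toy stand-ins for per-frame LPCC vectors (hypothetical values).
src = [[0.0], [1.0], [1.1], [0.1]]
tgt = [[0.0], [0.9], [0.2]]
src_voiced = [False, True, True, False]
tgt_voiced = [False, True, True]

path = dtw_path(src, tgt)
kept = filter_by_vuv(path, src_voiced, tgt_voiced)
print(path)
print(kept)
```

Only the pairs in `kept` would then be passed to the training of the mapping function; pairs that join a voiced frame to an unvoiced one are treated as misaligned and dropped.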
Pages: 181-184
Page count: 4