High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Cited by: 0
Authors
Fujii, Kei [1]
Okawa, Jun [1]
Suigetsu, Kaori [1]
Affiliations
[1] Kumamoto Natl Coll Technol, Dept Informat & Comp Sci, Kohshi City, Kumamoto 8611102, Japan
Keywords
concatenative speech synthesis; join cost; speaker individuality; unit selection; voice conversion
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Concatenative speech synthesis can produce speech that sounds natural and retains a speaker's individuality by drawing on a large speech corpus. Building on this method, this paper proposes a voice conversion method whose converted speech has both high individuality and naturalness. The authors also conduct two subjective evaluation experiments to assess the individuality and sound quality of the converted speech. The results confirm three facts: (a) the proposed method converts speaker individuality well, (b) incorporating the unit selection framework of concatenative speech synthesis (especially the join cost) into conventional voice conversion improves the sound quality of the converted speech, and (c) the proposed method is robust to gender differences between the source speaker and the target speaker.
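The abstract names unit selection and join cost without giving formulas. The following is a minimal, generic sketch of unit selection by dynamic programming, not the authors' implementation. It assumes each unit is summarized by a single scalar feature, that both the target cost and the join cost are absolute differences, and that the function name `select_units` and the parameter `join_weight` are hypothetical names introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def select_units(
    targets: Sequence[float],
    candidates: Sequence[Sequence[float]],
    join_weight: float = 1.0,
) -> Tuple[List[int], float]:
    """Pick one candidate per position, minimizing target cost plus join cost.

    Assumption for illustration: features are scalars, target cost is
    |target - candidate| and join cost is |previous - candidate|.
    """
    if not targets or len(targets) != len(candidates):
        raise ValueError("targets and candidates must be non-empty and aligned")

    # best[i][j]: lowest total cost of any path ending at candidate j of position i
    best = [[abs(targets[0] - c) for c in candidates[0]]]
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row, ptr = [], []
        for c in candidates[i]:
            # Cost of arriving from each previous candidate, including the join
            costs = [
                best[i - 1][k] + join_weight * abs(prev - c)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_best = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_best] + abs(targets[i] - c))
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(j)
        j = back[i][j]
    path.reverse()
    return path, total
```

With `join_weight=0.0` the search reduces to choosing the candidate closest to each target independently. Raising the weight favors candidates that connect smoothly to their neighbors, which is the role the abstract attributes to the join cost.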
Pages: 483-488
Page count: 6
Related papers
50 in total
  • [21] A corpus-based concatenative Mandarin singing voice synthesis system
    Zhou, Shu-Sen
    Chen, Qing-Cai
    Wang, Dan-Dan
    Yang, Xiao-Hong
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 2695 - 2699
  • [22] LSM-based boundary training for concatenative speech synthesis
    Bellegarda, Jerome R.
    2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 721 - 724
  • [23] Discriminative training for concatenative speech synthesis
    Kim, NS
    Park, SS
    IEEE SIGNAL PROCESSING LETTERS, 2004, 11 (01) : 40 - 43
  • [24] IMPROVING VOICE QUALITY OF HMM-BASED SPEECH SYNTHESIS USING VOICE CONVERSION METHOD
    Jiao, Yishan
    Xie, Xiang
    Na, Xingyu
    Tu, Ming
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [25] CUTE: A CONCATENATIVE METHOD FOR VOICE CONVERSION USING EXEMPLAR-BASED UNIT SELECTION
    Jin, Zeyu
    Finkelstein, Adam
    DiVerdi, Stephen
    Lu, Jingwan
    Mysore, Gautham J.
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5660 - 5664
  • [26] Speech unit selection based on target values driven by speech data in concatenative speech synthesis
    Hirai, T
    Tenpaku, S
    Shikano, K
    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 43 - 46
  • [27] An auditory-based distortion measure with application to concatenative speech synthesis
    Hansen, JHL
    Chappell, DT
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (05) : 489 - 495
  • [28] Forward masking phenomenon in concatenative speech synthesis
    Cernak, M
    Rozinaj, G
    PROCEEDINGS EC-VIP-MC 2003, VOLS 1 AND 2, 2003, : 691 - 694
  • [29] A Flexible Architecture for Urdu Phonemes-Based Concatenative Speech Synthesis
    Ahmad, Muhammad Rizwan
    Arshad, Muhammad Junaid
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2016, 35 (03) : 373 - 380
  • [30] Auditory-based distortion measure with application to concatenative speech synthesis
    Duke Univ, Durham, United States
    IEEE Trans Speech Audio Process, 5 (489-495):