VTLN-based voice conversion

Cited by: 0
Authors
Sündermann, D [1]
Ney, H [1]
Affiliations
[1] Tech Univ, RWTH Aachen, Comp Sci Dept, D-52056 Aachen, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As voice conversion aims at the transformation of a source speaker's voice into that of a target speaker, we want to investigate whether VTLN is an appropriate method to adapt the voice characteristics. After applying several conventional VTLN warping functions, we extend the piecewise linear function to several segments, allowing a more detailed warping of the source spectrum. Experiments on voice conversion are performed on three corpora of two languages and both speaker genders.
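The multi-segment piecewise linear warping mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' parameterization, so everything here is an assumption made for illustration: frequencies are normalized to [0, pi], the warping is defined by hypothetical interior breakpoints (`breakpoints_in` mapped to `breakpoints_out`) with the endpoints 0 and pi held fixed, and the spectrum is resampled by linear interpolation. The function names are hypothetical as well.

```python
import numpy as np


def _knots(breakpoints_in, breakpoints_out):
    """Build the full knot lists, pinning 0 -> 0 and pi -> pi (an assumption)."""
    x = np.concatenate(([0.0], breakpoints_in, [np.pi]))
    y = np.concatenate(([0.0], breakpoints_out, [np.pi]))
    if len(x) != len(y):
        raise ValueError("breakpoint lists must have the same length")
    if np.any(np.diff(x) <= 0) or np.any(np.diff(y) <= 0):
        raise ValueError("breakpoints must be strictly increasing inside (0, pi)")
    return x, y


def piecewise_linear_warp(omega, breakpoints_in, breakpoints_out):
    """Map normalized frequencies through a monotone piecewise linear function.

    One interior breakpoint gives a two-segment function; adding more
    breakpoints gives more segments and a more detailed warping.
    """
    x, y = _knots(breakpoints_in, breakpoints_out)
    return np.interp(omega, x, y)


def warp_spectrum(magnitude, breakpoints_in, breakpoints_out):
    """Resample a magnitude spectrum so content at omega moves to warp(omega)."""
    x, y = _knots(breakpoints_in, breakpoints_out)
    omega = np.linspace(0.0, np.pi, len(magnitude))
    # For every output bin, look up the source frequency with the inverse
    # warp (swap the roles of x and y), then interpolate the source spectrum.
    source_omega = np.interp(omega, y, x)
    return np.interp(source_omega, omega, magnitude)


if __name__ == "__main__":
    spectrum = np.zeros(101)
    spectrum[50] = 1.0  # a single peak at omega = pi/2
    warped = warp_spectrum(spectrum, [np.pi / 2], [np.pi / 4])
    print(int(np.argmax(warped)))  # index of the peak after warping
```

With a single interior breakpoint this reduces to a two-segment function; passing longer breakpoint lists yields the several-segment case. How the breakpoints are chosen for a given source and target speaker is not covered by this sketch.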
Pages: 556-559
Number of pages: 4
Related papers
50 in total
  • [1] VTLN-based cross-language voice conversion
    Sündermann, D
    Ney, H
    Höge, H
    ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003: 676-681
  • [2] Voice characteristics conversion for TTS using reverse VTLN
    Eichner, M
    Wolff, M
    Hoffmann, R
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004: 17-20
  • [3] STATISTICAL VOICE CONVERSION BASED ON WAVENET
    Niwa, Jumpei
    Yoshimura, Takenori
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5289-5293
  • [4] Controllable voice conversion based on quantization of voice factor scores
    Isako, Takumi
    Onishi, Kotaro
    Kishida, Takuya
    Nakashika, Toru
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022: 1444-1448
  • [5] Face-based Voice Conversion: Learning the Voice behind a Face
    Lu, Hsiao-Han
    Weng, Shao-En
    Yen, Ya-Fan
    Shuai, Hong-Han
    Cheng, Wen-Huang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 496-505
  • [6] Voice Timbre Control Based on Perceived Age in Singing Voice Conversion
    Kobayashi, Kazuhiro
    Toda, Tomoki
    Doi, Hironori
    Nakano, Tomoyasu
    Goto, Masataka
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (06): 1419-1428
  • [7] Voice Conversion Based on Locally Linear Embedding
    Hwang, Hsin-Te
    Wu, Yi-Chiao
    Peng, Yu-Huai
    Hsu, Chin-Cheng
    Tsao, Yu
    Wang, Hsin-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2018, 34 (06): 1493-1516
  • [8] Voice Conversion Based on Mixtures of Factor Analyzers
    Uto, Yosuke
    Nankaku, Yoshihiko
    Toda, Tomoki
    Lee, Akinobu
    Tokuda, Keiichi
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 2278+
  • [9] Voice Conversion Based on Weighted Frequency Warping
    Erro, Daniel
    Moreno, Asuncion
    Bonafonte, Antonio
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): 922-931
  • [10] VOICE CONVERSION BASED ON A MIXTURE DENSITY NETWORK
    Ahangar, Mohsen
    Ghorbandoost, Mostafa
    Sharma, Sudhendu
    Smith, Mark J. T.
    2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017: 329-333