Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion

Cited by: 0
Authors
Sisman, Berrak [1]
Li, Haizhou [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
Keywords
Wavelet transform; prosody analysis; voice conversion
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Thus far, voice conversion studies have focused mainly on spectrum conversion. However, speaker identity is also characterized by prosody features such as fundamental frequency (F0) and energy contour. We believe that a better understanding of speaker-dependent and speaker-independent prosody features will allow us to devise an analytic approach that handles voice conversion more effectively. We consider that speaker-dependent features reflect the speaker's individuality, while speaker-independent features reflect the expression of linguistic content. Therefore, the former are to be converted, while the latter are to be carried over from source to target during conversion. To achieve this, we analyze speaker-dependent and speaker-independent prosody patterns at different temporal scales using the wavelet transform. The centrepiece of this paper is the understanding that a speech utterance can be characterized by speaker-dependent and speaker-independent features in its prosodic manifestations. Experiments show that the proposed prosody analysis scheme consistently improves prosody conversion performance under the sparse representation framework.
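To illustrate what analyzing prosody "in different temporal scales" can look like, the following is a minimal sketch of a continuous wavelet transform applied to an F0 contour. It is not the authors' implementation: the abstract does not name the mother wavelet, the scales, or any preprocessing, so the Mexican hat wavelet, the dyadic scales, the reflect padding, the synthetic contour, and the function names `mexican_hat`, `cwt`, and `dominant_scale` are all assumptions made for illustration.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet at dimensionless time t (assumed choice)."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)

def cwt(signal, scales):
    """Continuous wavelet transform; rows are scales, columns are frames."""
    signal = np.asarray(signal, dtype=float)
    coeffs = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        half = int(np.ceil(5 * s))             # wavelet support of +/- 5 scales
        t = np.arange(-half, half + 1) / s
        kernel = mexican_hat(t) / np.sqrt(s)
        # Reflect-pad so that each input frame gets exactly one coefficient.
        # This requires the signal to be longer than `half` frames.
        padded = np.pad(signal, half, mode="reflect")
        coeffs[i] = np.convolve(padded, kernel, mode="valid")
    return coeffs

def dominant_scale(signal, scales):
    """Scale (in frames) whose coefficients carry the most energy."""
    energy = (cwt(signal, scales) ** 2).sum(axis=1)
    return scales[int(np.argmax(energy))]

# Synthetic stand-in for an interpolated log-F0 contour (400 frames).
frames = np.arange(400)
slow = 0.30 * np.sin(2 * np.pi * frames / 200)    # phrase-like movement
fast = 0.10 * np.sin(2 * np.pi * frames / 20)     # syllable-like movement
log_f0 = slow + fast
log_f0 = (log_f0 - log_f0.mean()) / log_f0.std()  # zero mean, unit variance

scales = [2, 4, 8, 16, 32]                        # assumed dyadic scales, in frames
coeffs = cwt(log_f0, scales)
print(coeffs.shape)
print(dominant_scale(slow, scales), dominant_scale(fast, scales))
```

In this sketch the slow component concentrates its energy at a larger scale than the fast component, which is the kind of separation a scale-wise analysis of speaker-dependent and speaker-independent prosody would build on. Deciding which scales are speaker dependent is the subject of the paper and is not attempted here.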
Pages: 52-56
Page count: 5