Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck

Cited by: 0
Authors
Lee, Sang-Hoon [1]
Noh, Hyeong-Rae [1]
Nam, Woo-Jeoung [2]
Lee, Seong-Whan [3]
Affiliations
[1] Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
[2] Department of Computer and Radio Communications Engineering, Korea University, Seoul 02841, Republic of Korea
[3] Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea
Pages: 1173-1183
Related papers
36 in total
  • [1] Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck
    Lee, Sang-Hoon
    Noh, Hyeong-Rae
    Nam, Woo-Jeoung
    Lee, Seong-Whan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1173 - 1183
  • [2] Phoneme-based spectral voice conversion using temporal decomposition and Gaussian mixture model
    Nguyen, Binh Phu
    Akagi, Masato
    2008 SECOND INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS, 2008, : 222 - 227
  • [3] High quality voice conversion through phoneme-based linear mapping functions with STRAIGHT for mandarin
    Liu, Kun
    Zhang, Jianping
    Yan, Yonghong
    FOURTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2007, : 410 - 414
  • [4] Phoneme-based speech recognition via fuzzy neural networks modeling and learning
    Kasabov, NK
    Kozma, R
    Watts, MJ
    INFORMATION SCIENCES, 1998, 110 (1-2) : 61 - 79
  • [5] Phoneme Background Model for Information Bottleneck based Speaker Diarization
    Yella, Sree Harsha
    Motlicek, Petr
    Bourlard, Herve
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 597 - 601
  • [6] Controllable voice conversion based on quantization of voice factor scores
    Isako, Takumi
    Onishi, Kotaro
    Kishida, Takuya
    Nakashika, Toru
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1444 - 1448
  • [7] Disentangling Content and Fine-Grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion
    Zhao, Xintao
    Liu, Feng
    Song, Changhe
    Wu, Zhiyong
    Kang, Shiyin
    Tuo, Deyi
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7022 - 7026
  • [8] Phoneme Hallucinator: One-Shot Voice Conversion via Set Expansion
    Shan, Siyuan
    Li, Yang
    Banerjee, Amartya
    Oliva, Junier B.
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14910 - 14918
  • [9] Controllable Speech Representation Learning via Voice Conversion and AIC Loss
    Wang, Yunyun
    Su, Jiaqi
    Finkelstein, Adam
    Jin, Zeyu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6682 - 6686
  • [10] Phoneme Cluster Based State Mapping for Text-Independent Voice Conversion
    Zhang, Meng
    Tao, Jiaohua
    Nurminen, Jani
    Tian, Jilei
    Wang, Xia
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4281 - +