Multimodal voice conversion based on non-negative matrix factorization

Cited by: 0
Authors
Kenta Masaka
Ryo Aihara
Tetsuya Takiguchi
Yasuo Ariki
Affiliations
[1] Kobe University,Graduate School of System Informatics
[2] Kobe University,Organization of Advanced Science and Technology
Keywords
Voice conversion; Multimodal; Image features; Non-negative matrix factorization; Noise robustness
DOI
Not available
Abstract
A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the source and target speakers utter the same texts. The input source signal is decomposed into source exemplars, noise exemplars, and their weights, and the converted speech is then constructed from the target exemplars and the weights of the source exemplars. In this study, we propose multimodal VC that improves the noise robustness of our NMF-based VC method. We introduce a combination weight between the audio and visual features and formulate a new cost function to estimate audio-visual exemplars. Using joint audio-visual features as source features improves VC performance over that of our previous audio-input exemplar-based VC method. The effectiveness of the proposed method is confirmed by comparison with a conventional audio-input NMF-based method and a Gaussian mixture model-based method.
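The exemplar-based pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function, so the code assumes a standard KL-divergence NMF with a fixed dictionary and multiplicative updates, and it assumes the combination weight simply scales the visual features before they are stacked under the audio features. The names `estimate_activations`, `convert`, `stack_av`, `alpha`, and `sparsity` are hypothetical.

```python
import numpy as np


def stack_av(audio, visual, alpha):
    """Stack audio and visual features frame by frame (columns are frames).

    Assumption: the combination weight `alpha` scales the visual part.
    """
    return np.vstack([audio, alpha * visual])


def estimate_activations(X, D, n_iter=200, sparsity=0.0, eps=1e-12, seed=0):
    """Estimate non-negative weights H with D fixed so that X ~ D @ H.

    Uses the multiplicative update for the KL divergence, with an
    optional L1 penalty (`sparsity`) on the weights.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    col_sums = D.sum(axis=0)[:, None]  # D^T 1, one value per exemplar
    for _ in range(n_iter):
        ratio = X / (D @ H + eps)
        H *= (D.T @ ratio) / (col_sums + sparsity + eps)
    return H


def convert(X_noisy, A_src, A_tgt, N, **kwargs):
    """Decompose the noisy input over [source exemplars, noise exemplars],
    then rebuild speech from the target exemplars and the source weights."""
    D = np.hstack([A_src, N])
    H = estimate_activations(X_noisy, D, **kwargs)
    H_src = H[: A_src.shape[1]]  # discard the weights of the noise exemplars
    return A_tgt @ H_src, H_src


# Usage with random stand-in data (real exemplars would come from
# parallel training utterances of the source and target speakers).
rng = np.random.default_rng(1)
n_audio, n_visual, n_tgt, n_ex, n_noise, n_frames = 20, 6, 20, 5, 3, 30
A_src = stack_av(rng.random((n_audio, n_ex)), rng.random((n_visual, n_ex)), alpha=0.5)
A_tgt = rng.random((n_tgt, n_ex))
N = rng.random((n_audio + n_visual, n_noise))
X = A_src @ rng.random((n_ex, n_frames)) + N @ rng.random((n_noise, n_frames))
Y, H_src = convert(X, A_src, A_tgt, N, n_iter=100, sparsity=0.1)
```

Because the source and target exemplars are paired column by column, the weights estimated for the source dictionary can be reused directly with the target dictionary; the noise weights are simply dropped, which is what gives the method its noise robustness in this sketch.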