VOICE CONVERSION BASED ON MATRIX VARIATE GAUSSIAN MIXTURE MODEL

Cited by: 0
Authors
Saito, Daisuke [1]
Doi, Hidenobu [1]
Minematsu, Nobuaki [1]
Hirose, Keikichi [1]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
Keywords
Voice conversion; Gaussian mixture model; matrix variate distribution; matrix variate normal; matrix variate Gaussian mixture model; speech recognition
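DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract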
This paper describes a novel approach to constructing a mapping function between a given speaker pair using probability density functions (PDFs) of matrix variates. In voice conversion studies, two important functions should be realized: 1) precise modeling of both the source and target feature spaces, and 2) construction of a proper transform function between these spaces. Voice conversion based on Gaussian mixture models (GMMs) is widely used because of its flexibility and ease of handling. In GMM-based approaches, a joint vector space of the source and target is first constructed, and the joint PDF of the two vectors is modeled as a GMM in the joint vector space. The joint vector approach mainly focuses on precise modeling of the 'joint' feature space, and does not always construct a proper transform between the two feature spaces. In contrast, the proposed method constructs the joint PDF as a GMM in a matrix variate space whose rows and columns respectively correspond to the two functions, and it has the potential to precisely model both the characteristics of the feature spaces and the relation between the source and target spaces. Experimental results show that the proposed method improves the performance of voice conversion.
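For orientation, the conventional joint-vector GMM mapping that the abstract contrasts against can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the authors' implementation and not the proposed matrix variate model: features are scalars rather than spectral vectors, the mixture parameters are assumed to be already trained, and the names `gaussian_pdf`, `convert`, and the dictionary keys are hypothetical. The converted value is the posterior-weighted conditional mean of the target given the source.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert(x, components):
    """Map a scalar source feature x to a target feature with a joint GMM.

    Each component is a dict (hypothetical layout) holding the mixture
    weight, the source/target means, the source variance, and the
    source-target covariance of one joint Gaussian.
    """
    # Unnormalized posterior of each component given the source feature only.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)

    y = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total
        # Conditional mean of the target given the source within this component.
        cond_mean = c["mu_y"] + c["cov_xy"] / c["var_x"] * (x - c["mu_x"])
        y += posterior * cond_mean
    return y


# Two symmetric components (illustrative values, not taken from the paper).
example_components = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_xy": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_xy": 0.5},
]
converted = convert(0.3, example_components)
```

In a real system the scalars would be replaced by feature vectors and full covariance blocks, and the source likelihoods would usually be computed in the log domain to avoid underflow for inputs far from every component.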
Pages: 567-571
Page count: 5