Voice Conversion in High-order Eigen Space Using Deep Belief Nets

Cited by: 0
Authors
Nakashika, Toru [1]
Takashima, Ryoichi [1]
Takiguchi, Tetsuya [2]
Ariki, Yasuo [2]
Affiliations
[1] Kobe Univ, Grad Sch Syst Informat, 1-1 Rokkodai, Kobe, Hyogo, Japan
[2] Kobe Univ, Org Adv Sci & Technol, Kobe, Hyogo, Japan
Keywords
voice conversion; deep learning; deep belief nets; speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a voice conversion technique that uses Deep Belief Nets (DBNs) to build high-order eigen spaces of the source and target speakers, in which converting the source speech to the target speech is easier than in the traditional cepstrum space. DBNs have a deep architecture that automatically discovers abstractions that maximally express the original input features. If a DBN is trained using only the speech of an individual speaker, the output features at its highest layer can be considered to contain less phonological information and relatively more speaker individuality. After training one DBN for the source speaker and one for the target speaker, we connect the two and convert the speaker-individuality abstractions using Neural Networks (NNs). The converted abstraction of the source speaker is then brought back to the cepstrum space using the inverse process of the target speaker's DBN. We conducted voice conversion experiments and confirmed the efficacy of our method with respect to subjective and objective criteria, comparing it with the conventional Gaussian Mixture Model-based method.
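The conversion path described in the abstract (source DBN encoding, NN mapping between the two high-order spaces, target DBN inverse pass) can be sketched roughly as follows. This is only an illustrative data-flow sketch, not the authors' implementation: the class `TinyDBN`, the function `convert`, the layer sizes, the tied (transposed) weights in the downward pass, the linear bottom layer, and the single-layer mapping are all assumptions, and the weights are random rather than trained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyDBN:
    """Hypothetical stack of RBM-style layers with tied weights.

    Weights are random and untrained; this only illustrates the data flow.
    """

    def __init__(self, sizes, rng):
        pairs = list(zip(sizes[:-1], sizes[1:]))
        self.weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in pairs]
        self.hid_bias = [np.zeros(n) for _, n in pairs]
        self.vis_bias = [np.zeros(m) for m, _ in pairs]

    def encode(self, x):
        # Upward pass: cepstral frames -> highest-layer abstraction.
        for W, b in zip(self.weights, self.hid_bias):
            x = sigmoid(x @ W + b)
        return x

    def decode(self, h):
        # Downward ("inverse") pass using the transposed weights.
        for i in range(len(self.weights) - 1, -1, -1):
            a = h @ self.weights[i].T + self.vis_bias[i]
            # Assumption: the bottom layer is linear, since cepstra are real-valued.
            h = a if i == 0 else sigmoid(a)
        return h


def convert(frames, src_dbn, tgt_dbn, W_map, b_map):
    """Source cepstra -> source abstraction -> target abstraction -> target cepstra."""
    h_src = src_dbn.encode(frames)
    h_tgt = sigmoid(h_src @ W_map + b_map)  # assumed one-layer NN mapping
    return tgt_dbn.decode(h_tgt)


rng = np.random.default_rng(0)
sizes = [24, 48, 24]            # assumed: 24-dim cepstra, two hidden layers
src_dbn = TinyDBN(sizes, rng)   # would be trained on source-speaker speech only
tgt_dbn = TinyDBN(sizes, rng)   # would be trained on target-speaker speech only
W_map = rng.normal(0.0, 0.1, (sizes[-1], sizes[-1]))
b_map = np.zeros(sizes[-1])

frames = rng.normal(size=(5, 24))  # five stand-in cepstral frames
converted = convert(frames, src_dbn, tgt_dbn, W_map, b_map)
print(converted.shape)
```

In a real system each DBN would first be pre-trained on one speaker's speech, and the mapping would then be fitted on time-aligned source/target frame pairs; neither training step is shown here.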
Pages: 369-372
Number of pages: 4