Voice Conversion in High-order Eigen Space Using Deep Belief Nets

Cited by: 0

Authors:

Nakashika, Toru [1]

Takashima, Ryoichi [1]

Takiguchi, Tetsuya [2]

Ariki, Yasuo [2]

Affiliations:

[1] Kobe Univ, Grad Sch Syst Informat, 1-1 Rokkodai, Kobe, Hyogo, Japan

[2] Kobe Univ, Org Adv Sci & Technol, Kobe, Hyogo, Japan

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

voice conversion; deep learning; deep belief nets; speech recognition

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents a voice conversion technique using Deep Belief Nets (DBNs) to build high-order eigen spaces of the source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. DBNs have a deep architecture that automatically discovers abstractions to maximally express the original input features. If we train the DBNs using only the speech of an individual speaker, it can be considered that there is less phonological information and relatively more speaker individuality in the output features at the highest layer. Training the DBNs for a source speaker and a target speaker, we can then connect and convert the speaker individuality abstractions using Neural Networks (NNs). The converted abstraction of the source speaker is then brought back to the cepstrum space using an inverse process of the DBNs of the target speaker. We conducted speaker voice conversion experiments and confirmed the efficacy of our method with respect to subjective and objective criteria, comparing it with the conventional Gaussian Mixture Model-based method.
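
The data flow described in the abstract (encode with the source speaker's DBN, map between the two high-order spaces with an NN, decode with the target speaker's DBN) can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `RBMStack` and `convert_frames`, the layer sizes, the single-layer mapping network, the linear bottom layer on decoding, and the random untrained weights. Layer-wise pretraining of the DBNs and training of the mapping network are omitted entirely.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBMStack:
    """Hypothetical stand-in for one speaker's DBN: a stack of RBM weight
    matrices with random (untrained) parameters."""

    def __init__(self, layer_sizes, rng):
        pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        self.weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in pairs]
        self.hid_bias = [np.zeros(n) for _, n in pairs]
        self.vis_bias = [np.zeros(m) for m, _ in pairs]

    def encode(self, x):
        """Bottom-up pass: cepstral frames -> highest-layer features."""
        h = x
        for W, b in zip(self.weights, self.hid_bias):
            h = sigmoid(h @ W + b)
        return h

    def decode(self, h):
        """Top-down pass with the transposed weights (the 'inverse process').
        Assumption: the bottom layer is linear so the output is real-valued."""
        v = h
        for i in reversed(range(len(self.weights))):
            pre = v @ self.weights[i].T + self.vis_bias[i]
            v = pre if i == 0 else sigmoid(pre)
        return v


def convert_frames(frames, src_dbn, tgt_dbn, nn_weight, nn_bias):
    """Source cepstrum -> source high-order space -> target high-order
    space -> target cepstrum."""
    src_high = src_dbn.encode(frames)
    tgt_high = sigmoid(src_high @ nn_weight + nn_bias)  # mapping network
    return tgt_dbn.decode(tgt_high), src_high


rng = np.random.default_rng(0)
sizes = [24, 48, 16]                    # hypothetical: 24-dim cepstra, 16-dim top layer
src_dbn = RBMStack(sizes, rng)          # would be trained on source speech only
tgt_dbn = RBMStack(sizes, rng)          # would be trained on target speech only
nn_weight = rng.normal(0.0, 0.1, (16, 16))
nn_bias = np.zeros(16)

source_frames = rng.normal(size=(5, 24))  # 5 dummy cepstral frames
converted, high_order = convert_frames(
    source_frames, src_dbn, tgt_dbn, nn_weight, nn_bias
)
```

With trained parameters, `converted` would hold frames in the target speaker's cepstrum space; here it only confirms that the shapes line up across the three stages.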


Pages: 369-372

Page count: 4

