High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Cited by: 0

Authors:

Fujii, Kei [1]

Okawa, Jun [1]

Suigetsu, Kaori [1]

Affiliations:

[1] Kumamoto Natl Coll Technol, Dept Informat & Comp Sci, Kohshi City, Kumamoto 8611102, Japan

Source:

PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007 | 2007 / Vol. 26

Keywords:

concatenative speech synthesis; join cost; speaker individuality; unit selection; voice conversion;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

Concatenative speech synthesis can produce natural-sounding speech that retains a speaker's individuality by drawing on a large speech corpus. Based on this method, we propose a voice conversion method whose converted speech has high individuality and naturalness. We also conducted two subjective evaluation experiments to assess the individuality and sound quality of the converted speech. The results confirm three facts: (a) the proposed method converts speaker individuality well, (b) introducing the unit selection framework of concatenative speech synthesis (especially the join cost) into conventional voice conversion improves the sound quality of the converted speech, and (c) the proposed method is robust to gender differences between the source speaker and the target speaker.

Pages: 483-488

Page count: 6

50 items in total

[31] Concatenative speech synthesis based on the plural unit selection and fusion method
Mizutani, T
Kagoshima, T
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (11): 2565-2572
[32] Automatic Labeling Schemes for Concatenative Speech Synthesis
Kacur, Juraj
Cepko, Jozef
Palenik, Andrej
PROCEEDINGS ELMAR-2008, VOLS 1 AND 2, 2008: 639-642
[33] Spectral voice conversion for text-to-speech synthesis
Kain, A
Macon, MW
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998: 285-288
[34] Synthesis of Child Speech With HMM Adaptation and Voice Conversion
Watts, Oliver
Yamagishi, Junichi
King, Simon
Berkling, Kay
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): 1005-1016
[35] SPEECH SEGMENT SELECTION FOR CONCATENATIVE SYNTHESIS BASED ON SPECTRAL DISTORTION MINIMIZATION
IWAHASHI, N
KAIKI, N
SAGISAKA, Y
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1993, E76A (11): 1942-1948
[36] Acoustic speech unit segmentation for concatenative synthesis
Torres, H. M.
Gurlekian, J. A.
COMPUTER SPEECH AND LANGUAGE, 2008, 22 (02): 196-206
[37] Control of spectral dynamics in concatenative speech synthesis
Wouters, J
Macon, MW
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (01): 30-38
[38] A Comparison of Voice Conversion Methods for Transforming Voice Quality in Emotional Speech Synthesis
Tuerk, Oytun
Schroeder, Marc
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 2282-2285
[39] Nonlinear speech features for the objective detection of discontinuities in concatenative speech synthesis
Pantazis, Y
Stylianou, Y
NONLINEAR SPEECH MODELING AND APPLICATIONS, 2005, 3445: 375-383
[40] Speech Analysis/Synthesis by Gaussian Mixture Approximation of the Speech Spectrum for Voice Conversion
Amini, Jamal
Shahrebabaki, Abdoreza Sabzi
Shokouhi, Navid
Sheikhzadeh, Hamid
Raahemifa, Kaamran
Eslami, Mehdi
2013 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (IEEE ISSPIT 2013), 2013: 428-433