High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Cited by: 0

Authors:

Fujii, Kei [1]

Okawa, Jun [1]

Suigetsu, Kaori [1]

Affiliations:

[1] Kumamoto Natl Coll Technol, Dept Informat & Comp Sci, Kohshi City, Kumamoto 8611102, Japan

Source:

PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007 | 2007 / Vol. 26

Keywords:

concatenative speech synthesis; join cost; speaker individuality; unit selection; voice conversion;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods];

Discipline classification code:

081202 ;

Abstract:

Concatenative speech synthesis can produce speech that is natural and retains a high degree of speaker individuality by drawing on a large speech corpus. Building on this method, we propose a voice conversion method whose converted speech has both high individuality and naturalness. We also conduct two subjective evaluation experiments to assess the individuality and sound quality of the converted speech. The results confirm three facts: (a) the proposed method converts speaker individuality well, (b) introducing the unit selection framework of concatenative speech synthesis (especially the join cost) into conventional voice conversion improves the sound quality of the converted speech, and (c) the proposed method is robust to gender differences between the source speaker and the target speaker.

Pages: 483-488

Number of pages: 6
