Multimodal voice conversion based on non-negative matrix factorization

Cited by: 0

Authors:

Kenta Masaka

Ryo Aihara

Tetsuya Takiguchi

Yasuo Ariki

Affiliations:

[1] Kobe University, Graduate School of System Informatics

[2] Kobe University, Organization of Advanced Science and Technology

Source:

EURASIP Journal on Audio, Speech, and Music Processing, Volume 2015

Keywords:

Voice conversion; Multimodal; Image features; Non-negative matrix factorization; Noise robustness

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel training data, in which the source and target speakers utter the same texts. The input source signal is decomposed into source exemplars, noise exemplars, and their weights, and the converted speech is then constructed from the target exemplars and the weights associated with the source exemplars. In this study, we propose a multimodal VC method that improves the noise robustness of our NMF-based VC method. Furthermore, we introduce a combination weight between the audio and visual features and formulate a new cost function to estimate the audio-visual exemplars. Using joint audio-visual features as the source features improves VC performance over that of the previous audio-input exemplar-based VC method. The effectiveness of the proposed method is confirmed by comparison with a conventional audio-input NMF-based method and a Gaussian mixture model-based method.
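The exemplar-based decomposition summarized in the abstract can be illustrated with a minimal sketch. It covers only the audio-input baseline (a noisy input decomposed over source and noise exemplars, with the source weights reused on the target exemplars), not the proposed audio-visual cost function, which the abstract does not specify. The KL-divergence objective, the multiplicative update rule, the `sparsity` parameter, the function names, and the random synthetic dictionaries are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def kl_divergence(X, X_hat, eps=1e-12):
    """Generalized KL divergence between non-negative matrices X and X_hat."""
    return float(np.sum(X * np.log((X + eps) / (X_hat + eps)) - X + X_hat))


def estimate_activations(X, D, n_iter=200, sparsity=0.0, eps=1e-12, seed=0):
    """Estimate non-negative weights H such that X ~= D @ H, with D held fixed.

    Uses the standard multiplicative update for the KL objective with an
    optional L1 penalty on H (assumed here; the abstract names no objective).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((D.shape[1], X.shape[1])) + 0.1   # strictly positive start
    denom = D.sum(axis=0)[:, None] + sparsity + eps  # D^T 1 + lambda
    for _ in range(n_iter):
        H *= (D.T @ (X / (D @ H + eps))) / denom
    return H


def convert(X_noisy, A_src, A_tgt, N_noise, n_iter=200):
    """Decompose the noisy input, then rebuild it from the target exemplars."""
    D = np.hstack([A_src, N_noise])   # joint dictionary: [source | noise]
    H = estimate_activations(X_noisy, D, n_iter=n_iter)
    H_src = H[: A_src.shape[1]]       # keep only the source-exemplar weights
    return A_tgt @ H_src, H           # noise weights are discarded in the output


# Synthetic demo: random non-negative "spectra" stand in for real features.
rng = np.random.default_rng(1)
F, J, K_noise, T = 16, 8, 3, 5        # feature bins, exemplars, noise bases, frames
A_src = rng.random((F, J))            # hypothetical source exemplars
A_tgt = rng.random((F, J))            # hypothetical parallel target exemplars
N_noise = rng.random((F, K_noise))    # hypothetical noise exemplars
X = A_src @ rng.random((J, T)) + 0.3 * (N_noise @ rng.random((K_noise, T)))

Y, H = convert(X, A_src, A_tgt, N_noise)
print(Y.shape)
```

Because the source and target exemplars come from parallel data, column j of `A_src` and column j of `A_tgt` are assumed to correspond to the same aligned frame, which is what allows the source weights to be applied to the target dictionary unchanged.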

