MOS and pair comparison combined methods for quality evaluation of text-to-speech systems

Cited by: 0

Authors:

Salza, PL

Foti, E

Nebbia, L

Oreglia, M

Affiliation:

Source:

ACUSTICA | 1996 / Vol. 82 / No. 04

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

The overall quality of three Text-To-Speech (TTS) synthesis systems for Italian with common prosodic control but different diphones and synthesizers was evaluated by means of the combined application of Mean Opinion Score (MOS) and Pair Comparison methods. Direct comparison between the two methods serves to validate MOS, which is the technique recommended by CCITT for synthesis evaluation. In the MOS experiment, assessment also included three types of natural speech (normal and degraded) as reference. Eighteen subjects expressed 2880 MOS judgements and made 720 comparisons in all. The results obtained from the two methods showed good agreement. The most important MOS voice parameters used by listeners for differentiating the systems were Global Impression, Voice, Articulation and Pronunciation. The diphones appeared to contribute most to the different judgements, whereas synthesizers were not perceived as different by listeners. This experiment provides positive verification of interlaboratory reproducibility of MOS, which proved to be an effective technique for overall assessment of TTS quality.

Pages: 650-656

Page count: 7
