MOS and pair comparison combined methods for quality evaluation of text-to-speech systems

Cited by: 0
Authors
Salza, PL
Foti, E
Nebbia, L
Oreglia, M
Source
ACUSTICA | 1996 / Vol. 82 / No. 04
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification number
070206 ; 082403 ;
Abstract
The overall quality of three Text-To-Speech (TTS) synthesis systems for Italian with common prosodic control but different diphones and synthesizers was evaluated by means of the combined application of Mean Opinion Score (MOS) and Pair Comparison methods. Direct comparison between the two methods serves to validate MOS, which is the technique recommended by CCITT for synthesis evaluation. In the MOS experiment, assessment also included three types of natural speech (normal and degraded) as reference. Eighteen subjects expressed 2880 MOS judgements and made 720 comparisons in all. The results obtained from the two methods showed good agreement. The most important MOS voice parameters used by listeners for differentiating the systems were Global Impression, Voice, Articulation and Pronunciation. The diphones appeared to contribute most to the different judgements, whereas synthesizers were not perceived as different by listeners. This experiment provides positive verification of interlaboratory reproducibility of MOS, which proved to be an effective technique for overall assessment of TTS quality.
Pages: 650-656
Page count: 7
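The abstract compares two listening-test methods, Mean Opinion Score and Pair Comparison, and reports that their results agree. The sketch below illustrates, on purely hypothetical toy data, how such an agreement check could be set up: it averages 1-5 ratings per system, computes each system's share of won comparisons, and compares the two rankings. The system labels, the 1-5 scale, the function names and the data are assumptions made for illustration; they are not the paper's stimuli, scales or analysis procedure.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the ratings per system.

    ratings: iterable of (system, score) pairs; a 1-5 scale is assumed here.
    """
    by_system = defaultdict(list)
    for system, score in ratings:
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


def preference_rates(comparisons):
    """Share of comparisons each system won, out of those it took part in.

    comparisons: iterable of (system_a, system_b, winner) triples.
    """
    wins = defaultdict(int)
    trials = defaultdict(int)
    for a, b, winner in comparisons:
        trials[a] += 1
        trials[b] += 1
        wins[winner] += 1
    return {system: wins[system] / trials[system] for system in trials}


def ranking(scores):
    """Systems ordered from best to worst score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data, NOT taken from the paper.
toy_ratings = [("A", 4), ("A", 5), ("B", 3), ("B", 4), ("C", 2), ("C", 3)]
toy_comparisons = [
    ("A", "B", "A"), ("A", "C", "A"), ("B", "C", "B"),
    ("A", "B", "A"), ("A", "C", "C"), ("B", "C", "B"),
]

mos = mean_opinion_scores(toy_ratings)
prefs = preference_rates(toy_comparisons)
methods_agree = ranking(mos) == ranking(prefs)
```

Comparing rankings is only the simplest possible notion of agreement; a real study would typically also test whether the differences between systems are statistically significant. For scale, the totals reported in the abstract correspond to 2880 / 18 = 160 MOS judgements and 720 / 18 = 40 comparisons per subject, if the load was split evenly across listeners.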