MOS and pair comparison combined methods for quality evaluation of text-to-speech systems

Authors
Salza, PL
Foti, E
Nebbia, L
Oreglia, M
Source
ACUSTICA | 1996 / Vol. 82 / No. 04
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The overall quality of three Text-To-Speech (TTS) synthesis systems for Italian, with common prosodic control but different diphones and synthesizers, was evaluated by means of the combined application of the Mean Opinion Score (MOS) and Pair Comparison methods. Direct comparison between the two methods serves to validate MOS, which is the technique recommended by CCITT for synthesis evaluation. In the MOS experiment, assessment also included three types of natural speech (normal and degraded) as reference. Eighteen subjects expressed 2880 MOS judgements and made 720 comparisons in all. The results obtained from the two methods showed good agreement. The most important MOS voice parameters used by listeners for differentiating the systems were Global Impression, Voice, Articulation and Pronunciation. The diphones appeared to contribute most to the different judgements, whereas synthesizers were not perceived as different by listeners. This experiment provides positive verification of interlaboratory reproducibility of MOS, which proved to be an effective technique for overall assessment of TTS quality.
Pages: 650-656
Page count: 7