On the Importance of Audiovisual Coherence for the Perceived Quality of Synthesized Visual Speech

Cited by: 14
Authors
Mattheyses, Wesley [1]
Latacz, Lukas [1]
Verhelst, Werner [1]
Affiliations
[1] Vrije Univ Brussel, Interdisciplinary Inst Broadband Technol IBBT, Dept ETRO DSSP, B-1050 Brussels, Belgium
Keywords
SYNTHETIC TALKING FACES
DOI
10.1155/2009/169819
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Audiovisual text-to-speech systems convert a written text into an audiovisual speech signal. Typically, the visual mode of the synthetic speech is synthesized separately from the audio, the latter being either natural or synthesized speech. However, the perception of mismatches between these two information streams requires experimental exploration since it could degrade the quality of the output. In order to increase the intermodal coherence in synthetic 2D photorealistic speech, we extended the well-known unit selection audio synthesis technique to work with multimodal segments containing original combinations of audio and video. Subjective experiments confirm that the audiovisual signals created by our multimodal synthesis strategy are indeed perceived as being more synchronous than those of systems in which both modes are not intrinsically coherent. Furthermore, it is shown that the degree of coherence between the auditory mode and the visual mode has an influence on the perceived quality of the synthetic visual speech fragment. In addition, the audio quality was found to have only a minor influence on the perceived visual signal's quality. Copyright (C) 2009 Wesley Mattheyses et al.
Pages: 12