Accurate Visual Speech Synthesis Based on Diviseme Unit Selection and Concatenation

Authors
Jiang, Dongmei [1]
Ravyse, Ilse [2]
Sahli, Hichem [2]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Joint Res Grp Audio Visual Signal Proc, 127 Youyi Xilu, Xian 710072, Peoples R China
[2] Vrije Univ Brussel, Dept ETRO, B-1050 Brussels, Belgium
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel speech-driven approach to accurate, realistic visual speech synthesis. First, an audio-visual instance database is built for different viseme context combinations, i.e. diviseme units, using 100 audio-visual speech sentences from a female speaker. Then, a diviseme instance selection algorithm is introduced to choose the optimal diviseme instances for the viseme contexts in the input speech, considering the concatenation smoothness of the image sequences, the match between the mouth movements and the acoustic pronunciation process, and the intensity of the input speech. Finally, the mouth image sequences of the corresponding viseme segments in the selected diviseme instances are time-warped and blended to construct the mouth images of the final animation. Visual speech synthesis experiments and subjective evaluation results show that the resulting mouth animations are not only realistic, with clear and smooth mouth images, but also in good accordance with the acoustic pronunciation and intensity of the input speech.
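The selection step described in the abstract can be illustrated with a generic unit-selection search. The sketch below is not the authors' algorithm: it assumes a standard dynamic-programming (Viterbi-style) formulation in which each viseme context has several candidate instances, a target cost measures the mismatch with the input speech, and a concatenation cost measures the visual discontinuity between neighbouring instances. The function `select_instances`, the tuple layout `(start_opening, end_opening, intensity)`, and both cost functions are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Unit = TypeVar("Unit")


def select_instances(
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[int, Unit], float],
    concat_cost: Callable[[Unit, Unit], float],
) -> Tuple[List[Unit], float]:
    """Pick one candidate per slot, minimising target + concatenation cost."""
    if not candidates or any(len(slot) == 0 for slot in candidates):
        raise ValueError("every slot needs at least one candidate")

    # best[j]: cheapest cost of any path ending at candidate j of the current slot
    best = [target_cost(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor index in slot t-1

    for t in range(1, len(candidates)):
        new_best, pointers = [], []
        for cur in candidates[t]:
            costs = [
                best[i] + concat_cost(prev, cur)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] + target_cost(t, cur))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards from the last slot.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[t][k] for t, k in enumerate(path)], total


# Hypothetical toy data: (start_opening, end_opening, intensity) per instance.
candidates = [
    [(0.0, 0.5, 0.2), (0.0, 0.9, 0.8)],
    [(0.5, 0.1, 0.3), (0.9, 0.0, 0.7)],
]
desired_intensity = [0.8, 0.7]  # assumed per-slot intensity of the input speech

chosen, total = select_instances(
    candidates,
    # Target cost: mismatch between instance intensity and input intensity.
    target_cost=lambda t, u: abs(u[2] - desired_intensity[t]),
    # Concatenation cost: jump in mouth opening at the join.
    concat_cost=lambda a, b: abs(a[1] - b[0]),
)
print(chosen, total)
```

In a real system the scalar mouth-opening values would be replaced by image or feature distances, and the selected sequences would still need to be time-warped and blended as the abstract describes; neither step is covered by this sketch.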
Pages: 910 / +
Page count: 2