Accurate Visual Speech Synthesis Based on Diviseme Unit Selection and Concatenation

Cited by: 0
Authors
Jiang, Dongmei [1]
Ravyse, Ilse [2]
Sahli, Hichem [2]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Joint Res Grp Audio Visual Signal Proc, 127 Youyi Xilu, Xian 710072, Peoples R China
[2] Vrije Univ Brussel, Dept ETRO, B-1050 Brussels, Belgium
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel speech-driven approach to accurate, realistic visual speech synthesis. First, an audio-visual instance database is built for different viseme context combinations, i.e. diviseme units, using 100 audio-visual speech sentences of a female speaker. Then a diviseme instance selection algorithm is introduced to choose the optimal diviseme instances for the viseme contexts in the input speech, considering the concatenation smoothness of the image sequences, the matching of the mouth movements to the acoustic pronunciation process, and the intensity of the input speech. Finally, the mouth image sequences of corresponding viseme segments in the selected diviseme instances are time-warped and blended to construct the mouth images of the final animation. Visual speech synthesis experiments and subjective evaluation results show that the resulting mouth animations are not only realistic, with clear and smooth mouth images, but also in good accordance with the acoustic pronunciation and intensity of the input speech.
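The selection step described in the abstract can be illustrated with a generic unit-selection search. The sketch below is not the authors' algorithm: it is a minimal Viterbi-style dynamic program that assumes each diviseme slot has a list of candidate instances, a hypothetical `target_cost` (standing in for the match to the acoustic pronunciation and intensity), and a hypothetical `join_cost` (standing in for concatenation smoothness between neighbouring image sequences). The function name and both cost signatures are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def select_instances(
    candidates: Sequence[Sequence[T]],
    target_cost: Callable[[int, T], float],
    join_cost: Callable[[T, T], float],
) -> Tuple[List[T], float]:
    """Pick one candidate per slot so that the summed target and join
    costs are minimal (generic unit selection, hypothetical cost functions)."""
    if not candidates or any(len(slot) == 0 for slot in candidates):
        raise ValueError("every slot needs at least one candidate")

    # best[j]: cheapest cost of any path ending in candidate j of the current slot
    best = [target_cost(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index for slot i

    for i in range(1, len(candidates)):
        new_best: List[float] = []
        pointers: List[int] = []
        for c in candidates[i]:
            # cost of reaching c from each candidate of the previous slot
            costs = [best[k] + join_cost(p, c)
                     for k, p in enumerate(candidates[i - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(i, c))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # trace the cheapest path back from the last slot to the first
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

In a real system the candidates would be database instances with image and audio features, and the two cost functions would encode whatever smoothness and audio-matching measures the paper actually defines; those details are not given in the abstract.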
Pages: 910 / +
Page count: 2
Related papers
50 in total
  • [21] Concatenative speech synthesis based on the plural unit selection and fusion method
    Mizutani, T
    Kagoshima, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (11): : 2565 - 2572
  • [22] Continuity Metric for Unit Selection based Text-to-Speech Synthesis
    Lakkavalli, Vikram Ramesh
    Arulmozhi, P.
    Ramakrishnan, A. G.
    2010 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2010,
  • [23] Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis
    Windmann, Andreas
    Jauk, Igor
    Tamburini, Fabio
    Wagner, Petra
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 332 - +
  • [24] English speech synthesis using CART-based unit selection
    Pei, Dingyu
    Chai, Peiqi
    Zeng, Lingping
    Jisuanji Gongcheng/Computer Engineering, 2006, 32 (03): : 223 - 225
  • [25] AlpSynth - Concatenation-based speech synthesis for the Slovenian language
    Gros, JZ
    Mihelic, A
    Pavesic, N
    Zganec, M
    Gruden, S
    Proceedings ELMAR-2005, 2005, : 213 - 216
  • [26] Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis
    Sakai, Shinsuke
    Kawahara, Tatsuya
    Kawai, Hisashi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (10): : 2006 - 2014
  • [27] An embedded English synthesis approach based on speech concatenation and smoothing
    Chen, GL
    Yue, DJ
    Zu, YQ
    Yu, ZL
    2004 International Symposium on Chinese Spoken Language Processing, Proceedings, 2004, : 157 - 160
  • [28] Assessing a Speaker for Fast Speech in Unit Selection Speech Synthesis
    Moers, Donata
    Wagner, Petra
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2015 - +
  • [29] Implementation and verification of speech database for unit selection speech synthesis
    Szklanny, Krzysztof
    Koszuta, Sebastian
    PROCEEDINGS OF THE 2017 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2017, : 1263 - 1267
  • [30] Unit Selection Model in Arabic Speech Synthesis
    Al-Saiyd, Nedhal A.
    Hijjawi, Mohammad
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2018, 18 (04): : 126 - 131