Comparison of phoneme and viseme based acoustic units for speech driven realistic lip animation

Authors
Bozkurt, Elif [1]
Erdem, Cigdem Eroglu [1]
Erzin, Engin [2]
Erdem, T. [1]
Oezkan, Mehmet [1]
Affiliations
[1] TUBITAK MAM TEKSEB, A-205, Gebze, Kocaeli, Turkey
[2] Koc Univ, Dept Elect & Elect Engn, Istanbul, Turkey
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural looking lip animation, synchronized with incoming speech, is essential for realistic character animation. In this work, we evaluate the performance of phone and viseme based acoustic units, with and without context information, for generating realistic lip synchronization using HMM based recognition systems. We conclude via objective evaluations that utilization of viseme based units with context information outperforms the other methods.
Pages: 422 / +
Page count: 2