Real-time speech-driven face animation with expressions using neural networks

Cited by: 70
Authors
Hong, PY [1]
Wen, Z [1]
Huang, TS [1]
Affiliations
[1] Univ Illinois, Beckman Inst Adv Sci & Technol, Urbana, IL 61801 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2002 / Vol. 13 / No. 04
Funding
National Science Foundation (USA);
Keywords
facial deformation modeling; facial motion analysis and synthesis; neural networks; real-time speech-driven; talking face with expressions;
DOI
10.1109/TNN.2002.1021892
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A real-time speech-driven synthetic talking face provides an effective multimodal communication interface in distributed collaboration environments. Nonverbal gestures such as facial expressions are important to human communication and should be considered by speech-driven face animation systems. In this paper, we present a framework that systematically addresses facial deformation modeling, automatic facial motion analysis, and real-time speech-driven face animation with expression using neural networks. Based on this framework, we learn a quantitative visual representation of the facial deformations, called the motion units (MUs). A facial deformation can be approximated by a linear combination of the MUs weighted by MU parameters (MUPs). We develop an MU-based facial motion tracking algorithm which is used to collect an audio-visual training database. Then, we construct a real-time audio-to-MUP mapping by training a set of neural networks using the collected audio-visual training database. The quantitative evaluation of the mapping shows the effectiveness of the proposed approach. Using the proposed method, we develop the functionality of real-time speech-driven face animation with expressions for the iFACE system. Experimental results show that the synthetic expressive talking face of the iFACE system is comparable with a real face in terms of the effectiveness of their influences on bimodal human emotion perception.
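The linear-combination idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the neutral shape, the array sizes, the random MU basis, and the function names `synthesize` and `estimate_mups` are all assumptions made for illustration. The sketch treats a deformation as a weighted sum of MU vectors, and recovers the MU parameters (MUPs) from an observed shape by ordinary least squares, which is one simple way such weights could be estimated.

```python
import numpy as np

def synthesize(neutral_shape, mus, mups):
    """Return neutral_shape plus the deformation sum_i mups[i] * mus[i].

    mus has shape (n_mus, n_coords); mups has shape (n_mus,).
    The neutral shape is an assumption of this sketch.
    """
    return neutral_shape + mups @ mus

def estimate_mups(neutral_shape, mus, observed_shape):
    """Least-squares fit of the MUPs that best explain observed_shape."""
    deformation = observed_shape - neutral_shape
    mups, *_ = np.linalg.lstsq(mus.T, deformation, rcond=None)
    return mups

# Hypothetical toy data: 3 MUs over 12 stacked vertex coordinates.
rng = np.random.default_rng(0)
n_coords, n_mus = 12, 3
neutral_shape = rng.normal(size=n_coords)
mus = rng.normal(size=(n_mus, n_coords))
true_mups = np.array([0.5, -1.0, 2.0])

shape = synthesize(neutral_shape, mus, true_mups)
recovered = estimate_mups(neutral_shape, mus, shape)
print(np.allclose(recovered, true_mups))
```

In a speech-driven setting such as the one the abstract describes, the weights would come from a trained audio-to-MUP mapping instead of from a fit to an observed shape; that mapping is not sketched here.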
Pages: 916 - 927
Page count: 12
Related papers
50 in total
  • [21] FACE IT!: A PIPELINE FOR REAL-TIME PERFORMANCE-DRIVEN FACIAL ANIMATION
    Barros, Jilliam Maria Diaz
    Golyanik, Vladislav
    Varanasi, Kiran
    Stricker, Didier
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 2209 - 2213
  • [22] Speech-Driven 3D Face Animation with Composite and Regional Facial Movements
    Wu, Haozhe
    Zhou, Songtao
    Jia, Jia
    Xing, Junliang
    Wen, Qi
    Wen, Xiang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6822 - 6830
  • [23] Automatic face cloning and animation - Using real-time facial feature tracking and speech acquisition
    Goto, T
    Kshirsagar, S
    Magnenat-Thalmann, N
    IEEE SIGNAL PROCESSING MAGAZINE, 2001, 18 (03) : 17 - 25
  • [24] REALTIME SPEECH-DRIVEN FACIAL ANIMATION USING GAUSSIAN MIXTURE MODELS
    Luo, Changwei
    Yu, Jun
    Li, Xian
    Wang, Zengfu
    2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2014,
  • [25] SPACE : Speech-driven Portrait Animation with Controllable Expression
    Gururani, Siddharth
    Mallya, Arun
    Wang, Ting-Chun
    Valle, Rafael
    Liu, Ming-Yu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 20857 - 20866
  • [26] Robust real-time face detection using hybrid neural networks
    Kim, Ho-Joon
    Lee, Juho
    Yang, Hyun-Seung
    COMPUTATIONAL INTELLIGENCE AND BIOINFORMATICS, PT 3, PROCEEDINGS, 2006, 4115 : 721 - 730
  • [27] Real-time lip-synch face animation driven by human voice
    Huang, FJ
    Chen, TH
    1998 IEEE SECOND WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 1998, : 352 - 357
  • [28] Speech-driven Face Reenactment for a Video Sequence
    Nakashima, Yuta
    Yasui, Takaaki
    Nguyen, Leon
    Babaguchi, Noboru
    ITE TRANSACTIONS ON MEDIA TECHNOLOGY AND APPLICATIONS, 2020, 8 (01) : 60 - 68
  • [29] Real-Time Detection of Face Mask Usage Using Convolutional Neural Networks
    Kanavos, Athanasios
    Papadimitriou, Orestis
    Al-Hussaeni, Khalil
    Maragoudakis, Manolis
    Karamitsos, Ioannis
    COMPUTERS, 2024, 13 (07)
  • [30] Multimodal Speech Driven Facial Shape Animation Using Deep Neural Networks
    Asadiabadi, Sasan
    Sadiq, Rizwan
    Erzin, Engin
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1508 - 1512