Audio-Driven Facial Animation with Deep Learning: A Survey

Cited by: 0
Authors
Jiang, Diqiong [1 ]
Chang, Jian [1 ]
You, Lihua [1 ]
Bian, Shaojun [2 ]
Kosk, Robert [1 ]
Maguire, Greg [3 ]
Affiliations
[1] Bournemouth Univ, Natl Ctr Comp Animat, Poole BH12 5BB, England
[2] Buckinghamshire New Univ, Sch Creat & Digital Ind, High Wycombe HP11 2JZ, England
[3] Ulster Univ, Belfast Sch Art, Belfast BT15 1ED, Northern Ireland
Funding
EU Horizon 2020;
Keywords
deep learning; audio processing; talking head; face generation; AUDIOVISUAL CORPUS; SPEECH;
DOI
10.3390/info15110675
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.
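The abstract describes methods that map audio inputs directly onto 3D facial meshes or 2D images. The following is a minimal sketch of what such an audio-to-mesh mapping can look like, assuming a simple PyTorch regressor from Mel-spectrogram frames to per-frame vertex offsets on a neutral template mesh; the module names, layer sizes, and the 5023-vertex template are illustrative assumptions and do not reproduce any specific method from the survey.

# Minimal sketch (assumption, not the survey's method): a generic audio-to-mesh
# regressor mapping per-frame audio features (Mel-spectrogram frames) to 3D
# facial vertex displacements added to a neutral template mesh.
import torch
import torch.nn as nn

class AudioToMeshRegressor(nn.Module):
    def __init__(self, n_mels=80, hidden=256, n_vertices=5023):
        super().__init__()
        # Audio encoder: temporal convolutions over the Mel-spectrogram frames.
        self.audio_encoder = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Decoder: per-frame vertex offsets, added to the template to animate it.
        self.vertex_decoder = nn.Linear(hidden, n_vertices * 3)
        self.n_vertices = n_vertices

    def forward(self, mel, template):
        # mel: (batch, n_mels, frames); template: (batch, n_vertices, 3)
        feat = self.audio_encoder(mel)            # (batch, hidden, frames)
        feat = feat.permute(0, 2, 1)              # (batch, frames, hidden)
        offsets = self.vertex_decoder(feat)       # (batch, frames, n_vertices*3)
        offsets = offsets.view(mel.size(0), -1, self.n_vertices, 3)
        return template.unsqueeze(1) + offsets    # animated vertices per frame

# Usage sketch: 100 Mel frames of audio driving a 5023-vertex face template.
model = AudioToMeshRegressor()
mel = torch.randn(1, 80, 100)
template = torch.randn(1, 5023, 3)
animated = model(mel, template)                   # shape (1, 100, 5023, 3)

Image-based pipelines follow the same idea, except the decoder synthesizes 2D frames (typically with a GAN or diffusion generator) instead of vertex offsets.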
Pages: 24
Related Papers
50 records in total
  • [1] Multi-Task Audio-Driven Facial Animation
    Kim, Youngsoo
    An, Shounan
    Jo, Youngbak
    Park, Seungje
    Kang, Shindong
    Oh, Insoo
    Kim, Duke Donghyun
    SIGGRAPH '19 - ACM SIGGRAPH 2019 POSTERS, 2019,
  • [2] Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion
    Karras, Tero
    Aila, Timo
    Laine, Samuli
    Herva, Antti
    Lehtinen, Jaakko
    ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04):
  • [3] VisemeNet: Audio-Driven Animator-Centric Speech Animation
    Zhou, Yang
    Xu, Zhan
    Landreth, Chris
    Kalogerakis, Evangelos
    Maji, Subhransu
    Singh, Karan
    ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04):
  • [4] UniTalker: Scaling up Audio-Driven 3D Facial Animation Through A Unified Model
    Fan, Xiangyu
    Li, Jiaqi
    Lin, Zhiqian
    Xiao, Weiye
    Yang, Lei
    COMPUTER VISION - ECCV 2024, PT XLI, 2025, 15099 : 204 - 221
  • [5] Audio-Driven Violin Performance Animation with Clear Fingering and Bowing
    Hirata, Asuka
    Tanaka, Keitaro
    Hamanaka, Masatoshi
    Morishima, Shigeo
    PROCEEDINGS OF SIGGRAPH 2022 POSTERS, SIGGRAPH 2022, 2022,
  • [6] Audio-driven emotional speech animation for interactive virtual characters
    Charalambous, Constantinos
    Yumak, Zerrin
    van der Stappen, A. Frank
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2019, 30 (3-4)
  • [7] Personalized Audio-Driven 3D Facial Animation via Style-Content Disentanglement
    Chai, Yujin
    Shao, Tianjia
    Weng, Yanlin
    Zhou, Kun
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (03) : 1803 - 1820
  • [8] EmoFace: Audio-driven Emotional 3D Face Animation
    Liu, Chang
    Lin, Qunfen
    Zeng, Zijiao
    Pan, Ye
    2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES, VR 2024, 2024, : 387 - 397
  • [9] Audio2AB: Audio-driven collaborative generation of virtual character animation
    Niu, Lichao
    Xie, Wenjun
    Wang, Dong
    Cao, Zhongrui
    Liu, Xiaoping
    VIRTUAL REALITY AND INTELLIGENT HARDWARE, 2024, 6 (01) : 56 - 70