Audio-Driven Facial Animation with Deep Learning: A Survey

Cited by: 0
Authors
Jiang, Diqiong [1 ]
Chang, Jian [1 ]
You, Lihua [1 ]
Bian, Shaojun [2 ]
Kosk, Robert [1 ]
Maguire, Greg [3 ]
Affiliations
[1] Bournemouth Univ, Natl Ctr Comp Animat, Poole BH12 5BB, England
[2] Buckinghamshire New Univ, Sch Creat & Digital Ind, High Wycombe HP11 2JZ, England
[3] Ulster Univ, Belfast Sch Art, Belfast BT15 1ED, Northern Ireland
Funding
EU Horizon 2020;
Keywords
deep learning; audio processing; talking head; face generation; AUDIOVISUAL CORPUS; SPEECH;
DOI
10.3390/info15110675
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.
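The abstract describes methods that map audio inputs directly onto 3D facial meshes or 2D images. The following is a minimal sketch of what such an audio-to-mesh mapping can look like, assuming a simple PyTorch regressor from Mel-spectrogram frames to per-frame vertex offsets on a neutral template mesh; the module names, layer sizes, and the 5023-vertex template are illustrative assumptions and do not reproduce any specific method from the survey.

# Minimal sketch (assumption, not the survey's method): a generic audio-to-mesh
# regressor mapping per-frame audio features (Mel-spectrogram frames) to 3D
# facial vertex displacements added to a neutral template mesh.
import torch
import torch.nn as nn

class AudioToMeshRegressor(nn.Module):
    def __init__(self, n_mels=80, hidden=256, n_vertices=5023):
        super().__init__()
        # Audio encoder: temporal convolutions over the Mel-spectrogram frames.
        self.audio_encoder = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Decoder: per-frame vertex offsets, added to the template to animate it.
        self.vertex_decoder = nn.Linear(hidden, n_vertices * 3)
        self.n_vertices = n_vertices

    def forward(self, mel, template):
        # mel: (batch, n_mels, frames); template: (batch, n_vertices, 3)
        feat = self.audio_encoder(mel)            # (batch, hidden, frames)
        feat = feat.permute(0, 2, 1)              # (batch, frames, hidden)
        offsets = self.vertex_decoder(feat)       # (batch, frames, n_vertices*3)
        offsets = offsets.view(mel.size(0), -1, self.n_vertices, 3)
        return template.unsqueeze(1) + offsets    # animated vertices per frame

# Usage sketch: 100 Mel frames of audio driving a 5023-vertex face template.
model = AudioToMeshRegressor()
mel = torch.randn(1, 80, 100)
template = torch.randn(1, 5023, 3)
animated = model(mel, template)                   # shape (1, 100, 5023, 3)

Image-based pipelines follow the same idea, except the decoder synthesizes 2D frames (typically with a GAN or diffusion generator) instead of vertex offsets.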
Pages: 24
Related Papers
50 records in total
  • [1] Multi-Task Audio-Driven Facial Animation
    Kim, Youngsoo
    An, Shounan
    Jo, Youngbak
    Park, Seungje
    Kang, Shindong
    Oh, Insoo
    Kim, Duke Donghyun
    SIGGRAPH '19 - ACM SIGGRAPH 2019 POSTERS, 2019,
  • [2] Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion
    Karras, Tero
    Aila, Timo
    Laine, Samuli
    Herva, Antti
    Lehtinen, Jaakko
    ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04):
  • [3] VisemeNet: Audio-Driven Animator-Centric Speech Animation
    Zhou, Yang
    Xu, Zhan
    Landreth, Chris
    Kalogerakis, Evangelos
    Maji, Subhransu
    Singh, Karan
    ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04):
  • [4] UniTalker: Scaling up Audio-Driven 3D Facial Animation Through A Unified Model
    Fan, Xiangyu
    Li, Jiaqi
    Lin, Zhiqian
    Xiao, Weiye
    Yang, Lei
    COMPUTER VISION - ECCV 2024, PT XLI, 2025, 15099 : 204 - 221
  • [5] Audio-Driven Violin Performance Animation with Clear Fingering and Bowing
    Hirata, Asuka
    Tanaka, Keitaro
    Hamanaka, Masatoshi
    Morishima, Shigeo
    PROCEEDINGS OF SIGGRAPH 2022 POSTERS, SIGGRAPH 2022, 2022,
  • [6] Audio-driven emotional speech animation for interactive virtual characters
    Charalambous, Constantinos
    Yumak, Zerrin
    van der Stappen, A. Frank
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2019, 30 (3-4)
  • [7] Personalized Audio-Driven 3D Facial Animation via Style-Content Disentanglement
    Chai, Yujin
    Shao, Tianjia
    Weng, Yanlin
    Zhou, Kun
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (03) : 1803 - 1820
  • [8] EmoFace: Audio-driven Emotional 3D Face Animation
    Liu, Chang
    Lin, Qunfen
    Zeng, Zijiao
    Pan, Ye
    2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES, VR 2024, 2024, : 387 - 397
  • [9] Audio2AB: Audio-driven collaborative generation of virtual character animation
    Niu, Lichao
    Xie, Wenjun
    Wang, Dong
    Cao, Zhongrui
    Liu, Xiaoping
    VIRTUAL REALITY AND INTELLIGENT HARDWARE, 2024, 6 (01) : 56 - 70