Audio-Driven Lips and Expression on 3D Human Face

Citations: 0

Authors:

Ma, Le [1,2]

Ma, Zhihao [1,2]

Meng, Weiliang [1,2]

Xu, Shibiao [3]

Zhang, Xiaopeng [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China

Source:

ADVANCES IN COMPUTER GRAPHICS, CGI 2023, PT II | 2024 / Volume 14496

Funding:

National Natural Science Foundation of China;

Keywords:

Face Expression; Lips Movement; Fusion; DATABASE;

DOI:

10.1007/978-3-031-50072-5_2

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Extensive research has delved into audio-driven 3D facial animation with numerous attempts to achieve human-like performance. However, creating truly realistic and expressive 3D facial animations remains a challenging task, as existing methods often struggle to capture the subtle nuances of anthropomorphic expressions. We propose the Audio-Driven Lips and Expression (ADLE) method, specifically designed to generate highly expressive and lifelike conversations between individuals, complete with essential social signals like laughter and excitement, solely based on audio cues. The foundation of our approach lies in the revolutionary audio-expression-consistency strategy, which effectively disentangles person-specific lip movements from dependent facial expressions. As a result, our ADLE robustly learns lip movements and generic expression parameters on a 3D human face from an audio sequence, which represents a powerful multimodal fusion approach capable of generating accurate lip movements paired with vivid facial expressions on a 3D face, all in real time. Experiments validate that our ADLE outperforms other state-of-the-art works in this field, making it a highly promising approach for a wide range of applications.

Pages: 15-26

Page count: 12
