SPEECH-DRIVEN FACIAL ANIMATION USING POLYNOMIAL FUSION OF FEATURES

Cited by: 0

Authors:

Kefalas, Triantafyllos [1]

Vougioukas, Konstantinos [1]

Panagakis, Yannis [2]

Petridis, Stavros [1,3]

Kossaifi, Jean [1,3]

Pantic, Maja [1,3]

Affiliations:

[1] Imperial Coll London, Dept Comp, London, England

[2] Univ Athens, Dept Informat & Telecommun, Athens, Greece

[3] Samsung AI Ctr, Cambridge, England

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK

Keywords:

multiview learning; tensor factorization; deep learning; GAN; audiovisual learning

DOI:

10.1109/icassp40776.2020.9054469

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Speech-driven facial animation involves using a speech signal to generate realistic videos of talking faces. Recent deep learning approaches to facial synthesis rely on extracting low-dimensional representations and concatenating them, followed by a decoding step of the concatenated vector. This accounts for only first-order interactions of the features and ignores higher-order interactions. In this paper we propose a polynomial fusion layer that models the joint representation of the encodings by a higher-order polynomial, with the parameters modelled by a tensor decomposition. We demonstrate the suitability of this approach through experiments on generated videos evaluated on a range of metrics on video quality, audiovisual synchronisation and generation of blinks.
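
The contrast the abstract draws between concatenation and polynomial fusion can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the polynomial order, the decomposition type, or the layer shapes, so the second-order polynomial, the CP (rank-R) factorisation, and every name below (polynomial_fusion, z_audio, z_identity, W1, U, V, W) are assumptions made for illustration only.

```python
import numpy as np

def polynomial_fusion(z_audio, z_identity, b, W1, U, V, W):
    """Second-order polynomial fusion of two encodings (illustrative sketch).

    Assumed form: x = b + W1 z + T(z, z), where z is the concatenation of
    the encodings and the third-order weight tensor T is never built; it is
    represented by CP factors, T[o, i, j] = sum_r U[o, r] V[i, r] W[j, r].
    """
    z = np.concatenate([z_audio, z_identity])
    # First-order term: what a linear layer on the concatenated vector gives.
    first_order = W1 @ z
    # Second-order term, evaluated through the CP factors.
    second_order = U @ ((V.T @ z) * (W.T @ z))
    return b + first_order + second_order

# Toy dimensions (hypothetical, chosen only to keep the example small).
rng = np.random.default_rng(0)
d_audio, d_identity, out_dim, rank = 4, 3, 5, 2
d = d_audio + d_identity

z_audio = rng.standard_normal(d_audio)
z_identity = rng.standard_normal(d_identity)
b = rng.standard_normal(out_dim)
W1 = rng.standard_normal((out_dim, d))
U = rng.standard_normal((out_dim, rank))
V = rng.standard_normal((d, rank))
W = rng.standard_normal((d, rank))

fused = polynomial_fusion(z_audio, z_identity, b, W1, U, V, W)

# Cross-check against the explicit (unfactorised) weight tensor.
z = np.concatenate([z_audio, z_identity])
T = np.einsum("or,ir,jr->oij", U, V, W)
explicit = b + W1 @ z + np.einsum("oij,i,j->o", T, z, z)
```

In this sketch the factorised second-order term stores out_dim*R + 2*d*R parameters instead of the out_dim*d*d entries of the full tensor, which is one common motivation for parameterising higher-order terms with a tensor decomposition.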


Pages: 3487-3491

Number of pages: 5
