Parallel and High-Fidelity Text-to-Lip Generation

Citations: 0
Authors
Liu, Jinglin [1]
Zhu, Zhiying [1]
Ren, Yi [1]
Huang, Wencan [1]
Huai, Baoxing [2]
Yuan, Nicholas [2]
Zhao, Zhou [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[2] Huawei Cloud, Hong Kong, Peoples R China
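Funding
Natural Science Foundation of Zhejiang Province; National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405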
Abstract
As a key component of talking face generation, lip movement generation determines the naturalness and coherence of the generated talking face video. Prior literature focuses mainly on speech-to-lip generation, while text-to-lip (T2L) generation has received little attention. T2L is a challenging task, and existing end-to-end works depend on the attention mechanism and autoregressive (AR) decoding. However, AR decoding generates the current lip frame conditioned on previously generated frames, which inherently limits inference speed and also degrades the quality of the generated lip frames through error propagation. This motivates research on parallel T2L generation. In this work, we propose ParaLip, a parallel decoding model for fast and high-fidelity text-to-lip generation. Specifically, we predict the duration of the encoded linguistic features and model the target lip frames conditioned on the encoded linguistic features and their durations in a non-autoregressive manner. Furthermore, we incorporate the structural similarity index loss and adversarial learning to improve the perceptual quality of the generated lip frames and alleviate the blurry prediction problem. Extensive experiments on the GRID and TCD-TIMIT datasets demonstrate the superiority of the proposed method.
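The duration-based expansion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe. It only shows the general idea of repeating each encoded linguistic feature for its predicted number of frames, so that all frame-level inputs exist up front and can be decoded in parallel. The function name `expand_by_duration` and the toy values are hypothetical.

```python
from typing import List, Sequence


def expand_by_duration(
    features: Sequence[Sequence[float]],
    durations: Sequence[int],
) -> List[List[float]]:
    """Repeat the i-th token feature durations[i] times, in order.

    Hypothetical helper: the result has sum(durations) frame-level
    vectors, one per target lip frame.
    """
    if len(features) != len(durations):
        raise ValueError("features and durations must have the same length")
    frames: List[List[float]] = []
    for feat, dur in zip(features, durations):
        if dur < 0:
            raise ValueError("durations must be non-negative")
        # Copy the feature so each frame owns an independent vector.
        frames.extend(list(feat) for _ in range(dur))
    return frames


# Toy input (assumed values): three encoded tokens with 2-dim features.
encoded = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
predicted_durations = [2, 0, 3]  # a token may map to zero frames

frame_inputs = expand_by_duration(encoded, predicted_durations)
print(len(frame_inputs))  # 5 frame-level inputs (2 + 0 + 3)
```

Because the expanded sequence does not depend on any previously generated frame, a decoder could process every position at once, which is the property that distinguishes non-autoregressive decoding from the AR decoding discussed above.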
Pages: 1738-1746
Page count: 9