Sample-based synthesis of photo-realistic talking heads

Cited by: 47
Authors
Cosatto, E [1]
Graf, HP [1]
Affiliation
[1] AT&T Bell Labs, Res, Red Bank, NJ 07701 USA
Keywords
talking-head synthesis; sample-based synthesis; photo-realistic rendering; face recognition and location; sample-based coarticulation
DOI
10.1109/CA.1998.681914
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a system that generates photo-realistic video animations of talking heads. First the system derives head models from existing video footage using image recognition techniques. It locates, extracts and labels facial parts such as mouth, eyes, and eyebrows into a compact library. Then, using these face models and a text-to-speech synthesizer, it synthesizes new video sequences of the head where the lips are in synchrony with the accompanying soundtrack. Emotional cues and conversational signals are produced by combining head movements, raising eyebrows, wide open eyes, etc. with the mouth animation. For these animations to be believable, care has to be taken aligning the facial parts so that they blend smoothly into each other and produce seamless animations. Our system uses precise multi-channel facial recognition techniques to track facial parts, and it derives the exact 3D position of the head, enabling the automatic extraction of normalized face parts. Such talking-head animations are useful because they generally increase intelligibility of the human-machine interface in applications where content needs to be narrated to the user, such as educative software.
Pages: 103-110
Number of pages: 8
Related papers
50 in total
  • [21] Visual search in photo-realistic scenes
    Mingolla, E.
    Cunningham, R. K.
    Beck, J.
    PERCEPTION, 1998, 27 : 65 - 65
  • [22] Photo-realistic simulation and rendering of halos
    Gonzato, JC
    Marchand, S
    WSCG '2001: SHORT COMMUNICATIONS AND POSTERS, 2001, : SH106 - SH113
  • [23] Ray-based creation of photo-realistic virtual world
    Naemura, T
    Takano, T
    Kaneko, M
    Harashima, H
    INTERNATIONAL CONFERENCE ON VIRTUAL SYSTEMS AND MULTIMEDIA - VSMM'97, PROCEEDINGS, 1997, : 59 - 68
  • [24] Photo-realistic image synthesis from lines and appearance with modular modulation
    Luo, Wuyang
    Yang, Su
    Zhang, Weishan
    NEUROCOMPUTING, 2022, 503 : 81 - 91
  • [25] PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping
    Lin, Luoyang
    Jiang, Zutao
    Liang, Xiaodan
    Ma, Liqian
    Kampffmeyer, Michael C.
    Cao, Xiaochun
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 3441 - 3449
  • [26] Synthesis of Photo-Realistic Facial Animation from Text Based on HMM and DNN with Animation Unit
    Sato, Kazuki
    Nose, Takashi
    Ito, Akinori
    ADVANCES IN INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, VOL 2, 2017, 64 : 29 - 36
  • [27] HIGH QUALITY LIP-SYNC ANIMATION FOR 3D PHOTO-REALISTIC TALKING HEAD
    Wang, Lijuan
    Han, Wei
    Soong, Frank K.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4529 - 4532
  • [28] Toward Photo-Realistic Facial Animation Generation Based on keypoint Features
    Shu, Zikai
    Ito, Akinori
    Nose, Takashi
    2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024, 2024, : 334 - 339
  • [29] Method of estimation of the definition in photo-realistic images
    Say, SV
    SIBCON-2005: IEEE International Siberian Conference on Control and Communications, 2005, : 175 - 177
  • [30] GarmentGAN: Photo-realistic Adversarial Fashion Transfer
    Raffiee, Amir Hossein
    Sollami, Michael
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3923 - 3930