Sample-based synthesis of photo-realistic talking heads

Cited by: 47
Authors
Cosatto, E [1]
Graf, HP [1]
Affiliation
[1] AT&T Bell Labs, Res, Red Bank, NJ 07701 USA
Keywords
talking-head synthesis; sample-based synthesis; photo-realistic rendering; face recognition and location; sample-based coarticulation
DOI
10.1109/CA.1998.681914
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a system that generates photo-realistic video animations of talking heads. First the system derives head models from existing video footage using image recognition techniques. It locates, extracts and labels facial parts such as mouth, eyes, and eyebrows into a compact library. Then, using these face models and a text-to-speech synthesizer, it synthesizes new video sequences of the head where the lips are in synchrony with the accompanying soundtrack. Emotional cues and conversational signals are produced by combining head movements, raising eyebrows, wide open eyes, etc. with the mouth animation. For these animations to be believable, care has to be taken aligning the facial parts so that they blend smoothly into each other and produce seamless animations. Our system uses precise multi-channel facial recognition techniques to track facial parts, and it derives the exact 3D position of the head, enabling the automatic extraction of normalized face parts. Such talking-head animations are useful because they generally increase intelligibility of the human-machine interface in applications where content needs to be narrated to the user, such as educative software.
Pages: 103-110
Page count: 8