Comparison of phoneme and viseme based acoustic units for speech driven realistic lip animation

Cited by: 0
Authors
Bozkurt, Elif [1 ]
Erdem, Cigdem Eroglu [1 ]
Erzin, Engin [2 ]
Erdem, T. [1 ]
Oezkan, Mehmet [1 ]
Affiliations
[1] TUBITAK MAM TEKSEB, A-205, Gebze, Kocaeli, Turkey
[2] Koc Univ, Dept Elect & Elect Engn, Istanbul, Turkey
Keywords
DOI: none available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
Natural-looking lip animation, synchronized with incoming speech, is essential for realistic character animation. In this work, we evaluate the performance of phoneme- and viseme-based acoustic units, with and without context information, for generating realistic lip synchronization using HMM-based recognition systems. Objective evaluations show that viseme-based units with context information outperform the other methods.
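The abstract contrasts phoneme- and viseme-based acoustic units. The key idea behind viseme units is a many-to-one mapping: phonemes that look identical on the lips (e.g. the bilabials /p/, /b/, /m/) collapse into a single visual class. A minimal sketch of such a mapping is below; the grouping is a common illustrative one, not the mapping used in the paper.

```python
# Illustrative many-to-one phoneme -> viseme mapping, as commonly used when
# building viseme-based units for lip animation. Groupings are assumptions
# for demonstration, not the paper's actual inventory.
PHONEME_TO_VISEME = {
    # bilabials are visually indistinguishable on the lips
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    # labiodentals (lower lip against upper teeth)
    "f": "labiodental", "v": "labiodental",
    # alveolars
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
    # rounded vowels
    "uw": "rounded", "ow": "rounded",
}

def phonemes_to_visemes(phonemes):
    """Collapse a phoneme sequence into viseme labels; phonemes outside
    the table fall back to a generic 'neutral' class."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

print(phonemes_to_visemes(["p", "uw", "t"]))  # ['bilabial', 'rounded', 'alveolar']
```

Because several phonemes share one viseme, a viseme-unit inventory is smaller than a phoneme inventory, which gives each HMM more training data per unit; context-dependent variants (analogous to triphones) then recover coarticulation effects.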
Pages: 422 / +
Page count: 2