A hybrid approach for automatic lip localization and viseme classification to enhance visual speech recognition

Cited by: 0
Institution: Multimedia Information Systems and Advanced Computing Laboratory, High Institute of Computer Science and Multimedia, University of Sfax, Sfax, Tunisia
Source: Integrated Computer-Aided Engineering, 2008, 15(3): 253-266
Keywords: Extraction; Audition; Speech recognition
DOI: 10.3233/ica-2008-15305
Abstract:
An automatic lip-reading system is among the assistive technologies for hearing-impaired or elderly people: one can imagine, for example, a dependent person commanding a machine with a simple lip movement or by pronouncing a single viseme (visual phoneme). A lip-reading system decomposes into three subsystems: a lip localization subsystem, a feature extraction subsystem, and a classification subsystem that maps feature vectors to visemes. The major difficulty in a lip-reading system is the extraction of the visual speech descriptors; this requires automatic localization and tracking of the labial gestures. We present in this paper a new automatic approach for localizing lip points of interest (POI) and extracting features on a speaker's face, based on mouth color information and a geometrical model of the lips. The extracted visual information is then classified in order to recognize the uttered viseme. We have developed our Automatic Lip Feature Extraction (ALiFE) prototype and evaluated it for multiple speakers under natural conditions. Experiments cover a group of French visemes uttered by different speakers. Results show that our system recognizes 94.64% of the tested French visemes. © 2008 IOS Press and the author(s). All rights reserved.
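The abstract does not give the authors' exact geometric lip model, but color-based mouth localization of the kind it describes is commonly sketched with a pseudo-hue map, which exploits the fact that lips are redder than surrounding skin. The sketch below is an illustrative assumption, not the paper's implementation: the function names, the 0.6 threshold, and the synthetic image are all hypothetical.

```python
import numpy as np

def pseudo_hue(rgb):
    """Pseudo-hue R/(R+G): higher on lips (red-dominant) than on skin."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    return r / (r + g + 1e-6)  # epsilon avoids division by zero

def locate_mouth(rgb, thresh=0.6):
    """Return a bounding box (top, left, bottom, right) of the
    red-dominant pixels, a crude stand-in for lip POI localization."""
    mask = pseudo_hue(rgb) > thresh
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())

# Synthetic 100x100 face patch: skin-like background, redder "lips" block.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[...] = (200, 160, 140)          # skin: R/(R+G) ~ 0.56, below threshold
img[60:75, 30:70] = (180, 60, 60)   # lips: R/(R+G) = 0.75, above threshold
print(locate_mouth(img))            # → (60, 30, 74, 69)
```

In a full pipeline such as the one the abstract outlines, a box like this would seed the fitting of a geometric lip model, whose parameters then serve as visual features for viseme classification.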