A hybrid approach for automatic lip localization and viseme classification to enhance visual speech recognition

Cited by: 0
Institution: Multimedia Information Systems and Advanced Computing Laboratory, High Institute of Computer Science and Multimedia, University of Sfax, Sfax, Tunisia
Source: Integrated Computer-Aided Engineering, 2008, 15(3): 253-266
Keywords: Extraction; Audition; Speech recognition
DOI: 10.3233/ica-2008-15305
Abstract:
An automatic lip-reading system is among the assistive technologies for hearing-impaired or elderly people: one can imagine, for example, a dependent person commanding a machine with a simple lip movement or by pronouncing a single viseme (visual phoneme). A lip-reading system decomposes into three subsystems: a lip localization subsystem, a feature extraction subsystem, and a classification subsystem that maps feature vectors to visemes. The major difficulty in a lip-reading system is the extraction of the visual speech descriptors; this requires automatic localization and tracking of the labial gestures. We present in this paper a new automatic approach for localizing lip points of interest (POI) and extracting features on a speaker's face, based on mouth color information and a geometrical model of the lips. The extracted visual information is then classified in order to recognize the uttered viseme. We have developed our Automatic Lip Feature Extraction (ALiFE) prototype and evaluated it for multiple speakers under natural conditions. Experiments cover a group of French visemes uttered by different speakers. Results show that our system recognizes 94.64% of the tested French visemes. © 2008 IOS Press and the author(s). All rights reserved.
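The abstract does not give the authors' exact geometric lip model, but color-based mouth localization of the kind it describes is commonly sketched with a pseudo-hue map, which exploits the fact that lips are redder than surrounding skin. The sketch below is an illustrative assumption, not the paper's implementation: the function names, the 0.6 threshold, and the synthetic image are all hypothetical.

```python
import numpy as np

def pseudo_hue(rgb):
    """Pseudo-hue R/(R+G): higher on lips (red-dominant) than on skin."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    return r / (r + g + 1e-6)  # epsilon avoids division by zero

def locate_mouth(rgb, thresh=0.6):
    """Return a bounding box (top, left, bottom, right) of the
    red-dominant pixels, a crude stand-in for lip POI localization."""
    mask = pseudo_hue(rgb) > thresh
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())

# Synthetic 100x100 face patch: skin-like background, redder "lips" block.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[...] = (200, 160, 140)          # skin: R/(R+G) ~ 0.56, below threshold
img[60:75, 30:70] = (180, 60, 60)   # lips: R/(R+G) = 0.75, above threshold
print(locate_mouth(img))            # → (60, 30, 74, 69)
```

In a full pipeline such as the one the abstract outlines, a box like this would seed the fitting of a geometric lip model, whose parameters then serve as visual features for viseme classification.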