A hybrid approach for automatic lip localization and viseme classification to enhance visual speech recognition

Authors
Mahdi, Walid; Werda, Salah; Ben Hamadou, Abdelmajid
Affiliation
[1] Multimedia Information Systems and Advanced Computing Laboratory, High Institute of Computer Science and Multimedia, University of Sfax, Sfax, Tunisia
Source
Integr. Comput. Aided Eng. | 2008 / Vol. 15 / No. 3 / pp. 253-266
Keywords
Extraction - Audition - Speech recognition
DOI
10.3233/ica-2008-15305
Abstract
An automatic lip-reading system is among the assistive technologies for hearing-impaired or elderly people. We can imagine, for example, a dependent person commanding a machine with a simple lip movement or by pronouncing a simple viseme (visual phoneme). A lip-reading system is decomposed into three subsystems: a lip localization subsystem, a feature extraction subsystem, and a classification subsystem that maps feature vectors to visemes. The major difficulty in a lip-reading system is the extraction of the visual speech descriptors, because this task requires automatic localization and tracking of the labial gestures. In this paper, we present a new automatic approach for lip POI localization and feature extraction on a speaker's face, based on mouth color information and a geometrical model of the lips. The extracted visual information is then classified in order to recognize the uttered viseme. We have developed a prototype, Automatic Lip Feature Extraction (ALiFE), and evaluated it on multiple speakers under natural conditions. The experiments cover a group of French visemes uttered by different speakers. The results show that our system recognizes 94.64% of the tested French visemes. © 2008 - IOS Press and the author(s). All rights reserved.
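The three-stage decomposition described in the abstract (localization, feature extraction, classification) can be illustrated with a toy sketch. This is not the ALiFE method: the abstract does not give its color rule, geometric model, or classifier. The pseudo-hue threshold R/(R+G), the bounding-box features, the nearest-prototype classifier, and the names `lip_mask`, `lip_features`, `classify`, "spread", and "rounded" are all assumptions chosen only to make the pipeline concrete.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # rows of pixels


def lip_mask(image: Image, threshold: float = 0.6) -> List[List[bool]]:
    """Stage 1 (assumed rule): flag pixels whose pseudo-hue R/(R+G) exceeds a threshold."""
    return [
        [(r + g) > 0 and r / (r + g) > threshold for (r, g, _b) in row]
        for row in image
    ]


def lip_features(mask: List[List[bool]]) -> Tuple[int, int]:
    """Stage 2 (assumed features): width and height of the mask's bounding box."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return (0, 0)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def classify(features: Tuple[float, ...], prototypes: Dict[str, Tuple[float, ...]]) -> str:
    """Stage 3 (assumed classifier): label of the nearest prototype vector."""
    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, prototypes[label]))
    return min(prototypes, key=squared_distance)


# Synthetic 6x4 frame: skin-colored background with a 4x2 lip-colored patch.
SKIN, LIP = (200, 150, 120), (200, 80, 90)
frame = [[SKIN] * 6 for _ in range(4)]
for y in (1, 2):
    for x in range(1, 5):
        frame[y][x] = LIP

features = lip_features(lip_mask(frame))
# Hypothetical prototype vectors (width, height); the labels are illustrative only.
viseme = classify(features, {"spread": (4.0, 1.0), "rounded": (2.0, 2.0)})
```

Keeping each stage as a separate function mirrors the subsystem split in the abstract, so any one stage could be replaced (for example, by a tracked geometric lip model) without touching the others.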
Related papers
50 in total
  • [1] A hybrid approach for automatic lip localization and viseme classification to enhance visual speech recognition
    Mahdi, Walid
    Werda, Salah
    Ben Hamadou, Abdelmajid
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2008, 15 (03) : 253 - 266
  • [2] Automatic Viseme Vocabulary Construction to Enhance Continuous Lip-reading
    Fernandez-Lopez, Adriana
    Sukno, Federico M.
    PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2017), VOL 5, 2017, : 52 - 63
  • [3] Lip temporal pattern analysis for automatic visual speech recognition
    Xie, L
    Cai, XL
    Fu, ZH
    Jiang, DM
    Zhao, RC
    2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 703 - 706
  • [4] A Phone-Viseme Dynamic Bayesian Network for Audio-Visual Automatic Speech Recognition
    Terry, Louis
    Katsaggelos, Aggelos K.
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 2597 - 2600
  • [5] VISEME DEFINITIONS COMPARISON FOR VISUAL-ONLY SPEECH RECOGNITION
    Cappelletta, Luca
    Harte, Naomi
    19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011, : 2109 - 2118
  • [6] Persian Viseme Classification for Developing Visual Speech Training Application
    Bastanfard, Azam
    Aghaahmadi, Mohammad
    Kelishami, Alireza Abdi
    Fazel, Maryam
    Moghadam, Maedeh
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2009, 2009, 5879 : 1080 - +
  • [7] Lip Localization Technique Towards an Automatic Lip Reading Approach for Myanmar Consonants Recognition
    Thein, Thein
    San, Kalyar Myo
    CONFERENCE PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON INFORMATION AND COMPUTER TECHNOLOGIES (ICICT), 2018, : 123 - 127
  • [8] A hybrid approach to improving automatic speech recognition via NLP
    Voll, Kimberly
    Advances in Artificial Intelligence, 2007, 4509 : 514 - 525
  • [9] A Hybrid HMM/ANN Approach for Automatic Gujarati Speech Recognition
    Valaki, Sanjay
    Jethva, Harikrishna
    2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017,
  • [10] Investigating a Hybrid Learning Approach for Robust Automatic Speech Recognition
    Pironkov, Gueorgui
    Wood, Sean U. N.
    Dupont, Stephane
    Dutoit, Thierry
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2018, 2018, 11171 : 67 - 78