Visual features extracting & selecting for lipreading

Cited by: 0
Authors
Yao, HX [1]
Gao, W
Shan, W
Xu, MH
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Engn, Harbin 150001, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100080, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a method for effectively selecting and extracting visual features for lipreading. The features are drawn from both low-level and high-level representations, which complement each other. A 41-dimensional feature vector is used for recognition. Tested on AVCC, a bimodal database of sentences covering all Chinese pronunciations, lipreading assistance raises automatic speech recognition accuracy from 84.1% to 87.8%. Under noisy conditions, accuracy improves by 19.5 percentage points (from 31.7% to 51.2%) in the speaker-dependent case and by 27.7 percentage points (from 27.6% to 55.3%) in the speaker-independent case. The paper also shows that visual speech information can effectively compensate for the loss of acoustic information: in our system the recognition rate improves by 10% to 30%, depending on the amount of noise in the speech signal, a larger improvement than that of IBM's ASR system. The system also performs better in noisy environments.
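The gains quoted in the abstract are absolute differences in accuracy (percentage points), not relative improvements. A minimal sketch that recomputes them from the figures above; the function name `point_gain` and the condition labels are illustrative assumptions, not names from the paper:

```python
def point_gain(baseline: float, improved: float) -> float:
    """Absolute accuracy gain in percentage points, rounded to one decimal."""
    return round(improved - baseline, 1)


# (accuracy without lipreading %, accuracy with lipreading %), as quoted in the abstract.
# The dictionary keys are illustrative labels, not terms defined by the paper.
results = {
    "ASR, lipreading-assisted": (84.1, 87.8),
    "noisy, speaker-dependent": (31.7, 51.2),
    "noisy, speaker-independent": (27.6, 55.3),
}

gains = {name: point_gain(before, after) for name, (before, after) in results.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

The two noisy-condition gains reproduce the 19.5 and 27.7 figures stated in the abstract, and both fall within the 10% to 30% range it reports.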
Pages: 251-259
Page count: 9