Visual features extracting & selecting for lipreading

Cited by: 0
Authors
Yao, HX [1]
Gao, W
Shan, W
Xu, MH
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci & Engn, Harbin 150001, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100080, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a method for effectively selecting and extracting visual features for lipreading. The features are drawn from both the low level and the high level, which complement each other, and together they form a 41-dimensional feature vector used for recognition. Tested on the bimodal database AVCC, which consists of sentences covering all Chinese pronunciations, lipreading assistance raises automatic speech recognition accuracy from 84.1% to 87.8%. For speech recognition under noisy conditions, accuracy improves by 19.5 percentage points (from 31.7% to 51.2%) in the speaker-dependent case and by 27.7 percentage points (from 27.6% to 55.3%) in the speaker-independent case. The paper also shows that visual speech information can effectively compensate for the loss of acoustic information: in the authors' system the recognition rate improves by 10% to 30%, depending on the amount of noise in the speech signal. This range of improvement is higher than that of IBM's ASR system, and the system performs better in noisy environments.
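The improvement figures quoted in the abstract are absolute gains in percentage points. The short Python sketch below recomputes them from the before/after accuracies given above. The function names and the relative error reduction are illustrative additions, not quantities reported by the paper.

```python
# Accuracies (%) as quoted in the abstract: (without, with) visual features.
reported = {
    "clean speech": (84.1, 87.8),
    "noisy, speaker-dependent": (31.7, 51.2),
    "noisy, speaker-independent": (27.6, 55.3),
}


def absolute_gain(before, after):
    """Accuracy gain in percentage points, rounded to one decimal place."""
    return round(after - before, 1)


def relative_error_reduction(before, after):
    """Fraction of the original error rate that was removed.

    This is a derived figure for illustration; the paper does not report it.
    """
    return (after - before) / (100.0 - before)


for condition, (before, after) in reported.items():
    gain = absolute_gain(before, after)
    reduction = relative_error_reduction(before, after)
    print(f"{condition}: +{gain} points, {reduction:.1%} relative error reduction")
```

Reading the gains as percentage points rather than relative percentages is what makes the quoted 19.5 and 27.7 figures consistent with the before/after accuracies.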
Pages: 251-259
Number of pages: 9