Visual-speech-pass filtering for robust automatic lip-reading

Author
Jong-Seok Lee
Affiliation
[1] Yonsei University, School of Integrated Technology
Keywords
Automatic lip-reading; Visual-speech-pass filtering (VSPF); Feature extraction; Temporal filtering; Noise-robustness
Abstract
This paper proposes visual-speech-pass filtering, a temporal filtering technique applied during visual feature extraction to improve the robustness of automatic lip-reading. A band-pass filter is applied to the pixel value sequence of the images containing the speaker's lip region to remove variations that are not relevant to the speech information. The filter is designed based on psychological, spectral, and experimental analyses. Experimental results on two speaker-independent and one speaker-dependent recognition tasks demonstrate that the proposed technique significantly improves recognition performance in both clean and visually noisy conditions.
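The core idea in the abstract, band-pass filtering each pixel's value sequence over time, can be illustrated with a minimal sketch. This is not the paper's filter: the abstract does not give the pass band or the filter design, so the function name `temporal_bandpass`, the ideal FFT mask, the 30 fps frame rate, and the 1 to 10 Hz band edges are all assumptions chosen only for illustration.

```python
import numpy as np


def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Band-pass filter every pixel's value sequence along the time axis.

    frames  : array of shape (T, H, W), grayscale lip-region images over time
    fps     : frame rate in Hz
    low_hz, high_hz : pass-band edges (hypothetical values, not from the paper)

    Uses an ideal FFT mask as a stand-in for a properly designed filter.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    # Spectrum of each pixel's temporal trajectory.
    spectrum = np.fft.rfft(frames, axis=0)
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / fps)
    # Zero every frequency bin outside the pass band (this includes DC).
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0
    return np.fft.irfft(spectrum, n=n_frames, axis=0)


# Synthetic 2-second clip at an assumed 30 fps with 2x2-pixel frames.
fps = 30
t = np.arange(60) / fps
speech = 10.0 * np.sin(2 * np.pi * 4 * t)    # slow, speech-like variation
flicker = 5.0 * np.sin(2 * np.pi * 14 * t)   # fast variation unrelated to speech
pixel = 100.0 + speech + flicker             # constant brightness offset plus both
frames = pixel[:, None, None] * np.ones((1, 2, 2))

filtered = temporal_bandpass(frames, fps, low_hz=1.0, high_hz=10.0)
```

In this toy clip the constant brightness offset and the 14 Hz component fall outside the assumed band, so only the 4 Hz component remains in `filtered`. A practical system would more likely use a designed FIR or IIR filter whose band edges come from the kind of analyses the abstract mentions.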
Pages: 611-621
Number of pages: 10
Related papers
50 in total
  • [21] MODULAR BDPCA BASED VISUAL FEATURE REPRESENTATION FOR LIP-READING
    Wu, Guanyong
    Zhu, Jie
    2008 15TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-5, 2008, : 1328 - 1331
  • [22] AI-based visual speech recognition towards realistic avatars and lip-reading applications in the metaverse
    Li, Ying
    Hashim, Ahmad Sobri
    Lin, Yun
    Nohuddin, Puteri N. E.
    Venkatachalam, K.
    Ahmadian, Ali
    APPLIED SOFT COMPUTING, 2024, 164
  • [23] Integrating Lip-Reading and Thai Speech to Control Electronic Devices in a Vehicle
    Masamae, Isamail
    Chaikan, Panyayot
    2015 5TH IEEE INTERNATIONAL CONFERENCE ON SYSTEM ENGINEERING AND TECHNOLOGY (ICSET), 2015, : 29 - 32
  • [24] FRENCH LIP-READING AND CUED-SPEECH TRAINING BY INTERACTIVE VIDEO
    VANDENBEMDEN, G
    DUFOUR, P
    MARCO, C
    JOURNAL OF MICROCOMPUTER APPLICATIONS, 1990, 13 (02): : 193 - 200
  • [25] Multi-view Automatic Lip-Reading Using Neural Network
    Lee, Daehyun
    Lee, Jongmin
    Kim, Kee-Eung
    COMPUTER VISION - ACCV 2016 WORKSHOPS, PT II, 2017, 10117 : 290 - 302
  • [26] The neural basis of lip-reading capabilities is altered by early visual deprivation
    Putzar, Lisa
    Goerendt, Ines
    Heed, Tobias
    Richard, Gisbert
    Buechel, Christian
    Roeder, Brigitte
    NEUROPSYCHOLOGIA, 2010, 48 (07) : 2158 - 2166
  • [27] Improving the Performance of Automatic Lip-Reading Using Image Conversion Techniques
    Lee, Ki-Seung
    ELECTRONICS, 2024, 13 (06)
  • [28] A PCA based visual DCT feature extraction method for lip-reading
    Hong, Xiaopeng
    Yao, Hongxun
    Wan, Yuqi
    Chen, Rong
    IIH-MSP: 2006 INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, PROCEEDINGS, 2006, : 321 - +
  • [29] Improved skin/lip-color modeling method for visual-only lip-reading
    Wang, Xiaoping
    Fu, Degang
    Yuan, Chunwei
    PROGRESS ON POST-GENOME TECHNOLOGIES, 2007, : 176 - 178
  • [30] Mobile Device-based Speech Enhancement System Using Lip-reading
    Matsunaga, Yuta
    Matsui, Kenji
    2018 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN ENGINEERING AND TECHNOLOGY (IICAIET), 2018, : 13 - 16