Affine-invariant visual features contain supplementary information to enhance speech recognition

Cited by: 0
Authors
Gurbuz, S [1]
Patterson, E [1]
Tufekci, Z [1]
Gowdy, JN [1]
Affiliations
[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the supplementary information that is available from video features.
Pages: 175-181
Page count: 7
Related papers
50 in total
  • [21] Novel affine-invariant curve descriptor for curve matching and occluded object recognition
    Fu, Huijing
    Tian, Zheng
    Ran, Maohua
    Fan, Ming
    IET COMPUTER VISION, 2013, 7 (04) : 279 - 292
  • [22] Black-box attacks on face recognition via affine-invariant training
    Bowen Sun
    Hang Su
    Shibao Zheng
    Neural Computing and Applications, 2024, 36 : 8549 - 8564
  • [23] Aircraft recognition based on affine invariant features
    Xu, Shugong
    Sang, Nong
    Zhang, Jiansen
    Huang, Zailu
    Huazhong Ligong Daxue Xuebao/Journal Huazhong (Central China) University of Science and Technology, 1995, 23 (10):
  • [24] Independent information from visual features for multimodal speech recognition
    Gurbuz, S
    Tufekci, Z
    Patterson, E
    Gowdy, JN
    IEEE SOUTHEASTCON 2001: ENGINEERING THE FUTURE, PROCEEDINGS, 2001, : 221 - 228
  • [25] New area matrix-based affine-invariant shape features and similarity metrics
    Dionisio, Carlos R. R.
    Kim, Hae Yong
    2006 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO - ICME 2006, VOLS 1-5, PROCEEDINGS, 2006, : 1725 - +
  • [26] Affine Invariant Visual Phrases for Object Instance Recognition
    Patraucean, Viorica
    Ovsjanikov, Maks
    2015 14TH IAPR INTERNATIONAL CONFERENCE ON MACHINE VISION APPLICATIONS (MVA), 2015, : 14 - 17
  • [27] Dynamic Affine-Invariant Shape-Appearance Handshape Features and Classification in Sign Language Videos
    Roussos, Anastasios
    Theodorakis, Stavros
    Pitsikalis, Vassilis
    Maragos, Petros
    JOURNAL OF MACHINE LEARNING RESEARCH, 2013, 14 : 1627 - 1663
  • [28] Robust speech recognition by extracting invariant features
    Eskikand, Parvin Zarei
    Seyyedsalehi, Seyyed Ali
    4TH INTERNATIONAL CONFERENCE OF COGNITIVE SCIENCE, 2012, 32 : 230 - 237
  • [29] Gaussian kernels for affine-invariant iconic representation and object recognition by multi-dimensional indexing
    BenArie, J
    Wang, ZQ
    Rao, KR
    VISUAL COMMUNICATIONS AND IMAGE PROCESSING '96, 1996, 2727 : 156 - 167
  • [30] Visual speech information for face recognition
    Lawrence D. Rosenblum
    Deborah A. Yakel
    Naser Baseer
    Anjani Panchal
    Brynn C. Nodarse
    Ryan P. Niehus
    Perception & Psychophysics, 2002, 64 : 220 - 229