Affine-invariant visual features contain supplementary information to enhance speech recognition

Cited by: 0
Authors
Gurbuz, S [1]
Patterson, E [1]
Tufekci, Z [1]
Gowdy, JN [1]
Affiliation
[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the supplementary information that is available from video features.
Pages: 175-181
Page count: 7