Affine-invariant visual features contain supplementary information to enhance speech recognition

Cited by: 0

Authors:

Gurbuz, S [1]

Patterson, E [1]

Tufekci, Z [1]

Gowdy, JN [1]

Affiliation:

[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA

Source:

AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS | 2001, Vol. 2091

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the supplementary information that is available from video features.

Pages: 175-181

Page count: 7
