Distinctive feature fusion for improved audio-visual phoneme recognition

Authors
Lewis, T [1]
Powers, D [1]
Affiliations
[1] Flinders Univ S Australia, Sch Informat & Engn, Adelaide, SA 5001, Australia
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Auditory and visual signals provide complementary information, but few applications successfully combine the two sources. We consider a distinctive feature approach to Audio-Visual Automatic Speech Recognition (AV-ASR) in which features appropriate to each modality are employed, and demonstrate that, in the absence of knowledge about the noise, the modality-specific approach is best. However, even information from the non-preferred modality can be usefully employed if the environmental context (e.g. SNR) is accounted for by adaptively weighting each modality. Future research is focusing on deriving these distinctive features automatically from data rather than using those proposed by linguists.
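The abstract does not say how the adaptive weighting is computed, so the following is only an illustrative sketch of one common way to weight two modalities by SNR, not the authors' method. It assumes each modality yields per-class log-likelihoods and that the audio weight follows a logistic function of the SNR; the function names and the `midpoint_db` and `slope` parameters are hypothetical.

```python
import math


def audio_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Map SNR in dB to an audio weight in (0, 1).

    Assumption: a logistic curve; midpoint_db and slope are illustrative
    values, not taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fuse(audio_loglik, visual_loglik, snr_db):
    """Combine per-class log-likelihoods with an SNR-dependent weight."""
    w = audio_weight(snr_db)
    return {
        label: w * audio_loglik[label] + (1.0 - w) * visual_loglik[label]
        for label in audio_loglik
    }


def classify(audio_loglik, visual_loglik, snr_db):
    """Return the class label with the highest fused score."""
    fused = fuse(audio_loglik, visual_loglik, snr_db)
    return max(fused, key=fused.get)


# Toy scores (made up for illustration): audio prefers "b", video prefers "p".
audio_scores = {"b": -1.0, "p": -3.0}
visual_scores = {"b": -2.5, "p": -2.0}

clean_label = classify(audio_scores, visual_scores, snr_db=30.0)
noisy_label = classify(audio_scores, visual_scores, snr_db=-10.0)
```

With these toy scores, the decision follows the audio stream at high SNR and the visual stream at low SNR, which is the behaviour that adaptive weighting is meant to produce.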
Pages: 62-65
Page count: 4