Audio-visual biometric recognition by vector quantization

被引:3
|
作者
Das, Amitava [1 ]
Ghosh, Prasanta [1 ]
机构
[1] Microsoft Res India, Bangalore, Karnataka, India
关键词
D O I
10.1109/SLT.2006.326843
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present a Vector Quantization based bimodal (speech and face) biometric recognition method which delivers high performance amidst noise, illumination variations and occlusions (disguised mode) while requiring very little training data, memory storage and complexity of operation. A Transform VQ method delivers good face-recognition performance and a Text Dependent VQ method provides good recognition performance using speech. Simple fusion of two leads to a wider separation between the user-clusters in the combined feature space, leading to high performance.
引用
收藏
页码:166 / +
页数:2
相关论文
共 50 条
  • [21] Audio-visual speech recognition based on joint training with audio-visual speech enhancement for robust speech recognition
    Hwang, Jung-Wook
    Park, Jeongkyun
    Park, Rae-Hong
    Park, Hyung-Min
    APPLIED ACOUSTICS, 2023, 211
  • [22] Audio-Visual Learning for Multimodal Emotion Recognition
    Fan, Siyu
    Jing, Jianan
    Wang, Chongwen
    SYMMETRY-BASEL, 2025, 17 (03):
  • [23] Audio-Visual Attention Networks for Emotion Recognition
    Lee, Jiyoung
    Kim, Sunok
    Kim, Seungryong
    Sohn, Kwanghoon
    AVSU'18: PROCEEDINGS OF THE 2018 WORKSHOP ON AUDIO-VISUAL SCENE UNDERSTANDING FOR IMMERSIVE MULTIMEDIA, 2018, : 27 - 32
  • [24] A coupled HMM for audio-visual speech recognition
    Nefian, AV
    Liang, LH
    Pi, XB
    Xiaoxiang, L
    Mao, C
    Murphy, K
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 2013 - 2016
  • [25] Scene recognition with audio-visual sensor fusion
    Devicharan, D
    Mehrotra, KG
    Mohan, CK
    Varshney, PK
    Zuo, L
    Multisensor, Multisource Information Fusion: Architectures, Algorithms and Applications 2005, 2005, 5813 : 201 - 210
  • [26] Speaker independent audio-visual speech recognition
    Zhang, Y
    Levinson, S
    Huang, T
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 1073 - 1076
  • [27] Multifactor fusion for audio-visual speaker recognition
    Chetty, Girija
    Tran, Dat
    LECTURE NOTES IN SIGNAL SCIENCE, INTERNET AND EDUCATION (SSIP'07/MIV'07/DIWEB'07), 2007, : 70 - +
  • [28] An asynchronous DBN for audio-visual speech recognition
    Saenko, Kate
    Livescu, Karen
    2006 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, 2006, : 154 - +
  • [29] Deep operational audio-visual emotion recognition
    Akturk, Kaan
    Keceli, Ali Seydi
    NEUROCOMPUTING, 2024, 588
  • [30] Audio-Visual Emotion Recognition in Video Clips
    Noroozi, Fatemeh
    Marjanovic, Marina
    Njegus, Angelina
    Escalera, Sergio
    Anbarjafari, Gholamreza
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2019, 10 (01) : 60 - 75