VALID: A new practical audio-visual database, and comparative results

Authors
Fox, NA [1]
O'Mullane, BA [1]
Reilly, RB [1]
Affiliations
[1] Univ Coll Dublin, Dept Elect & Elect Engn, Dublin 4, Ireland
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The performance of deployed audio, face, and multi-modal person recognition systems in uncontrolled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust audio, face, and multi-modal person recognition systems, the new large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy "real world" office scenario with no control over illumination or acoustic noise. In this paper we describe the acquisition and content of the VALID database, which consists of five recording sessions of 106 subjects over a period of one month. Speaker identification experiments using visual speech features extracted from the mouth region are reported. The performance on the uncontrolled VALID database is compared with that on the controlled XM2VTS database. The best VALID- and XM2VTS-based accuracies are 63.21% and 97.17%, respectively. This highlights the degrading effect of an uncontrolled illumination environment and the importance of this database for deploying real-world applications. The VALID database is available to the academic community through http://ee.ucdie/validdb/.
Pages: 777-786
Page count: 10