VALID: A new practical audio-visual database, and comparative results

Cited by: 0

Authors:

Fox, NA [1]

O'Mullane, BA [1]

Reilly, RB [1]

Affiliations:

[1] Univ Coll Dublin, Dept Elect & Elect Engn, Dublin 4, Ireland

Source:

AUDIO AND VIDEO BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS | 2005 / Vol. 3546

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The performance of deployed audio, face, and multi-modal person recognition systems in non-controlled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust audio, face, and multi-modal person recognition systems, the new large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy "real world" office scenario with no control over illumination or acoustic noise. In this paper we describe the acquisition and content of the VALID database, which consists of five recording sessions of 106 subjects over a period of one month. Speaker identification experiments using visual speech features extracted from the mouth region are reported. The performance on the uncontrolled VALID database is compared with that on the controlled XM2VTS database. The best VALID and XM2VTS based accuracies are 63.21% and 97.17% respectively. This highlights the degrading effect of an uncontrolled illumination environment and the importance of this database for deploying real world applications. The VALID database is available to the academic community through http://ee.ucd.ie/validdb/.


Pages: 777-786

Page count: 10

