VALID: A new practical audio-visual database, and comparative results

Cited by: 0

Authors:

Fox, NA [1]

O'Mullane, BA [1]

Reilly, RB [1]

Affiliations:

[1] Univ Coll Dublin, Dept Elect & Elect Engn, Dublin 4, Ireland

Source:

AUDIO AND VIDEO BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS | 2005 / Vol. 3546

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The performance of deployed audio, face, and multi-modal person recognition systems in non-controlled scenarios is typically lower than that of systems developed in highly controlled environments. To facilitate the development of robust audio, face, and multi-modal person recognition systems, the new large and realistic multi-modal (audio-visual) VALID database was acquired in a noisy "real world" office scenario with no control over illumination or acoustic noise. In this paper we describe the acquisition and content of the VALID database, which consists of five recording sessions of 106 subjects over a period of one month. Speaker identification experiments using visual speech features extracted from the mouth region are reported. The performance on the uncontrolled VALID database is compared with that on the controlled XM2VTS database. The best VALID and XM2VTS based accuracies are 63.21% and 97.17% respectively. This highlights the degrading effect of an uncontrolled illumination environment and the importance of this database for deploying real world applications. The VALID database is available to the academic community through http://ee.ucd.ie/validdb/.

Pages: 777-786

Page count: 10
