Audio-visual biometric based speaker identification

Cited by: 1

Authors:

Kar, Biswajit [1]

Bhatia, Sandeep [1]

Dutta, P. K. [1]

Affiliation:

[1] Indian Inst Technol, Dept Elect Engn, Kharagpur 721302, W Bengal, India

Source:

ICCIMA 2007: INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, VOL IV, PROCEEDINGS | 2007

Keywords:

biometrics; speaker recognition; speaker model; audio visual speech recognition;

DOI:

10.1109/ICCIMA.2007.21

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we present a multimodal audio-visual speaker identification system. The proposed system decomposes the information in a video stream into two components: speech and lip motion. Studies have shown that lip information conveys not only speech content but also characteristic information about a person's identity. Fusing this information with speech information yields robust person identification under adverse conditions. Gaussian mixture models (GMMs) and hidden Markov models (HMMs) are used throughout this work for the tasks of text-dependent speaker recognition and mouth tracking. Performance is evaluated on a dataset of 22 Indian speakers of different ethnicities, each uttering a sentence. The results show that the performance of the biometric system is significantly better when both audio and video features are used.
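
The abstract does not say how the audio and lip-motion scores are combined, so the following is only a minimal sketch of one common option: score-level (late) fusion of per-speaker GMM log-likelihoods with a fixed weight. The function names, the `audio_weight` parameter, the diagonal-covariance assumption, and the toy one-dimensional models are all hypothetical illustrations, not the authors' system, which also uses HMMs.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    `gmm` is a list of (weight, mean, var) components (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def identify(audio_frames, visual_frames, models, audio_weight=0.5):
    """Pick the speaker with the highest weighted sum of modality scores.

    The weighted-sum rule is an assumption; the source does not state it.
    """
    fused = {}
    for name, (audio_gmm, visual_gmm) in models.items():
        a = gmm_log_likelihood(audio_frames, audio_gmm)
        v = gmm_log_likelihood(visual_frames, visual_gmm)
        fused[name] = audio_weight * a + (1.0 - audio_weight) * v
    return max(fused, key=fused.get), fused

# Toy single-component, one-dimensional models for two hypothetical speakers.
models = {
    "spk_a": ([(1.0, [0.0], [1.0])], [(1.0, [0.0], [1.0])]),
    "spk_b": ([(1.0, [3.0], [1.0])], [(1.0, [3.0], [1.0])]),
}

best, scores = identify([[0.2], [-0.1]], [[0.3], [0.1]], models)
print(best)
```

In a setup like this, lowering `audio_weight` when the audio is noisy shifts the decision toward the visual stream, which is one way the robustness claimed for adverse conditions could be obtained.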

Pages: 94-98

Page count: 5

50 records in total

[1] Audio-visual speaker identification based on the use of dynamic audio and visual features
Fox, N
Reilly, RB
AUDIO-BASED AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, 2003, 2688 : 743 - 751
[2] A Bayesian approach to audio-visual speaker identification
Nefian, AV
Liang, LH
Fu, TY
Liu, XX
AUDIO-BASED AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, 2003, 2688 : 761 - 769
[3] ENVIRONMENTALLY ROBUST AUDIO-VISUAL SPEAKER IDENTIFICATION
Schoenherr, Lea
Orth, Dennis
Heckmann, Martin
Kolossa, Dorothea
2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 312 - 318
[4] Audio-Visual Feature Fusion for Speaker Identification
Almaadeed, Noor
Aggoun, Amar
Amira, Abbes
NEURAL INFORMATION PROCESSING, ICONIP 2012, PT I, 2012, 7663 : 56 - 67
[5] A Visual Signal Reliability for Robust Audio-Visual Speaker Identification
Tariquzzaman, Md.
Kim, Jin Young
Na, Seung You
Kim, Hyoung-Gook
Har, Dongsoo
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (10): : 2052 - 2055
[6] Audio-visual speaker identification with asynchronous articulatory feature
Chen, Yanxiang
Liu, M.
ELECTRONICS LETTERS, 2010, 46 (03) : 242 - U77
[7] Fuzzy audio-visual feature maps for speaker identification
Chibelushi, CC
APPLICATIONS AND SCIENCE IN SOFT COMPUTING, 2004, : 317 - 322
[8] A confidence-based late fusion framework for audio-visual biometric identification
Alam, Mohammad Rafiqul
Bennamoun, Mohammed
Togneri, Roberto
Sohel, Ferdous
PATTERN RECOGNITION LETTERS, 2015, 52 : 65 - 71
[9] Audio-Visual Synchronisation for Speaker Diarisation
Garau, Giulia
Dielmann, Alfred
Bourlard, Herve
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2662 - +
[10] Audio-visual speaker identification using coupled hidden markov models
Fu, T
Liu, XX
Liang, LH
Pi, XB
Nefian, AV
2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 3, PROCEEDINGS, 2003, : 29 - 32