EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION

Cited by: 0

Authors:

Motlicek, Petr [1]

Dey, Subhadeep [1, 2]

Madikeri, Srikanth [1]

Burget, Lukas [3]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland

[3] Brno Univ Technol, Brno, Czech Republic

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

speaker recognition; i-vectors; subspace Gaussian mixture models; automatic speech recognition;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations for subsequent use in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be estimated robustly and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against the state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on truncated 3 sec-10 sec and 10 sec-30 sec utterances, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector-based speaker verification system.

Pages: 4445-4449

Number of pages: 5
