EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION

Cited by: 0
Authors
Motlicek, Petr [1]
Dey, Subhadeep [1,2]
Madikeri, Srikanth [1]
Burget, Lukas [3]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Brno Univ Technol, Brno, Czech Republic
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
speaker recognition; i-vectors; subspace Gaussian mixture models; automatic speech recognition
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be robustly estimated and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we compare the proposed approach with the state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on utterances truncated to 3 sec-10 sec and 10 sec-30 sec, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector-based speaker verification system.
Pages: 4445-4449
Number of pages: 5
Related papers
50 in total
  • [1] Speaker recognition using Gaussian mixture models
    Kamarauskas, J.
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2008, (05) : 29 - 32
  • [2] SUBSPACE GAUSSIAN MIXTURE MODELS FOR SPEECH RECOGNITION
    Povey, Daniel
    Burget, Lukas
    Agarwal, Mohit
    Akyazi, Pinar
    Feng, Kai
    Ghoshal, Arnab
    Glembek, Ondrej
    Goel, Nagendra Kumar
    Karafiat, Martin
    Rastrow, Ariya
    Rose, Richard C.
    Schwarz, Petr
    Thomas, Samuel
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4330 - 4333
  • [3] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 12 - 15
  • [4] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    IET SIGNAL PROCESSING, 2014, 8 (08) : 860 - 867
  • [5] Automatic Speaker Recognition Using Gaussian Mixture Speaker Models
    Reynolds, D. A.
    Lincoln Laboratory Journal, 8 (02):
  • [6] Regularized Subspace Gaussian Mixture Models for Speech Recognition
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    IEEE SIGNAL PROCESSING LETTERS, 2011, 18 (07) : 419 - 422
  • [7] Subspace constrained Gaussian mixture models for speech recognition
    Axelrod, S
    Goel, V
    Gopinath, RA
    Olsen, PA
    Visweswariah, K
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (06) : 1144 - 1160
  • [8] TWO-STAGE SPEAKER ADAPTATION IN SUBSPACE GAUSSIAN MIXTURE MODELS
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [9] Combining Gaussian mixture models and segmental feature models for speaker recognition
    Milosevic, Milana
    Glavitsch, Ulrike
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2042 - 2043
  • [10] Speaker recognition for VoIP transmission using Gaussian mixture models
    Staroniewicz, P
    COMPUTER RECOGNITION SYSTEMS, PROCEEDINGS, 2005, : 739 - 745