EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION

Cited by: 0
Authors
Motlicek, Petr [1 ]
Dey, Subhadeep [1 ,2 ]
Madikeri, Srikanth [1 ]
Burget, Lukas [3 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Brno Univ Technol, Brno, Czech Republic
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
speaker recognition; i-vectors; subspace Gaussian mixture models; automatic speech recognition;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This paper presents a Subspace Gaussian Mixture Model (SGMM) approach employed as a probabilistic generative model to estimate speaker vector representations for subsequent use in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be estimated robustly and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against the state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on truncated 3 sec-10 sec and 10 sec-30 sec utterances, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.
Pages: 4445 - 4449
Number of pages: 5
Related papers
50 in total
  • [21] Noise Compensation for Subspace Gaussian Mixture Models
    Lu, Liang
    Chin, K. K.
    Ghoshal, Arnab
    Renals, Steve
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 306 - 309
  • [22] Initializing Subspace Constrained Gaussian Mixture Models
    Olsen, PA
    Visweswariah, K
    Gopinath, R
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 661 - 664
  • [23] Score calibrating for speaker recognition based on support vector machines and Gaussian Mixture Models
    Katz, Marcel
    Schaffoener, Martin
    Krueger, Sven E.
    Wendemuth, Andreas
    PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING, 2007, : 146 - 151
  • [24] DEALING WITH ACOUSTIC MISMATCH FOR TRAINING MULTILINGUAL SUBSPACE GAUSSIAN MIXTURE MODELS FOR SPEECH RECOGNITION
    Mohan, Aanchan
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4893 - 4896
  • [25] Automatic speaker recognition using a unique personal feature vector and Gaussian Mixture Models
    Kaminski, Kamil
    Majda, Ewelina
    Dobrowolski, Andrzej P.
    2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013, : 220 - 225
  • [26] Telephone based speaker recognition using multiple binary classifier and Gaussian Mixture Models
    Castellano, PJ
    Slomka, S
    Sridharan, S
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1075 - 1078
  • [27] Improved Approach for Calculating Model Parameters in Speaker Recognition using Gaussian Mixture Models
    Metkar, Prashant
    Cohen, Aaron
    Parhi, Keshab
    2010 CONFERENCE RECORD OF THE FORTY FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2010, : 567 - 570
  • [28] Improved Gaussian Mixture Model and Application in Speaker Recognition
    Bao Lingling
    Shen Xizhong
    PROCEEDINGS OF 2016 THE 2ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS, 2016, : 387 - 390
  • [29] A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition
    Li, Bo
    Sim, Khe Chai
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1770 - 1773
  • [30] ACCENT ADAPTATION USING SUBSPACE GAUSSIAN MIXTURE MODELS
    Motlicek, Petr
    Garner, Philip N.
    Kim, Namhoon
    Cho, Jeongmi
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7170 - 7174