EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION

Cited by: 0
Authors
Motlicek, Petr [1 ]
Dey, Subhadeep [1 ,2 ]
Madikeri, Srikanth [1 ]
Burget, Lukas [3 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Brno Univ Technol, Brno, Czech Republic
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
speaker recognition; i-vectors; subspace Gaussian mixture models; automatic speech recognition;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows robust estimation of low-dimensional speaker vectors, which can then be exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against a state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec, and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on utterances truncated to 3 sec-10 sec and 10 sec-30 sec, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.
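The fusion idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring backend or the fusion rule, so everything here is an assumption made for illustration: cosine similarity stands in for the verification score, a weighted sum stands in for score-level fusion, and the names `cosine_score`, `fuse_scores`, `weight`, and `threshold` are hypothetical.

```python
import math


def cosine_score(enroll, test):
    """Cosine similarity between two speaker vectors (sequences of floats).

    Assumption: cosine scoring is used here only as a simple stand-in;
    the record does not state which scoring method the paper uses.
    """
    if len(enroll) != len(test):
        raise ValueError("speaker vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(enroll, test))
    norm = math.sqrt(sum(a * a for a in enroll)) * math.sqrt(sum(b * b for b in test))
    if norm == 0.0:
        raise ValueError("speaker vectors must be non-zero")
    return dot / norm


def fuse_scores(score_sgmm, score_ivector, weight=0.5):
    """Weighted-sum score-level fusion of two systems.

    `weight` is a hypothetical tuning parameter in [0, 1]; in practice it
    would be chosen on held-out development data.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * score_sgmm + (1.0 - weight) * score_ivector


def accept(score, threshold=0.5):
    """Accept the claimed identity when the score reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Toy vectors, not real SGMM speaker vectors or i-vectors.
    sgmm_score = cosine_score([0.2, 0.9, -0.1], [0.25, 0.8, 0.0])
    ivec_score = cosine_score([1.0, 0.3, 0.4, -0.2], [0.9, 0.1, 0.5, 0.0])
    fused = fuse_scores(sgmm_score, ivec_score, weight=0.5)
    print("fused score:", fused, "accepted:", accept(fused))
```

A weighted sum is only one simple way to combine two complementary systems; the scores would normally be calibrated to a common range before fusion.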
Pages: 4445 - 4449
Number of pages: 5