EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION

Cited by: 0
Authors
Motlicek, Petr [1 ]
Dey, Subhadeep [1 ,2 ]
Madikeri, Srikanth [1 ]
Burget, Lukas [3 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Brno Univ Technol, Brno, Czech Republic
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
speaker recognition; i-vectors; subspace Gaussian mixture models; automatic speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a Subspace Gaussian Mixture Model (SGMM) approach, employed as a probabilistic generative model to estimate speaker vector representations that are subsequently used in the speaker verification task. SGMMs have already been shown to significantly outperform traditional HMM/GMMs in Automatic Speech Recognition (ASR) applications. An extension to the basic SGMM framework allows low-dimensional speaker vectors to be robustly estimated and exploited for speaker adaptation. We propose a speaker verification framework based on low-dimensional speaker vectors estimated using SGMMs, trained in an ASR manner using manual transcriptions. To test the robustness of the system, we evaluate the proposed approach against the state-of-the-art i-vector extractor on the NIST SRE 2010 evaluation set under four utterance-length conditions: 3 sec-10 sec, 10 sec-30 sec, 30 sec-60 sec and full (untruncated) utterances. Experimental results reveal that while the i-vector system performs better on truncated 3 sec-10 sec and 10 sec-30 sec utterances, noticeable improvements are observed with SGMMs, especially on full-length utterances. Finally, the proposed SGMM approach exhibits complementary properties and can thus be efficiently fused with an i-vector based speaker verification system.
Pages: 4445-4449
Page count: 5
Related papers
50 in total
  • [41] Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (01) : 17 - 27
  • [42] ACCURATE SPEAKER RECOGNITION BASED ON ADAPTIVE GAUSSIAN MIXTURE MODEL
    Wang Yunqi
    Yu Yibiao
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 527 - 531
  • [43] Speaker recognition and speaker normalization by projection to speaker subspace
    Ariki, Y
    Tagashira, S
    Nishijima, M
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 319 - 322
  • [44] SPEAKER PHONE MODE CLASSIFICATION USING GAUSSIAN MIXTURE MODELS
    Eghbal-zadeh, H.
    Sobhan-manesh, F.
    Sameti, H.
    BabaAli, B.
    SPA 2011: SIGNAL PROCESSING ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS CONFERENCE PROCEEDINGS, 2011, : 112 - +
  • [45] Use of Gaussian Mixture Models in Macedonian Forensic Speaker Identification
    Gerazov, Branislav
    Pop-Dimitrijoska, Vesna
    Ivanovski, Zoran
    Apostolovska, Gordana
    2012 20TH TELECOMMUNICATIONS FORUM (TELFOR), 2012, : 724 - 727
  • [46] Analysis of Different Subspace Mixture Models in Handwriting Recognition
    Aradhya, V. N. Manjunath
    Niranjan, S. K.
    13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012), 2012, : 670 - 674
  • [47] ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING GAUSSIAN MIXTURE SPEAKER MODELS
    REYNOLDS, DA
    ROSE, RC
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, 3 (01) : 72 - 83
  • [48] Speaker recognition based on dynamic time warping and Gaussian mixture model
    Zhang, Nannan
    Yao, Yanru
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 1174 - 1177
  • [49] Bayesian Speaker Recognition Using Gaussian Mixture Model and Laplace Approximation
    Cheng, Shih-Sian
    Chen, I-Fan
    Wang, Hsin-Min
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2738 - +
  • [50] Gaussian mixture language models for speech recognition
    Afify, Mohamed
    Siohan, Olivier
    Sarikaya, Ruhi
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 29 - +