Improved i-vector extraction technique for speaker verification with short utterances

被引:2
|
作者
Poddar A. [1 ]
Sahidullah M. [2 ]
Saha G. [1 ]
机构
[1] Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur
[2] Speech and Image Processing Unit, School of Computing, University of Eastern Finland, Joensuu
关键词
Baum–Welch statistics; Duration variability; GMM-UBM; i-Vector; Short utterance; Speaker recognition;
D O I
10.1007/s10772-017-9477-2
中图分类号
学科分类号
摘要
A major challenge in ASV is to improve performance with short speech segments for end-user convenience in real-world applications. In this paper, we present a detailed analysis of ASV systems to observe the duration variability effects on state-of-the-art i-vector and classical Gaussian mixture model-universal background model (GMM-UBM) based ASV systems. We observe an increase in uncertainty of model parameter estimation for i-vector based ASV with speech of shorter duration. In order to compensate the effect of duration variability in short utterances, we have proposed adaptation technique for Baum-Welch statistics estimation used to i-vector extraction. Information from pre-estimated background model parameters are used for adaptation method. The ASV performance with the proposed approach is considerably superior to the conventional i-vector based system. Furthermore, the fusion of proposed i-vector based system and GMM-UBM further improves the ASV performance, especially for short speech segments. Experiments conducted on two speech corpora, NIST SRE 2008 and 2010, have shown relative improvement in equal error rate (EER) in the range of 12–20%. © 2017, Springer Science+Business Media, LLC, part of Springer Nature.
引用
收藏
页码:473 / 488
页数:15
相关论文
共 50 条
  • [31] WEIGHTED LDA TECHNIQUES FOR I-VECTOR BASED SPEAKER VERIFICATION
    Kanagasundaram, A.
    Dean, D.
    Vogt, R.
    McLaren, M.
    Sridharan, S.
    Mason, M.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4781 - 4784
  • [32] PLDA Modeling in I-Vector and Supervector Space for Speaker Verification
    Jiang, Ye
    Lee, Kong Aik
    Tang, Zhenmin
    Ma, Bin
    Larcher, Anthony
    Li, Haizhou
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1678 - 1681
  • [33] PERFORMANCE OF I-VECTOR SPEAKER VERIFICATION AND THE DETECTION OF SYNTHETIC SPEECH
    McClanahan, Richard D.
    Stewart, Bryan
    De Leon, Phillip L.
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [34] SPEAKER VERIFICATION USING SIMPLIFIED AND SUPERVISED I-VECTOR MODELING
    Li, Ming
    Tsiartas, Andreas
    Van Segbroeck, Maarten
    Narayanan, Shrikanth S.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7199 - 7203
  • [35] Bayesian Distance Metric Learning on i-vector for Speaker Verification
    Fang, Xiao
    Dehak, Najim
    Glass, James
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2513 - 2517
  • [36] Non-linear PLDA for i-Vector Speaker Verification
    Novoselov, Sergey
    Pekhovsky, Timur
    Kudashev, Oleg
    Mendelev, Valentin
    Prudnikov, Alexey
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 214 - 218
  • [37] I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification
    Zhang, Jiacen
    Inoue, Nakamasa
    Shinoda, Koichi
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3613 - 3617
  • [38] I-vector Extraction for Speaker Recognition Based on Dimensionality Reduction
    Ibrahim, Noor Salwani
    Ramli, Dzati Athiar
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018), 2018, 126 : 1534 - 1540
  • [39] Effect of long-term ageing on i-vector speaker verification
    Kelly, Finnian
    Saeidi, Rahim
    Harte, Naomi
    van Leeuwen, David
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 86 - 90
  • [40] Discriminant Analysis Methods Comparison in I-Vector Space for Speaker Verification
    Mohammadi, Mohsen
    Mohammadi, Hamid Reza Sadegh
    2018 9TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2018, : 166 - 172