Improved i-vector extraction technique for speaker verification with short utterances

Authors
Poddar A. [1]
Sahidullah M. [2]
Saha G. [1]
Affiliations
[1] Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur
[2] Speech and Image Processing Unit, School of Computing, University of Eastern Finland, Joensuu
Keywords
Baum–Welch statistics; Duration variability; GMM-UBM; i-Vector; Short utterance; Speaker recognition
DOI
10.1007/s10772-017-9477-2
Abstract
A major challenge in automatic speaker verification (ASV) is to improve performance with short speech segments for end-user convenience in real-world applications. In this paper, we present a detailed analysis of how duration variability affects state-of-the-art i-vector and classical Gaussian mixture model-universal background model (GMM-UBM) based ASV systems. We observe that the uncertainty of model parameter estimation in i-vector based ASV increases as the speech duration becomes shorter. To compensate for the effect of duration variability in short utterances, we propose an adaptation technique for estimating the Baum–Welch statistics used in i-vector extraction. The adaptation uses information from pre-estimated background model parameters. The ASV performance of the proposed approach is considerably superior to that of the conventional i-vector based system. Furthermore, fusing the proposed i-vector based system with GMM-UBM further improves ASV performance, especially for short speech segments. Experiments conducted on two speech corpora, NIST SRE 2008 and 2010, show relative improvements in equal error rate (EER) in the range of 12–20%. © 2017, Springer Science+Business Media, LLC, part of Springer Nature.
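The abstract's observation that estimation uncertainty grows for shorter speech can be illustrated with the standard i-vector formulation, in which the posterior precision of the i-vector is I + Σ_c N_c T_cᵀ Σ_c⁻¹ T_c and therefore shrinks as the zeroth-order Baum–Welch counts N_c shrink. The sketch below is only a toy illustration under stated assumptions: the UBM, the total variability matrix `T_mat`, the synthetic frames, and the function names are all hypothetical, and it does not reproduce the adaptation technique proposed in the paper, which the abstract does not specify.

```python
import numpy as np

def baum_welch_stats(X, weights, means, variances):
    """Zeroth-order and centered first-order Baum-Welch statistics of
    frames X (T x D) against a diagonal-covariance GMM-UBM (C components)."""
    diff = X[:, None, :] - means[None, :, :]                  # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )                                                         # (T, C)
    log_post = np.log(weights)[None, :] + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)           # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                   # frame posteriors
    N = post.sum(axis=0)                                      # (C,)
    F = post.T @ X - N[:, None] * means                       # (C, D), centered
    return N, F

def ivector_posterior(N, F, T_mat, variances):
    """Posterior mean and covariance of the i-vector for given statistics.
    T_mat has shape (C, D, R): one D x R block per UBM component."""
    C, D, R = T_mat.shape
    precision = np.eye(R)
    linear = np.zeros(R)
    for c in range(C):
        Tc_scaled = T_mat[c] / variances[c][:, None]          # Sigma_c^{-1} T_c
        precision += N[c] * (T_mat[c].T @ Tc_scaled)
        linear += Tc_scaled.T @ F[c]
    cov = np.linalg.inv(precision)
    return cov @ linear, cov

# Hypothetical toy setup (all values are assumptions, not from the paper).
rng = np.random.default_rng(0)
C, D, R = 4, 3, 2
weights = np.full(C, 1.0 / C)
means = 3.0 * rng.normal(size=(C, D))
variances = np.ones((C, D))
T_mat = 0.5 * rng.normal(size=(C, D, R))

# Synthetic "utterance": the short one is a prefix of the long one.
components = rng.integers(C, size=1000)
X_long = means[components] + rng.normal(size=(1000, D))
X_short = X_long[:50]

N_long, F_long = baum_welch_stats(X_long, weights, means, variances)
N_short, F_short = baum_welch_stats(X_short, weights, means, variances)
w_long, cov_long = ivector_posterior(N_long, F_long, T_mat, variances)
w_short, cov_short = ivector_posterior(N_short, F_short, T_mat, variances)

print("posterior covariance trace, long utterance: ", np.trace(cov_long))
print("posterior covariance trace, short utterance:", np.trace(cov_short))
```

In this toy setup the trace of the posterior covariance is larger for the 50-frame prefix than for the full 1000-frame sequence, because every component count N_c can only grow as frames are added.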
Pages: 473–488
Page count: 15