Improved i-vector extraction technique for speaker verification with short utterances

Cited by: 2

Authors:

Poddar A. [1]

Sahidullah M. [2]

Saha G. [1]

Affiliations:

[1] Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur

[2] Speech and Image Processing Unit, School of Computing, University of Eastern Finland, Joensuu

Source:

International Journal of Speech Technology | 2018 / Volume 21 / Issue 3

Keywords:

Baum–Welch statistics; Duration variability; GMM-UBM; i-Vector; Short utterance; Speaker recognition

DOI:

10.1007/s10772-017-9477-2

Abstract:

A major challenge in automatic speaker verification (ASV) is to improve performance with short speech segments for end-user convenience in real-world applications. In this paper, we present a detailed analysis of how duration variability affects state-of-the-art i-vector and classical Gaussian mixture model-universal background model (GMM-UBM) based ASV systems. We observe that the uncertainty of model parameter estimation for i-vector based ASV increases as speech duration decreases. To compensate for the effect of duration variability in short utterances, we propose an adaptation technique for the estimation of the Baum-Welch statistics used in i-vector extraction. The adaptation uses information from pre-estimated background model parameters. The ASV performance with the proposed approach is considerably superior to that of the conventional i-vector based system. Furthermore, fusing the proposed i-vector based system with GMM-UBM further improves ASV performance, especially for short speech segments. Experiments conducted on two speech corpora, NIST SRE 2008 and 2010, show a relative improvement in equal error rate (EER) in the range of 12–20%. © 2017, Springer Science+Business Media, LLC, part of Springer Nature.
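For readers unfamiliar with the term, the Baum-Welch statistics mentioned in the abstract are usually the zeroth-order (soft frame counts) and first-order (posterior-weighted feature sums) statistics of an utterance with respect to a GMM-UBM. The sketch below is a generic, minimal illustration of that standard computation for a diagonal-covariance UBM; it is not the adaptation technique proposed in the paper. The function names and the toy one-dimensional UBM are assumptions made for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def baum_welch_stats(frames, weights, means, variances):
    """Return zeroth-order (N) and first-order (F) statistics per UBM component.

    N[c]    = sum over frames of the posterior of component c
    F[c][d] = sum over frames of posterior of c times feature dimension d
    """
    n_comp = len(weights)
    dim = len(means[0])
    N = [0.0] * n_comp
    F = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        # Weighted log-likelihood of the frame under each component.
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp normalisation for numerical stability.
        mx = max(log_p)
        log_norm = mx + math.log(sum(math.exp(lp - mx) for lp in log_p))
        for c in range(n_comp):
            gamma = math.exp(log_p[c] - log_norm)  # component posterior
            N[c] += gamma
            for d in range(dim):
                F[c][d] += gamma * x[d]
    return N, F


# Toy example (hypothetical values): a two-component, one-dimensional UBM
# and a three-frame "utterance".
weights = [0.5, 0.5]
means = [[0.0], [10.0]]
variances = [[1.0], [1.0]]
frames = [[0.1], [-0.2], [9.9]]

N, F = baum_welch_stats(frames, weights, means, variances)
print(N, F)
```

With only a few frames, each soft count in `N` is small, which is one intuitive way to see why estimates built on these statistics become less reliable for short utterances.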

Pages: 473-488

Page count: 15

50 items in total

[31] WEIGHTED LDA TECHNIQUES FOR I-VECTOR BASED SPEAKER VERIFICATION
Kanagasundaram, A.
Dean, D.
Vogt, R.
McLaren, M.
Sridharan, S.
Mason, M.
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4781 - 4784
[32] PLDA Modeling in I-Vector and Supervector Space for Speaker Verification
Jiang, Ye
Lee, Kong Aik
Tang, Zhenmin
Ma, Bin
Larcher, Anthony
Li, Haizhou
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1678 - 1681
[33] PERFORMANCE OF I-VECTOR SPEAKER VERIFICATION AND THE DETECTION OF SYNTHETIC SPEECH
McClanahan, Richard D.
Stewart, Bryan
De Leon, Phillip L.
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[34] SPEAKER VERIFICATION USING SIMPLIFIED AND SUPERVISED I-VECTOR MODELING
Li, Ming
Tsiartas, Andreas
Van Segbroeck, Maarten
Narayanan, Shrikanth S.
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7199 - 7203
[35] Bayesian Distance Metric Learning on i-vector for Speaker Verification
Fang, Xiao
Dehak, Najim
Glass, James
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2513 - 2517
[36] Non-linear PLDA for i-Vector Speaker Verification
Novoselov, Sergey
Pekhovsky, Timur
Kudashev, Oleg
Mendelev, Valentin
Prudnikov, Alexey
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 214 - 218
[37] I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification
Zhang, Jiacen
Inoue, Nakamasa
Shinoda, Koichi
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3613 - 3617
[38] I-vector Extraction for Speaker Recognition Based on Dimensionality Reduction
Ibrahim, Noor Salwani
Ramli, Dzati Athiar
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018), 2018, 126 : 1534 - 1540
[39] Effect of long-term ageing on i-vector speaker verification
Kelly, Finnian
Saeidi, Rahim
Harte, Naomi
van Leeuwen, David
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 86 - 90
[40] Discriminant Analysis Methods Comparison in I-Vector Space for Speaker Verification
Mohammadi, Mohsen
Mohammadi, Hamid Reza Sadegh
2018 9TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2018, : 166 - 172
