Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

Cited by: 8

Authors:

You, Lanhua [1]

Guo, Wu [1]

Dai, Li-Rong [1]

Du, Jun [1]

Affiliations:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China

Source:

INTERSPEECH 2019 | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Speaker verification; High-order statistics; X-vector; Multi-task learning; Unsupervised learning;

DOI:

10.21437/Interspeech.2019-2264

Chinese Library Classification number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

X-vector based deep neural network (DNN) embedding systems have proven effective for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN, with the primary task of classifying the target speakers and the auxiliary task of reconstructing the first- and higher-order statistics of the original input utterance. The proposed training strategy combines supervised and unsupervised learning in one framework to make the speaker embeddings more discriminative and robust. Experiments are carried out on the NIST SRE16 evaluation dataset and the VOiCES dataset. The results demonstrate that the proposed method outperforms the original x-vector approach with very little additional complexity.
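
The training objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that the auxiliary task reconstructs "first- and higher-order statistics", so the choice of mean, standard deviation, skewness, and kurtosis, the mean-squared-error reconstruction loss, the weight `aux_weight`, and all function names below are assumptions made for illustration.

```python
import math


def utterance_statistics(frames):
    """Per-dimension mean, std, skewness and kurtosis over all frames.

    `frames` is a list of equal-length feature vectors (one per frame).
    Which statistics the paper reconstructs is an assumption here.
    """
    n = len(frames)
    dim = len(frames[0])
    stats = []
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        std = math.sqrt(var) + 1e-8  # small constant avoids division by zero
        skew = sum(((x - mean) / std) ** 3 for x in column) / n
        kurt = sum(((x - mean) / std) ** 4 for x in column) / n
        stats.extend([mean, std, skew, kurt])
    return stats


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one utterance (primary, supervised task)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target_index]


def mean_squared_error(predicted, target):
    """Reconstruction error (auxiliary, unsupervised task); MSE is an assumption."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def multi_task_loss(speaker_logits, speaker_index, predicted_stats, frames,
                    aux_weight=0.1):
    """Speaker classification loss plus weighted statistics reconstruction loss."""
    primary = cross_entropy(speaker_logits, speaker_index)
    auxiliary = mean_squared_error(predicted_stats, utterance_statistics(frames))
    return primary + aux_weight * auxiliary


# Toy utterance: three frames of two-dimensional features.
frames = [[0.0, 1.0], [2.0, 1.5], [4.0, 2.0]]
targets = utterance_statistics(frames)
perfect = multi_task_loss([2.0, 0.5, -1.0], 0, targets, frames)
imperfect = multi_task_loss([2.0, 0.5, -1.0], 0, [0.0] * len(targets), frames)
```

In this sketch the reconstruction targets are computed from the input features themselves, so the auxiliary term needs no speaker labels; when the predicted statistics match the targets exactly, the total loss reduces to the classification loss alone.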


Pages: 1158-1162

Number of pages: 5
