Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

Cited by: 8
Authors
You, Lanhua [1]
Guo, Wu [1]
Dai, Li-Rong [1]
Du, Jun [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Speaker verification; High-order statistics; X-vector; Multi-task learning; Unsupervised learning;
DOI
10.21437/Interspeech.2019-2264
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification code
100104 ; 100213 ;
Abstract
X-vector based deep neural network (DNN) embedding systems have proven effective for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN, with the primary task of classifying the target speakers and the auxiliary task of reconstructing the first- and higher-order statistics of the original input utterance. The proposed training strategy combines supervised and unsupervised learning in one framework to make the speaker embeddings more discriminative and robust. Experiments are carried out using the NIST SRE16 evaluation dataset and the VOiCES dataset. The results demonstrate that the proposed method outperforms the original x-vector approach with very little additional complexity.
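The multi-task objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which statistics are reconstructed, which reconstruction loss is used, or how the two tasks are weighted, so everything below is an assumption for illustration: the statistics are taken to be the per-dimension mean, standard deviation, skewness, and kurtosis; the auxiliary loss is a mean squared error; and `aux_weight` is a hypothetical weighting parameter. All function names are hypothetical as well.

```python
import math


def utterance_statistics(frames):
    """Per-dimension mean, std, skewness, and kurtosis over all frames.

    frames: list of equal-length feature vectors, one per frame.
    The choice of these four statistics is an assumption, not taken
    from the paper.
    """
    n = len(frames)
    dim = len(frames[0])
    stats = []
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        if std == 0.0:
            # Standardized moments are undefined for a constant signal.
            skew, kurt = 0.0, 0.0
        else:
            skew = sum(((x - mean) / std) ** 3 for x in column) / n
            kurt = sum(((x - mean) / std) ** 4 for x in column) / n
        stats.extend([mean, std, skew, kurt])
    return stats


def cross_entropy(logits, target):
    """Primary (supervised) loss: softmax cross-entropy for one utterance."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def mse(pred, target):
    """Auxiliary (unsupervised) loss: mean squared reconstruction error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multitask_loss(logits, speaker_id, predicted_stats, frames, aux_weight=0.1):
    """Weighted sum of the speaker loss and the statistics reconstruction loss.

    aux_weight is a hypothetical hyperparameter.
    """
    primary = cross_entropy(logits, speaker_id)
    auxiliary = mse(predicted_stats, utterance_statistics(frames))
    return primary + aux_weight * auxiliary
```

In a real system, `logits` and `predicted_stats` would come from two output heads of the same embedding network, so that gradients from both tasks shape the shared speaker embedding. The reconstruction targets are computed from the input itself and need no labels, which is what makes the auxiliary task unsupervised.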
Pages: 1158-1162
Page count: 5
Related papers
50 in total
  • [31] Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification
    Shum, Stephen
    Dehak, Najim
    Dehak, Reda
    Glass, James R.
    ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP, 2010, : 76 - 82
  • [32] A novel text-independent speaker verification method based on the global speaker model
    Zhang, YY
    Zhang, D
    Zhu, XY
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2000, 30 (05): : 598 - 602
  • [33] CONTRASTIVE SELF-SUPERVISED LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Zhang, Haoran
    Zou, Yuexian
    Wang, Helin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6713 - 6717
  • [34] ADAPTATION OF PLDA FOR MULTI-SOURCE TEXT-INDEPENDENT SPEAKER VERIFICATION
    Chen, Liping
    Lee, Kong Aik
    Ma, Bin
    Ma, Long
    Li, Haizhou
    Dai, Li-Rong
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5380 - 5384
  • [35] End-to-End Feature Learning for Text-Independent Speaker Verification
    Chen, Fangzhou
    Bian, Tengyue
    Xu, Li
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 3949 - 3954
  • [36] Text-independent speaker identification based on support vector machines
    He, Xin
    Liu, Chongqing
    Li, Jiegu
    Jisuanji Gongcheng/Computer Engineering, 2000, 26 (06): : 61 - 63
  • [37] MULTI-FEATURE LEARNING WITH CANONICAL CORRELATION ANALYSIS CONSTRAINT FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Li, Zheng
    Zhao, Miao
    Li, Lin
    Hong, Qingyang
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 330 - 337
  • [38] I-vector Based Text-Independent Speaker Identification
    Liu, Tingting
    Kang, Kai
    Guan, Shengxiao
    2014 11TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2014, : 5420 - 5425
  • [39] A Phoneme Localization Based Liveness Detection for Text-Independent Speaker Verification
    Zhang, Linghan
    Tan, Sheng
    Chen, Yingying
    Yang, Jie
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (10) : 5611 - 5624
  • [40] Quasi Text-Independent Speaker-Verification based on Pattern Matching
    Gerber, Michael
    Beutler, Rene
    Pfister, Beat
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 93 - 96