Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification

被引:8
|
作者
You, Lanhua [1 ]
Guo, Wu [1 ]
Dai, Li-Rong [1 ]
Du, Jun [1 ]
机构
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
来源
基金
中国国家自然科学基金;
关键词
Speaker verification; High-order statistics; X-vector; Multi-task learning; Unsupervised learning;
D O I
10.21437/Interspeech.2019-2264
中图分类号
R36 [病理学]; R76 [耳鼻咽喉科学];
学科分类号
100104 ; 100213 ;
摘要
The x-vector based deep neural network (DNN) embedding systems have demonstrated effectiveness for text-independent speaker verification. This paper presents a multi-task learning architecture for training the speaker embedding DNN with the primary task of classifying the target speakers, and the auxiliary task of reconstructing the first- and higher-order statistics of the original input utterance. The proposed training strategy aggregates both the supervised and unsupervised learning into one framework to make the speaker embeddings more discriminative and robust. Experiments are carried out using the NIST SRE16 evaluation dataset and the VOiCES dataset. The results demonstrate that our proposed method outperforms the original x-vector approach with very low additional complexity added.
引用
收藏
页码:1158 / 1162
页数:5
相关论文
共 50 条
  • [21] Improving X-vector and PLDA for Text-dependent Speaker Verification
    Chen, Zhuxin
    Lin, Yue
    INTERSPEECH 2020, 2020, : 726 - 730
  • [22] GENERATIVE X-VECTORS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Xu, Longting
    Das, Rohan Kumar
    Yilmaz, Emre
    Yang, Jichen
    Li, Haizhou
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 1014 - 1020
  • [23] TEXT-INDEPENDENT SPEAKER VERIFICATION WITH ADVERSARIAL LEARNING ON SHORT UTTERANCES
    Liu, Kai
    Zhou, Huan
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6569 - 6573
  • [24] Group-based speaker embeddings for text-independent speaker verification
    Jung, Youngmoon
    Eom, Youngsik
    Lee, Yeonghyeon
    Kim, Hoirin
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2021, 40 (05): : 496 - 502
  • [25] On Metric-based Deep Embedding Learning for Text-Independent Speaker Verification
    Kashani, Hamidreza Baradaran
    Reza, Shaghayegh
    Rezaei, Iman Sarraf
    2020 6TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2020,
  • [26] Triplet Based Embedding Distance and Similarity Learning for Text-independent Speaker Verification
    Ren, Zongze
    Chen, Zhiyong
    Xu, Shugong
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 558 - 562
  • [27] DEEP BOTTLENECK FEATURES FOR I-VECTOR BASED TEXT-INDEPENDENT SPEAKER VERIFICATION
    Ghalehjegh, Sina Hamidi
    Rose, Richard C.
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 555 - 560
  • [28] Learning Vector Quantization in text-independent Automatic Speaker Recognition
    Filgueiras, TE
    Messina, RO
    Cabral, EF
    VTH BRAZILIAN SYMPOSIUM ON NEURAL NETWORKS, PROCEEDINGS, 1998, : 135 - 139
  • [29] A Text-Independent Speaker Verification System Based on Cross Entropy
    Lu, Xiaochun
    Yin, Junxun
    COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS, 2009, 51 : 419 - 426
  • [30] Text-independent speaker verification based on relation of MFCC components
    Ou, GW
    Ke, DF
    2004 International Symposium on Chinese Spoken Language Processing, Proceedings, 2004, : 57 - 60