PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Bai, Zhongxin [1,2]

Zhang, Xiao-Lei [1,2]

Chen, Jingdong [1,2]

Affiliations:

[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Peoples R China

[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

Israel Science Foundation; National Science Foundation (USA);

Keywords:

speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;

DOI:

10.1109/icassp40776.2020.9053674

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Deep embedding based text-independent speaker verification has outperformed traditional methods in many challenging scenarios. Its loss functions fall broadly into two classes: verification and identification. Verification loss functions match the speaker verification pipeline, but they are difficult to implement. As a result, most state-of-the-art deep embedding methods use identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function for deep embedding based text-independent speaker verification: maximization of the partial area under the receiver operating characteristic (ROC) curve (pAUC). We also propose a class-center based method for constructing training trials, which improves training efficiency and is critical for the proposed loss function to match the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with state-of-the-art identification loss functions.
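
The two ideas named in the abstract, pAUC and class-center based trials, can be illustrated with a small NumPy sketch. This is not the authors' implementation, which this record does not give. It assumes that the pAUC is estimated over a false-positive-rate range [alpha, beta] by keeping only the negative trials whose rank falls in that range, that class centers are plain per-speaker means of the embeddings, and that trials are scored by cosine similarity. The function names `partial_auc` and `class_center_trials` and the default values of `alpha` and `beta` are hypothetical.

```python
import numpy as np


def partial_auc(pos_scores, neg_scores, alpha=0.0, beta=0.1):
    """Empirical normalized pAUC over the FPR range [alpha, beta].

    Assumption: the FPR range is mapped to negative trials by rank, with
    the highest-scoring (hardest) negatives corresponding to the lowest FPR.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # hardest first
    lo = int(np.ceil(alpha * neg.size))
    hi = int(np.floor(beta * neg.size))
    if hi <= lo:
        raise ValueError("FPR range selects no negative trials")
    selected = neg[lo:hi]
    # Fraction of (positive, selected negative) pairs ranked correctly;
    # ties count as half a correct pair.
    diff = pos[:, None] - selected[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def class_center_trials(embeddings, labels):
    """Score every utterance against every speaker center (cosine similarity).

    Assumption: a center is the mean embedding of a speaker; a trial is
    positive when the utterance belongs to the center's speaker.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    speakers = np.unique(labels)
    centers = np.stack([emb[labels == s].mean(axis=0) for s in speakers])
    emb_n = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    cen_n = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    scores = emb_n @ cen_n.T  # shape: (n_utterances, n_speakers)
    target = labels[:, None] == speakers[None, :]
    return scores[target], scores[~target]


if __name__ == "__main__":
    # Toy data: two speakers with two utterances each.
    emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    lab = [0, 0, 1, 1]
    pos, neg = class_center_trials(emb, lab)
    print(partial_auc(pos, neg, alpha=0.0, beta=0.5))
```

With centers, the number of trials grows as utterances times speakers rather than as the square of the number of utterances, which is one plausible reading of the training-efficiency claim. A trainable loss would also need a differentiable surrogate for the pairwise comparison, which this sketch leaves out.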


Pages: 6819-6823

Number of pages: 5

