PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Citations: 0

Authors:

Bai, Zhongxin [1,2]

Zhang, Xiao-Lei [1,2]

Chen, Jingdong [1,2]

Affiliations:

[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Peoples R China

[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

Israel Science Foundation; National Science Foundation (USA);

Keywords:

speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;

DOI:

10.1109/icassp40776.2020.9053674

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Deep embedding based text-independent speaker verification has demonstrated superior performance to traditional methods in many challenging scenarios. Its loss functions can generally be categorized into two classes: verification and identification. Verification loss functions match the pipeline of speaker verification, but they are difficult to implement. Thus, most state-of-the-art deep embedding methods use identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function, named the maximization of the partial area under the receiver operating characteristic (ROC) curve (pAUC), for deep embedding based text-independent speaker verification. We also propose a class-center based method for constructing training trials that improves training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with state-of-the-art identification loss functions.
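
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the FPR range parameters `alpha` and `beta`, the use of per-speaker mean embeddings as class centers, and cosine scoring are all assumptions made for illustration. It computes an empirical pAUC (a non-differentiable evaluation quantity, not a training loss) over trials built by scoring every embedding against every class center.

```python
import numpy as np

def class_center_trials(embeddings, labels):
    """Build trials by scoring each embedding against each class center.

    Assumption: a class center is the mean of a speaker's length-normalized
    embeddings, and the score is cosine similarity.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    speakers = np.unique(labels)
    centers = np.stack([emb[labels == s].mean(axis=0) for s in speakers])
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)
    scores = emb @ centers.T                        # (utterances, speakers)
    is_target = labels[:, None] == speakers[None, :]
    return scores[is_target], scores[~is_target]    # target, non-target scores

def partial_auc(pos_scores, neg_scores, alpha=0.0, beta=0.1):
    """Empirical normalized pAUC over the false-positive-rate range [alpha, beta]."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # hardest negatives first
    lo = int(np.ceil(alpha * len(neg)))
    hi = int(np.floor(beta * len(neg)))
    if hi <= lo:
        raise ValueError("FPR range selects no non-target trials")
    selected = neg[lo:hi]
    # Fraction of (target, selected non-target) pairs ranked correctly; ties count half.
    diff = pos[:, None] - selected[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

With N utterances and K speakers this construction yields N x K trials per pass rather than a number of utterance pairs that grows quadratically in N, which is one plausible reading of the efficiency claim in the abstract.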

Pages: 6819-6823

Number of pages: 5
