OPTIMIZING NEURAL NETWORK EMBEDDINGS USING A PAIR-WISE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Dhamyal, Hira [1]

Zhou, Tianyan [1]

Raj, Bhiksha [1]

Singh, Rita [1]

Affiliation:

[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA

Source:

2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019

Keywords:

quartet loss; embeddings; neural networks; speaker verification; discriminant analysis

DOI:

10.1109/asru46091.2019.9003794

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a new loss function, the "quartet" loss, for better optimization of neural networks on matching tasks. For such tasks, where neural network embeddings are the key component, optimizing the network to produce better embeddings is critical. The embeddings must be class-discriminative, with minimal intra-class variation and maximal inter-class variation, even for unseen classes, so that the network generalizes well. The quartet loss explicitly computes the distance metric between pairs of inputs and widens the gap between the similarity score distribution of same-class pairs and that of different-class pairs. We evaluate on the speaker verification task and demonstrate the performance of the loss on our proposed neural network.
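The abstract does not give the quartet loss formula, so the following is only a minimal sketch of the general idea it describes: score pairs of embeddings and penalize any overlap between same-class and different-class score distributions. The function name `pairwise_gap_loss`, the use of cosine similarity, the hinge form, and the `margin` value are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def pairwise_gap_loss(embeddings, labels, margin=0.5):
    """Illustrative pair-wise loss (NOT the paper's quartet loss).

    Penalizes every (same-class pair, different-class pair) combination
    in which the same-class similarity does not exceed the
    different-class similarity by at least `margin` (assumed value).
    Assumes the batch contains at least one pair of each kind.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # L2-normalize so the dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    # Take each unordered pair once (upper triangle, diagonal excluded).
    i, j = np.triu_indices(len(labels), k=1)
    same = labels[i] == labels[j]
    same_scores = sim[i, j][same]
    diff_scores = sim[i, j][~same]
    # Hinge on the gap between the two score sets, via broadcasting.
    gaps = margin - same_scores[:, None] + diff_scores[None, :]
    return float(np.maximum(gaps, 0.0).mean())

# Toy embeddings: two tight clusters in 2-D.
points = [[1, 0], [1, 0.01], [0, 1], [0.01, 1]]
print(pairwise_gap_loss(points, [0, 0, 1, 1]))  # labels agree with clusters
print(pairwise_gap_loss(points, [0, 1, 0, 1]))  # labels cut across clusters
```

In this toy case the loss is zero when the labels match the clusters and positive when they do not, which is the behavior a gap-widening objective is meant to reward.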


Pages: 742-748

Page count: 7
