OPTIMIZING NEURAL NETWORK EMBEDDINGS USING A PAIR-WISE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0
Authors
Dhamyal, Hira [1]
Zhou, Tianyan [1]
Raj, Bhiksha [1]
Singh, Rita [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
quartet loss; embeddings; neural networks; speaker verification; discriminant analysis
DOI
10.1109/asru46091.2019.9003794
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new loss function, the "quartet" loss, for better optimization of neural networks for matching tasks. In such tasks, neural network embeddings are the key component, so optimizing the network to produce better embeddings is critical. The embeddings must be class-discriminative, with minimal intra-class variation and maximal inter-class variation, even for unseen classes, so that the network generalizes well. The quartet loss explicitly computes the distance metric between pairs of inputs and widens the gap between the similarity score distributions of same-class pairs and different-class pairs. We evaluate on the speaker verification task and demonstrate the performance of the loss on our proposed neural network.
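The pair-wise idea in the abstract can be illustrated with a small sketch. This is not the paper's quartet loss, whose exact formula is not given in this record. It is an assumed hinge-style stand-in: each "quartet" holds one same-class pair and one different-class pair of embeddings, cosine similarity serves as the pair score, and the loss penalizes quartets whose same-class score does not exceed the different-class score by a margin. The function names, the `margin` parameter, and the choice of cosine similarity are all hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n, d) arrays of embeddings."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return np.sum(a * b, axis=-1)


def pairwise_gap_loss(same_a, same_b, diff_a, diff_b, margin=0.5):
    """Illustrative hinge on the gap between pair scores (assumed form).

    Row i of (same_a, same_b) is a same-class pair; row i of
    (diff_a, diff_b) is a different-class pair. Together they form
    one quartet of embeddings.
    """
    s_same = cosine_similarity(same_a, same_b)
    s_diff = cosine_similarity(diff_a, diff_b)
    # Zero loss once the same-class score beats the different-class
    # score by at least `margin`; otherwise a linear penalty.
    return float(np.mean(np.maximum(0.0, margin - (s_same - s_diff))))


# Toy quartet: the same-class pair is identical, the other pair is orthogonal.
x = np.array([[1.0, 0.0]])
y = np.array([[0.0, 1.0]])
well_separated = pairwise_gap_loss(x, x, x, y)
badly_separated = pairwise_gap_loss(x, y, x, x)
```

In this toy case the well-separated quartet incurs no penalty, while the swapped quartet is penalized, which is the direction of pressure the abstract describes: same-class scores are pushed up and different-class scores are pushed down.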
Pages: 742-748
Number of pages: 7
Related papers
50 in total
  • [41] Neural network clustering technique for text-independent speaker identification
    Nossair, Zaki B.
    Zahorian, Stephen A.
    Artificial Neural Networks in Engineering - Proceedings (ANNIE'94), 1994, 4 : 453 - 459
  • [42] Pseudo speaker models for text-independent speaker verification using rank threshold
    Chiba University, Chiba, Japan
    NLP-KE - Proc. Int. Conf. Nat. Lang. Process. Knowl. Eng., (265-268):
  • [43] Deep Speaker Feature Learning for Text-independent Speaker Verification
    Li, Lantian
    Chen, Yixiang
    Shi, Ying
    Tang, Zhiyuan
    Wang, Dong
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1542 - 1546
  • [44] A Survey on Text-Dependent and Text-Independent Speaker Verification
    Tu, Youzhi
    Lin, Weiwei
    Mak, Man-Wai
    IEEE ACCESS, 2022, 10 : 99038 - 99049
  • [45] BOUNDARY DISCRIMINATIVE LARGE MARGIN COSINE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Li, Rongjin
    Li, Na
    Tuo, Deyi
    Yu, Meng
    Su, Dan
    Yu, Dong
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6321 - 6325
  • [46] Text-Independent Speaker Verification Using Lightweight 3D Convolutional Neural Networks
    Chen, Jyun-Yan
    Jeng, Jin-Tsong
    2024 INTERNATIONAL CONFERENCE ON SYSTEM SCIENCE AND ENGINEERING, ICSSE 2024, 2024,
  • [47] Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification
    Qu, Xiaoyang
    Wang, Jianzong
    Xiao, Jing
    INTERSPEECH 2020, 2020, : 961 - 965
  • [48] Generalized locally recurrent probabilistic neural networks for text-independent speaker verification
    Ganchev, T
    Fakotakis, N
    Tasoulis, DK
    Vrahatis, MN
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 41 - 44
  • [49] RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification
    Jung, Jee-weon
    Heo, Hee-Soo
    Kim, Ju-ho
    Shim, Hye-jin
    Yu, Ha-Jin
    INTERSPEECH 2019, 2019, : 1268 - 1272
  • [50] Text-Independent Speaker Verification Using Rank Threshold in Large Number of Speaker Models
    Okamoto, Haruka
    Tsuge, Satoru
    Abdelwahab, Amira
    Nishida, Masafumi
    Horiuchi, Yasuo
    Kuroiwa, Shingo
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2319 - +