OPTIMIZING NEURAL NETWORK EMBEDDINGS USING A PAIR-WISE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0
Authors
Dhamyal, Hira [1]
Zhou, Tianyan [1]
Raj, Bhiksha [1]
Singh, Rita [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
quartet loss; embeddings; neural-networks; speaker verification; DISCRIMINANT-ANALYSIS;
DOI
10.1109/asru46091.2019.9003794
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new loss function, the "quartet" loss, for better optimization of neural networks on matching tasks. In such tasks, neural network embeddings are the key component, so optimizing the network for better embeddings is critical. The embeddings must be class discriminative, with minimal intra-class variation and maximal inter-class variation, even for unseen classes, so that the network generalizes well. The quartet loss explicitly computes a distance metric between pairs of inputs and widens the gap between the similarity score distributions of same-class pairs and different-class pairs. We evaluate on the speaker verification task and demonstrate the performance of the loss on our proposed neural network.
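The abstract does not give the quartet loss formula, so the following is only an illustrative sketch of the general idea it describes: score every pair of embeddings, split the scores into same-class and different-class groups, and penalize a small gap between the two groups. The function name `pairwise_gap_loss`, the use of cosine similarity, the mean-based gap, and the `margin` parameter are all assumptions made for illustration. This is not the paper's formulation.

```python
import numpy as np


def pairwise_gap_loss(embeddings, labels, margin=0.5):
    """Hypothetical pair-wise loss: hinge on the gap between the mean
    same-class and mean different-class cosine similarity.

    Assumes the batch contains at least one same-class pair and at
    least one different-class pair.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # L2-normalise rows so that a dot product equals cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T

    # Upper triangle without the diagonal: each unordered pair once.
    i, j = np.triu_indices(len(labels), k=1)
    pair_scores = sim[i, j]
    same_mask = labels[i] == labels[j]

    same_scores = pair_scores[same_mask]
    diff_scores = pair_scores[~same_mask]

    # The loss is zero once same-class pairs score at least `margin`
    # higher on average than different-class pairs.
    gap = same_scores.mean() - diff_scores.mean()
    loss = max(0.0, margin - gap)
    return loss, same_scores, diff_scores


if __name__ == "__main__":
    # Toy batch: two "speakers" with two utterance embeddings each.
    toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    toy_labels = [0, 0, 1, 1]
    loss, same, diff = pairwise_gap_loss(toy_embeddings, toy_labels)
    print(loss, same, diff)
```

In a training setting the same computation would be written in an automatic-differentiation framework so that the gap can be optimized by gradient descent; the NumPy version only shows how the two score distributions are formed and compared.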
Pages: 742-748
Page count: 7