Speaker Identification using Triplet Loss Function Combined with Clustering Techniques

Cited by: 0
Authors
Shalaby, Mohamed [1 ]
Hassan, Mohamed [1 ]
Omar, Yasser M. K. [1 ]
Affiliations
[1] Arab Acad Sci & Technol, Dept Comp Sci, Cairo, Egypt
Keywords
Neural network; Speech recognition; Triplet loss function; Recognition
DOI
10.1109/ITMS52826.2021.9615342
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Speaker identification plays a critical role in many applications, such as robotics, and especially in applications that focus on humanoid robots. It involves comparing unknown utterances against pre-stored utterances of known speakers. In general, encoded features are extracted and stored for each speaker in a database of known speakers, and 1:N comparisons are performed between the encoded features of the unknown utterance and those of the N pre-stored speakers. Different techniques can be used for these comparisons, of which cosine similarity is the most widely used. However, the larger the number of pre-stored speakers, the longer the model needs to complete these comparisons, so the approach may not be suitable for real-time applications. In this paper, we combined a previously published Triple Neural Network for speaker identification with clustering techniques applied to the speaker dataset. We employed different clustering techniques and presented two different methods for comparing unknown utterances against pre-stored utterances. The results showed a significant reduction in comparison time at the cost of a small reduction in accuracy. The proposed approach provides a framework for trading off execution time against accuracy.
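The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithms or the two comparison methods used in the paper, so everything below is an assumption made for illustration: the embeddings are random stand-ins for the encoded features, the clustering is a plain spherical k-means, and the function names (`brute_force_identify`, `build_clusters`, `clustered_identify`) are hypothetical. The sketch contrasts the baseline 1:N cosine search with a two-stage search that first picks the nearest cluster centroid and then compares only against that cluster's members.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def brute_force_identify(query, enrolled):
    """Baseline 1:N search: compare the query against every enrolled speaker."""
    sims = normalize(enrolled) @ normalize(query)
    return int(np.argmax(sims))

def build_clusters(enrolled, k, iters=20, seed=0):
    """Spherical k-means on the embeddings (an assumed clustering choice)."""
    rng = np.random.default_rng(seed)
    x = normalize(enrolled)
    # Start from k distinct enrolled embeddings.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(x @ centroids.T, axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members):
                centroids[c] = normalize(members.mean(axis=0))
    # Final assignment against the final centroids.
    labels = np.argmax(x @ centroids.T, axis=1)
    return centroids, labels

def clustered_identify(query, enrolled, centroids, labels):
    """Two-stage search: nearest centroid first, then only that cluster's members."""
    q = normalize(query)
    nearest = int(np.argmax(centroids @ q))
    candidates = np.flatnonzero(labels == nearest)
    if candidates.size == 0:
        # Empty cluster: fall back to the full 1:N comparison.
        return brute_force_identify(query, enrolled)
    sims = normalize(enrolled[candidates]) @ q
    return int(candidates[np.argmax(sims)])

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    enrolled = rng.normal(size=(1000, 64))            # stand-in speaker embeddings
    query = enrolled[123] + 0.05 * rng.normal(size=64)  # noisy utterance of speaker 123
    centroids, labels = build_clusters(enrolled, k=16)
    print("brute force:", brute_force_identify(query, enrolled))
    print("clustered:  ", clustered_identify(query, enrolled, centroids, labels))
```

With N enrolled speakers split into k clusters, the two-stage search performs roughly k + N/k comparisons instead of N. The price is that a query falling near a cluster boundary can be routed to the wrong cluster and miss its true match, which is one plausible source of the accuracy loss the abstract reports.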
Pages: 5