Speaker Identification using Triplet Loss Function Combined with Clustering Techniques

Cited by: 0

Authors:

Shalaby, Mohamed [1]

Hassan, Mohamed [1]

Omar, Yasser M. K. [1]

Affiliation:

[1] Arab Acad Sci & Technol, Dept Comp Sci, Cairo, Egypt

Source:

2021 62ND INTERNATIONAL SCIENTIFIC CONFERENCE ON INFORMATION TECHNOLOGY AND MANAGEMENT SCIENCE OF RIGA TECHNICAL UNIVERSITY (ITMS) | 2021

Keywords:

Neural network; Speech recognition; triplet loss function; RECOGNITION;

DOI:

10.1109/ITMS52826.2021.9615342

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Speaker identification plays a critical role in many applications, such as robotics, and especially in applications that focus on humanoid robotics. Speaker identification involves comparing unknown utterances against pre-stored utterances of known speakers. In general, encoded features are stored for each speaker in the known-speaker database, and 1:N comparisons are performed between the encoded features extracted from an unknown utterance and those of the N pre-stored known speakers. Different techniques can be used for these comparisons, of which cosine similarity is the most widely used. However, the larger the number of pre-stored known speakers, the longer the model needs to finish these comparisons, and hence the approach may not be suitable for real-time applications. In this paper, we combined a previously published Triple Neural Network for speaker identification with clustering techniques applied to the speakers dataset. We employed different clustering techniques and presented two different methods for comparing unknown utterances against pre-stored utterances. The results showed a significant improvement in comparison time at the cost of a small reduction in accuracy. The proposed approach provides a framework that can represent a trade-off between execution time and accuracy.
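
The idea in the abstract, replacing an exhaustive 1:N cosine-similarity search with a search restricted to one cluster of enrolled speakers, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the two comparison methods or the clustering algorithms, so the function names, the toy two-dimensional embeddings, the hand-assigned clusters, and the "nearest centroid, then search inside that cluster" strategy below are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_exhaustive(query, enrolled):
    """Baseline 1:N search: compare the query with every enrolled speaker."""
    return max(enrolled, key=lambda name: cosine_similarity(query, enrolled[name]))


def identify_clustered(query, enrolled, centroids, members):
    """Assumed two-stage search: pick the most similar cluster centroid,
    then compare the query only with the speakers in that cluster."""
    best_cluster = max(
        centroids, key=lambda c: cosine_similarity(query, centroids[c])
    )
    candidates = members[best_cluster]
    return max(candidates, key=lambda name: cosine_similarity(query, enrolled[name]))


# Hypothetical enrolled embeddings (real embeddings would come from the
# trained network and have many more dimensions).
enrolled = {
    "spk_a": [1.0, 0.1],
    "spk_b": [0.9, 0.3],
    "spk_c": [0.1, 1.0],
    "spk_d": [0.3, 0.9],
}

# Hypothetical cluster assignment, fixed by hand here; in practice it would
# be produced by a clustering algorithm run over the enrolled embeddings.
members = {0: ["spk_a", "spk_b"], 1: ["spk_c", "spk_d"]}

# Each centroid is the mean of the embeddings assigned to that cluster.
centroids = {
    c: [sum(enrolled[n][i] for n in names) / len(names) for i in range(2)]
    for c, names in members.items()
}

query = [0.2, 1.0]
exhaustive_match = identify_exhaustive(query, enrolled)
clustered_match = identify_clustered(query, enrolled, centroids, members)
print(exhaustive_match, clustered_match)
```

With K clusters of roughly equal size, the clustered search performs about K + N/K similarity computations instead of N, which is where the time saving comes from. The accuracy loss mentioned in the abstract is consistent with the fact that a query routed to the wrong cluster can never be matched to the correct speaker.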


Pages: 5
