A ROBUST SPEAKER CLUSTERING METHOD BASED ON DISCRETE TIED VARIATIONAL AUTOENCODER

Cited by: 0

Authors:

Feng, Chen [1]

Wang, Jianzong [1]

Li, Tongxu [1]

Peng, Junqing [1]

Xiao, Jing [1]

Affiliation:

[1] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

speaker clustering; tied variational autoencoder; mutual information; aggregation hierarchy cluster;

DOI:

10.1109/icassp40776.2020.9053488

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Speaker clustering based on agglomerative hierarchical clustering (AHC) has recently become a common way to address two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. In general, such a model takes features such as i-vectors as input to a probabilistic linear discriminant analysis (PLDA) model, which forms the distance matrix in long-voice application scenarios, and the clustering model then produces the clustering results. However, the traditional AHC-based speaker clustering method has a long running time and remains sensitive to environmental noise. In this paper, we propose a novel speaker clustering method based on mutual information (MI) and a non-linear model with a discrete variable, inspired by the Tied Variational Autoencoder (TVAE), to enhance robustness against noise. The proposed method, named Discrete Tied Variational Autoencoder (DTVAE), shortens the elapsed time substantially. Experimental results show that it outperforms the general model, yielding a relative accuracy (ACC) improvement and a significant time reduction.
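The baseline pipeline that the abstract describes (a pairwise distance matrix followed by AHC, stopped either at a fixed number of clusters or by a threshold when the number is unknown) can be illustrated with a minimal sketch. This is not the authors' DTVAE method or their PLDA back end: cosine distance stands in for PLDA scoring, average linkage is an assumed merge rule, and the function names, toy embeddings, and threshold value are all hypothetical.

```python
from itertools import combinations
import math


def cosine_distance(u, v):
    """Cosine distance; an assumed stand-in for PLDA scoring."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ahc(vectors, n_clusters=None, threshold=None):
    """Average-linkage AHC over a pairwise distance matrix.

    Pass n_clusters to stop at a fixed number of clusters, or threshold
    to stop once the closest pair of clusters is farther apart than it.
    """
    if (n_clusters is None) == (threshold is None):
        raise ValueError("pass exactly one of n_clusters or threshold")
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # every utterance starts alone

    def linkage(a, b):
        # Average distance over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        # Find the closest pair of clusters (p < q by construction).
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        if threshold is not None and linkage(clusters[p], clusters[q]) > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D "embeddings" (hypothetical): two utterances per speaker.
embeddings = [
    (1.0, 0.05), (0.98, 0.0),
    (0.0, 1.0), (0.05, 0.97),
    (-1.0, 0.02), (-0.97, -0.03),
]
fixed = ahc(embeddings, n_clusters=3)       # fixed category number
open_set = ahc(embeddings, threshold=0.5)   # no preset category number
```

Each merge step in this sketch rescans every pair of clusters, which makes the naive procedure slow on long recordings; that cost is the kind of running-time drawback the abstract attributes to traditional AHC-based clustering.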


Pages: 6024-6028

Page count: 5
