A ROBUST SPEAKER CLUSTERING METHOD BASED ON DISCRETE TIED VARIATIONAL AUTOENCODER

Cited by: 0

Authors:

Feng, Chen [1]

Wang, Jianzong [1]

Li, Tongxu [1]

Peng, Junqing [1]

Xiao, Jing [1]

Affiliation:

[1] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

speaker clustering; tied variational autoencoder; mutual information; aggregation hierarchy cluster

DOI:

10.1109/icassp40776.2020.9053488

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Speaker clustering based on agglomerative hierarchical clustering (AHC) is a common approach to two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. In long-speech application scenarios, the model typically takes features such as i-vectors as input to a probabilistic linear discriminant analysis (PLDA) model to form a distance matrix, and the clustering results are then obtained from the clustering model. However, traditional AHC-based speaker clustering runs slowly and remains sensitive to environmental noise. In this paper, we propose a novel speaker clustering method based on mutual information (MI) and a non-linear model with a discrete variable, inspired by the Tied Variational Autoencoder (TVAE), to enhance robustness against noise. The proposed method, named Discrete Tied Variational Autoencoder (DTVAE), substantially shortens the elapsed time. Experimental results show that it outperforms the general model, yielding a relative accuracy (ACC) improvement and a significant reduction in running time.
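
The two AHC settings named in the abstract (a fixed number of categories versus no preset number) can be illustrated with a minimal sketch. This is not the authors' DTVAE method or their PLDA pipeline: it assumes plain average-linkage clustering, swaps in cosine distance on toy 2-D vectors for PLDA scores on i-vectors, and the names `ahc`, `cosine_distance`, `toy`, and the threshold value are hypothetical choices made only for illustration.

```python
import math


def cosine_distance(u, v):
    """Cosine distance; a stand-in for PLDA scoring (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ahc(vectors, n_clusters=None, threshold=None):
    """Average-linkage agglomerative clustering with two stopping modes.

    n_clusters: stop once this many clusters remain (fixed-number mode).
    threshold:  stop once the closest pair of clusters is farther apart
                than this distance (no preset number of clusters).
    """
    if (n_clusters is None) == (threshold is None):
        raise ValueError("set exactly one of n_clusters or threshold")

    n = len(vectors)
    # Pairwise distance matrix between all input vectors.
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per item

    def linkage(a, b):
        # Average distance over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        # Find the closest pair of clusters.
        d, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if threshold is not None and d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy "embeddings": items 0, 1, 4 point one way, items 2, 3 another.
toy = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
fixed_labels = ahc(toy, n_clusters=2)   # fixed number of categories
open_labels = ahc(toy, threshold=0.3)   # no preset number of categories
```

In the threshold mode the number of clusters falls out of the data, so the threshold has to be tuned for the distance in use; in the fixed-number mode merging simply continues until the requested count is reached.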

Pages: 6024-6028

Number of pages: 5
