A ROBUST SPEAKER CLUSTERING METHOD BASED ON DISCRETE TIED VARIATIONAL AUTOENCODER

Times cited: 0
Authors
Feng, Chen [1]
Wang, Jianzong [1]
Li, Tongxu [1]
Peng, Junqing [1]
Xiao, Jing [1]
Affiliations
[1] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong, People's Republic of China
Keywords
speaker clustering; tied variational autoencoder; mutual information; aggregation hierarchy cluster
DOI
10.1109/icassp40776.2020.9053488
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speaker clustering models based on agglomerative hierarchical clustering (AHC) are a common way to address two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. In long-voice application scenarios, such a model typically feeds features such as i-vectors into a probabilistic linear discriminant analysis (PLDA) model to form the distance matrix, and the clustering model then produces the clustering results from that matrix. However, traditional AHC-based speaker clustering has a long running time and remains sensitive to environmental noise. In this paper, we propose a novel speaker clustering method based on Mutual Information (MI) and a non-linear model with a discrete variable, inspired by the Tied Variational Autoencoder (TVAE), to enhance robustness against noise. The proposed method, named Discrete Tied Variational Autoencoder (DTVAE), also shortens the elapsed time substantially. Experimental results show that it outperforms the general model, yielding a relative Accuracy (ACC) improvement and a significant time reduction.
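To make the AHC baseline described in the abstract concrete, the following is a minimal sketch of average-linkage agglomerative clustering over a precomputed distance matrix, covering both stopping rules (a fixed number of clusters, or a distance threshold when the number is not preset). It is not the proposed DTVAE, and it is not taken from the paper: the cosine distance stands in for PLDA scoring, the two-dimensional vectors stand in for i-vectors, and the names `cosine_distance`, `ahc`, `n_clusters`, and `threshold` are all assumptions made for illustration.

```python
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance; an assumed stand-in for a PLDA-based score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return 1.0 - dot / (norm_u * norm_v)


def ahc(dist, n_clusters=None, threshold=None):
    """Average-linkage AHC on a square distance matrix.

    Stops when `n_clusters` clusters remain (fixed category number), or
    when the closest pair of clusters is farther apart than `threshold`
    (no preset category number). Exactly one of the two must be given.
    """
    if (n_clusters is None) == (threshold is None):
        raise ValueError("set exactly one of n_clusters or threshold")

    # Start with every item in its own cluster.
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        # Find the closest pair of clusters (x < y by construction).
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if threshold is not None and linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    # Convert the cluster membership lists into one label per item.
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy "speaker embeddings": items 0, 1, 4 point one way, items 2, 3 another.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

fixed = ahc(dist, n_clusters=2)       # fixed category number
open_ended = ahc(dist, threshold=0.1)  # no preset category number
print(fixed, open_ended)
```

Each merge step rescans all cluster pairs, so this naive version scales poorly with the number of segments, which is in line with the running-time concern the abstract raises about AHC-based pipelines.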
Pages: 6024-6028
Page count: 5