A ROBUST SPEAKER CLUSTERING METHOD BASED ON DISCRETE TIED VARIATIONAL AUTOENCODER

Cited by: 0
Authors
Feng, Chen [1]
Wang, Jianzong [1]
Li, Tongxu [1]
Peng, Junqing [1]
Xiao, Jing [1]
Affiliations
[1] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, Guangdong, People's Republic of China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Keywords
speaker clustering; tied variational autoencoder; mutual information; aggregation hierarchy cluster;
DOI
10.1109/icassp40776.2020.9053488
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Speaker clustering based on agglomerative hierarchical clustering (AHC) is a common approach to two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. Typically, features such as i-vectors are fed to a probabilistic linear discriminant analysis (PLDA) model to form the distance matrix in long-utterance application scenarios, and the clustering results are then obtained from the clustering model. However, traditional AHC-based speaker clustering has long running times and remains sensitive to environmental noise. In this paper, we propose a novel speaker clustering method based on mutual information (MI) and a non-linear model with discrete variables, inspired by the Tied Variational Autoencoder (TVAE), to enhance robustness against noise. The proposed method, named Discrete Tied Variational Autoencoder (DTVAE), substantially shortens the elapsed time. Experimental results show that it outperforms the general model, yielding a relative accuracy (ACC) improvement and a significant reduction in running time.
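The baseline AHC pipeline that the abstract describes can be illustrated with a minimal sketch. This is not the authors' DTVAE method or their PLDA scoring: cosine distance is assumed here as a stand-in for a PLDA-derived distance matrix, the embeddings are synthetic toy data rather than real i-vectors, and the function name `ahc_labels` is hypothetical. It shows the two clustering modes mentioned above, a fixed number of categories and no preset number (a distance threshold).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def ahc_labels(embeddings, n_speakers=None, threshold=None):
    """Cluster utterance embeddings with average-linkage AHC.

    Pass n_speakers for the fixed-count case, or threshold for the
    case where the number of speakers is not known in advance.
    """
    if (n_speakers is None) == (threshold is None):
        raise ValueError("give exactly one of n_speakers or threshold")
    # Assumption: cosine distance stands in for a PLDA-derived distance.
    condensed = pdist(embeddings, metric="cosine")
    tree = linkage(condensed, method="average")
    if n_speakers is not None:
        # Fixed category number: cut the tree into at most n_speakers clusters.
        return fcluster(tree, t=n_speakers, criterion="maxclust")
    # No preset category number: cut the tree at a distance threshold.
    return fcluster(tree, t=threshold, criterion="distance")


rng = np.random.default_rng(0)
# Toy stand-in for i-vectors: three well-separated synthetic "speakers".
centers = np.eye(3, 16) * 10.0
embeddings = np.vstack(
    [c + 0.1 * rng.standard_normal((5, 16)) for c in centers]
)

fixed = ahc_labels(embeddings, n_speakers=3)
free = ahc_labels(embeddings, threshold=0.5)
```

In this sketch both modes share one linkage tree and differ only in how the tree is cut, which is why the threshold value (0.5 here, chosen arbitrarily for the toy data) matters when the speaker count is unknown.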
Pages: 6024-6028
Number of pages: 5