Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport

Cited by: 0
Authors
Zhang, Ruiteng [1]
Wei, Jianguo [1,2]
Lu, Xugang [3]
Lu, Wenhuan [1]
Jin, Di [1]
Zhang, Lin [4]
Xu, Junhai [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[2] Qinghai Nationalities Univ, Comp Coll, Xining 810007, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
[4] Brno Univ Technol, Brno 61266, Czech Republic
Funding
National Natural Science Foundation of China
Keywords
Speaker recognition; unsupervised domain adaptation; optimal transport; coupling regularization; domain adaptation; neural networks
DOI
10.1109/TASLP.2024.3426934
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Cross-domain speaker recognition (SR) can be improved by unsupervised domain adaptation (UDA) algorithms. UDA algorithms often reduce domain mismatch at the cost of decreasing the discrimination of speaker features. In contrast, optimal transport (OT) has the potential to achieve domain alignment while preserving the speaker discrimination capability in UDA applications; however, naively applying OT to measure global probability distribution discrepancies between the source and target domains may induce negative transports where samples belonging to different speakers are coupled in transportation. These negative transports reduce the SR model's discriminative power, degrading the SR performance. This paper proposes a coupling-regularized optimal transport (CROT) algorithm for cross-domain SR to reduce the negative transport during UDA. In the proposed CROT, two consecutive processing modules regularize the coupling paths for the OT solution: a progressive inter-speaker constraint (PISC) module and a coupling-smoothed regularization (CSR) module. The PISC, designed as a pseudo-label memory bank with curriculum learning, is first applied to select valid samples to guarantee that coupling samples are from the same speaker. The CSR, designed to control the information entropy of the coupling paths further, reduces the effect of negative transport in UDA. To evaluate the effectiveness of the proposed algorithm, cross-domain SR experiments were conducted under different target domains, speaker encoders, corpora, and acoustic features. Experimental results showed that CROT achieved a 50% relative reduction in equal error rates compared to conventional OT-based UDAs, outperforming the state-of-the-art UDAs.
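
The "negative transport" effect described in the abstract can be illustrated with a small toy sketch. The code below is not the CROT algorithm: it is a generic entropy-regularized OT solver (Sinkhorn iterations) on made-up one-dimensional features, with a simple cost penalty on pairs whose labels disagree standing in for a same-speaker constraint. The function names, the toy data, the `penalty` value, the `eps` weight, and the assumption that target pseudo-labels are already available are all illustrative choices, not details taken from the paper.

```python
import math

def sinkhorn(cost, a, b, eps=0.1, n_iter=500):
    """Entropy-regularized OT via Sinkhorn iterations; returns the coupling matrix."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

def negative_mass(coupling, src_labels, tgt_labels):
    """Total mass transported between samples whose speaker labels differ."""
    return sum(
        coupling[i][j]
        for i in range(len(src_labels))
        for j in range(len(tgt_labels))
        if src_labels[i] != tgt_labels[j]
    )

# Hypothetical toy data: two speakers, target features shifted by +0.5.
src = [0.0, 0.2, 1.0, 1.2]
src_labels = [0, 0, 1, 1]
tgt = [0.5, 0.7, 1.5, 1.7]
tgt_pseudo = [0, 0, 1, 1]  # assumed to be given; the paper's PISC module is not modeled
a = [0.25] * 4  # uniform source marginal
b = [0.25] * 4  # uniform target marginal

# Squared-distance cost, and a variant that penalizes cross-speaker pairs.
cost = [[(s - t) ** 2 for t in tgt] for s in src]
penalty = 10.0  # illustrative value
masked_cost = [
    [cost[i][j] + (penalty if src_labels[i] != tgt_pseudo[j] else 0.0)
     for j in range(len(tgt))]
    for i in range(len(src))
]

plain = sinkhorn(cost, a, b)
masked = sinkhorn(masked_cost, a, b)
neg_plain = negative_mass(plain, src_labels, tgt_pseudo)
neg_masked = negative_mass(masked, src_labels, tgt_pseudo)
print("cross-speaker mass, plain OT:    ", neg_plain)
print("cross-speaker mass, penalized OT:", neg_masked)
```

In this toy setting the plain coupling sends part of its mass between different speakers because the domain shift moves some target samples closer to the wrong source cluster, while the penalized cost drives that cross-speaker mass toward zero.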
Pages: 3603-3617
Page count: 15