Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport

Cited by: 0
Authors
Zhang, Ruiteng [1]
Wei, Jianguo [1, 2]
Lu, Xugang [3]
Lu, Wenhuan [1]
Jin, Di [1]
Zhang, Lin [4]
Xu, Junhai [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[2] Qinghai Nationalities Univ, Comp Coll, Xining 810007, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
[4] Brno Univ Technol, Brno 61266, Czech Republic
Funding
National Natural Science Foundation of China
Keywords
Speaker recognition; unsupervised domain adaptation; optimal transport; coupling regularization; domain adaptation; neural networks
DOI
10.1109/TASLP.2024.3426934
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Cross-domain speaker recognition (SR) can be improved by unsupervised domain adaptation (UDA) algorithms. UDA algorithms often reduce domain mismatch at the cost of decreasing the discrimination of speaker features. In contrast, optimal transport (OT) has the potential to achieve domain alignment while preserving the speaker discrimination capability in UDA applications; however, naively applying OT to measure global probability distribution discrepancies between the source and target domains may induce negative transports where samples belonging to different speakers are coupled in transportation. These negative transports reduce the SR model's discriminative power, degrading the SR performance. This paper proposes a coupling-regularized optimal transport (CROT) algorithm for cross-domain SR to reduce the negative transport during UDA. In the proposed CROT, two consecutive processing modules regularize the coupling paths for the OT solution: a progressive inter-speaker constraint (PISC) module and a coupling-smoothed regularization (CSR) module. The PISC, designed as a pseudo-label memory bank with curriculum learning, is first applied to select valid samples to guarantee that coupling samples are from the same speaker. The CSR, designed to control the information entropy of the coupling paths further, reduces the effect of negative transport in UDA. To evaluate the effectiveness of the proposed algorithm, cross-domain SR experiments were conducted under different target domains, speaker encoders, corpora, and acoustic features. Experimental results showed that CROT achieved a 50% relative reduction in equal error rates compared to conventional OT-based UDAs, outperforming the state-of-the-art UDAs.
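The idea of restricting couplings to same-speaker pairs and controlling the entropy of the coupling can be illustrated with a small sketch. This is not the authors' CROT implementation: it uses plain Sinkhorn iterations for entropy-regularized OT, a hypothetical cost penalty on label mismatches as a stand-in for the PISC sample selection, and the regularization weight `eps` as a stand-in for entropy control. All function names, the toy features, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.5, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale to match column marginals
        u = a / (K @ v)                  # rescale to match row marginals
    return u[:, None] * K * v[None, :]

def label_penalized_cost(src, tgt, src_labels, tgt_pseudo, penalty=20.0):
    """Normalized squared Euclidean cost plus a penalty on label-mismatched pairs.

    The penalty is a hypothetical device to discourage "negative transport"
    (coupling samples from different speakers); it is not taken from the paper.
    """
    diff = src[:, None, :] - tgt[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    cost = cost / cost.max()             # scale costs into [0, 1]
    mismatch = src_labels[:, None] != tgt_pseudo[None, :]
    return cost + penalty * mismatch, mismatch

def coupling_entropy(plan):
    """Shannon entropy of the coupling matrix."""
    p = plan[plan > 0]
    return float(-(p * np.log(p)).sum())

# Toy data: 4 source embeddings with speaker labels, 4 shifted target
# embeddings with pseudo-labels (all values are made up for illustration).
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tgt = src + np.array([0.2, 0.1])
src_labels = np.array([0, 0, 1, 1])
tgt_pseudo = np.array([0, 0, 1, 1])
a = np.full(4, 0.25)                     # uniform source weights
b = np.full(4, 0.25)                     # uniform target weights

cost, mismatch = label_penalized_cost(src, tgt, src_labels, tgt_pseudo)
plan = sinkhorn_plan(cost, a, b, eps=0.5)
negative_mass = plan[mismatch].sum()     # mass coupled across different speakers
print("mass on mismatched pairs:", negative_mass)
print("coupling entropy:", coupling_entropy(plan))
```

In this sketch a larger `eps` yields a more diffuse, higher-entropy coupling, while the mismatch penalty keeps almost all transported mass on pairs that share a (pseudo-)label.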
Pages: 3603-3617
Number of pages: 15