Unsupervised Adaptive Speaker Recognition by Coupling-Regularized Optimal Transport

Cited by: 0
Authors
Zhang, Ruiteng [1]
Wei, Jianguo [1, 2]
Lu, Xugang [3]
Lu, Wenhuan [1]
Jin, Di [1]
Zhang, Lin [4]
Xu, Junhai [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[2] Qinghai Nationalities Univ, Comp Coll, Xining 810007, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
[4] Brno Univ Technol, Brno 61266, Czech Republic
Funding
National Natural Science Foundation of China;
Keywords
Speaker recognition; unsupervised domain adaptation; optimal transport; coupling regularization; DOMAIN ADAPTATION; NEURAL-NETWORKS;
DOI
10.1109/TASLP.2024.3426934
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Cross-domain speaker recognition (SR) can be improved by unsupervised domain adaptation (UDA) algorithms. UDA algorithms often reduce domain mismatch at the cost of decreasing the discrimination of speaker features. In contrast, optimal transport (OT) has the potential to achieve domain alignment while preserving the speaker discrimination capability in UDA applications; however, naively applying OT to measure global probability distribution discrepancies between the source and target domains may induce negative transports where samples belonging to different speakers are coupled in transportation. These negative transports reduce the SR model's discriminative power, degrading the SR performance. This paper proposes a coupling-regularized optimal transport (CROT) algorithm for cross-domain SR to reduce the negative transport during UDA. In the proposed CROT, two consecutive processing modules regularize the coupling paths for the OT solution: a progressive inter-speaker constraint (PISC) module and a coupling-smoothed regularization (CSR) module. The PISC, designed as a pseudo-label memory bank with curriculum learning, is first applied to select valid samples to guarantee that coupling samples are from the same speaker. The CSR, designed to control the information entropy of the coupling paths further, reduces the effect of negative transport in UDA. To evaluate the effectiveness of the proposed algorithm, cross-domain SR experiments were conducted under different target domains, speaker encoders, corpora, and acoustic features. Experimental results showed that CROT achieved a 50% relative reduction in equal error rates compared to conventional OT-based UDAs, outperforming the state-of-the-art UDAs.
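The abstract's notion of negative transport can be illustrated with a generic entropy-regularized OT (Sinkhorn) sketch. This is not the authors' CROT implementation: the toy data, the function names, the fixed penalty of 1.0 on pairs whose source label and target pseudo-label disagree (a crude stand-in for the PISC sample selection), and the entropy weight `eps` (a stand-in for controlling the entropy of the coupling) are all assumptions made for illustration only.

```python
import numpy as np


def pairwise_cost(src, tgt):
    """Squared Euclidean cost between all source/target pairs, scaled to [0, 1]."""
    diff = src[:, None, :] - tgt[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    return cost / cost.max()


def sinkhorn(cost, a, b, eps=0.05, n_iter=500):
    """Entropy-regularized OT plan via plain Sinkhorn scaling iterations."""
    kernel = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)  # rescale to match the target marginal
        u = a / (kernel @ v)    # rescale to match the source marginal
    return u[:, None] * kernel * v[None, :]


def coupling_entropy(plan):
    """Shannon entropy of the coupling; larger means mass is more spread out."""
    p = plan[plan > 0]
    return float(-(p * np.log(p)).sum())


def make_toy_domains(seed=0, per_speaker=3, dim=4):
    """Hypothetical toy data: two speakers, target domain shifted by 0.5."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(2), per_speaker)
    centers = np.array([[0.0] * dim, [3.0] * dim])
    src = centers[labels] + 0.3 * rng.normal(size=(labels.size, dim))
    tgt = centers[labels] + 0.5 + 0.3 * rng.normal(size=(labels.size, dim))
    # Assumption: the target pseudo-labels happen to be correct in this toy case.
    return src, tgt, labels, labels.copy()


src, tgt, src_labels, tgt_pseudo = make_toy_domains()
a = np.full(len(src), 1.0 / len(src))  # uniform source marginal
b = np.full(len(tgt), 1.0 / len(tgt))  # uniform target marginal

cost = pairwise_cost(src, tgt)
# True where a source sample and a target sample carry different speaker labels.
mismatch = src_labels[:, None] != tgt_pseudo[None, :]

plan_plain = sinkhorn(cost, a, b)
# Assumed penalty of 1.0 discourages coupling samples of different speakers.
plan_masked = sinkhorn(cost + 1.0 * mismatch, a, b)

print("cross-speaker mass, plain OT: ", plan_plain[mismatch].sum())
print("cross-speaker mass, masked OT:", plan_masked[mismatch].sum())
print("coupling entropy, masked OT:  ", coupling_entropy(plan_masked))
```

The mass that a plan places on `mismatch` entries is one way to quantify negative transport in this toy setting; raising `eps` spreads the coupling more evenly and increases its entropy, while lowering it concentrates mass on the cheapest pairs.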
Pages: 3603-3617
Number of pages: 15
Related papers
50 in total
  • [41] A Simple Unsupervised Knowledge-Free Domain Adaptation for Speaker Recognition
    Lin, Wan
    Li, Lantian
    Wang, Dong
    APPLIED SCIENCES-BASEL, 2024, 14 (03):
  • [42] N-Best-based unsupervised speaker adaptation for speech recognition
    Matsui, T
    Furui, S
    COMPUTER SPEECH AND LANGUAGE, 1998, 12 (01): 41 - 50
  • [43] Automatic speech recognition fusion approach to unsupervised speaker clustering and labeling
    Lawson, A. D.
    Huggins, M. C.
    Grieco, J. J.
    Galligan, S. A.
    Harris, D. M.
    2006 IEEE AEROSPACE CONFERENCE, VOLS 1-9, 2006, : 3280 - 3285
  • [44] UNSUPERVISED DOMAIN ADAPTATION VIA DOMAIN ADVERSARIAL TRAINING FOR SPEAKER RECOGNITION
    Wang, Qing
    Rao, Wei
    Sun, Sining
    Xie, Lei
    Chng, Eng Siong
    Li, Haizhou
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4889 - 4893
  • [45] THE CORAL++ ALGORITHM FOR UNSUPERVISED DOMAIN ADAPTATION OF SPEAKER RECOGNITION
    Li, Rongjin
    Zhang, Weibin
    Chen, Dongpeng
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7172 - 7176
  • [46] Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription
    Silovsky, Jan
    Cerva, Petr
    Zdansky, Jindrich
    Nouza, Jan
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 478 - 481
  • [47] Optimal, unsupervised learning in invariant object recognition
    Wallis, G
    Baddeley, R
    NEURAL COMPUTATION, 1997, 9 (04) : 883 - 894
  • [48] Regularized Optimal Transport Based on an Adaptive Adjustment Method for Selecting the Scaling Parameters of Unscented Kalman Filters
    Kang, Chang Ho
    Kim, Sun Young
    SENSORS, 2022, 22 (03)
  • [49] GlymphVIS: Visualizing Glymphatic Transport Pathways Using Regularized Optimal Transport
    Elkin, Rena
    Nadeem, Saad
    Haber, Eldad
    Steklova, Klara
    Lee, Hedok
    Benveniste, Helene
    Tannenbaum, Allen
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2018, PT I, 2018, 11070 : 844 - 852
  • [50] Hierarchical optimal transport for unsupervised domain adaptation
    Mourad El Hamri
    Younès Bennani
    Issam Falih
    Machine Learning, 2022, 111 : 4159 - 4182