Unsupervised intra-speaker variability compensation based on Gestalt and model adaptation in speaker verification with telephone speech

Cited by: 2
Authors
Yoma, Nestor Becerra [1]
Garreton, Claudio [1]
Molina, Carlos [1]
Huenupan, Fernando [1]
Affiliations
[1] Univ Chile, Speech Proc & Transmiss Lab, Dept Elect Engn, Santiago, Chile
Keywords
Text-dependent speaker verification; Feature compensation; Intra-speaker variability; Unsupervised model adaptation; Gestalt; Telephone speech; Limited enrolling data; Noise robustness; Speaker verification database in Spanish;
DOI
10.1016/j.specom.2007.11.005
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In this paper, an unsupervised intra-speaker variability compensation (ISVC) method based on Gestalt is proposed to address the problem of limited enrolling data and noise robustness in text-dependent speaker verification (SV). Experiments with two databases show that ISVC can lead to reductions in equal error rate (EER) as high as 20% or 40%, and that ISVC provides reductions in the integral below the ROC curve of between 30% and 60%. Also, the observed improvements are independent of the number of enrolling utterances. In contrast to model adaptation methods, ISVC is memoryless with respect to previous verification attempts. As shown here, unsupervised model adaptation can lead to substantial improvements in EER but is highly dependent on the sequence of client/impostor verification events. In adverse scenarios, such as massive impostor attacks and verification from alternated telephone lines, unsupervised model adaptation might even reduce verification accuracy when compared with the baseline system. In those cases, ISVC can even outperform adaptation schemes. It is worth emphasizing that ISVC and unsupervised model adaptation are compatible, and that the combination of both methods always improves the performance of model adaptation. The combination of both schemes can lead to improvements in EER as high as 34%. Due to the restrictions of commercially available databases for text-dependent SV research, the results presented here are based on local databases in Spanish. By doing so, the visibility of research in Iberian languages is highlighted. (C) 2007 Elsevier B.V. All rights reserved.
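The abstract reports results as EER and as the integral below the ROC curve. As a rough illustration of how such figures can be obtained from verification scores, the Python sketch below sweeps a decision threshold over client and impostor scores. It is not the authors' evaluation code: the function names, the convention that a higher score means "accept", and the reading of the ROC curve as false rejection plotted against false acceptance are all assumptions made for this example.

```python
import math


def error_rates(client_scores, impostor_scores):
    """Return (threshold, FAR, FRR) triples, sorted by increasing threshold.

    Assumption: an attempt is accepted when its score is >= threshold.
    """
    thresholds = sorted(set(client_scores) | set(impostor_scores))
    thresholds.append(math.inf)  # extra point where every attempt is rejected
    points = []
    for t in thresholds:
        # False acceptance rate: impostors that pass the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: clients that fail the threshold.
        frr = sum(s < t for s in client_scores) / len(client_scores)
        points.append((t, far, frr))
    return points


def equal_error_rate(points):
    """Approximate the EER at the threshold where FAR and FRR are closest."""
    _, far, frr = min(points, key=lambda p: abs(p[1] - p[2]))
    return (far + frr) / 2


def area_below_curve(points):
    """Trapezoidal integral of FRR over FAR (smaller is better)."""
    # Walking from high to low threshold makes FAR non-decreasing.
    path = [(far, frr) for _, far, frr in reversed(points)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy scores, invented for illustration only.
clients = [2.0, 3.0, 4.0, 5.0]
impostors = [0.0, 1.0, 2.5, 3.5]
pts = error_rates(clients, impostors)
print(equal_error_rate(pts), area_below_curve(pts))
```

Comparing these two numbers for a baseline system and a compensated system on the same trials gives the kind of reduction the abstract quotes; how the paper itself normalizes those reductions is not stated in this record.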
Pages: 953-964
Number of pages: 12