Unsupervised intra-speaker variability compensation based on Gestalt and model adaptation in speaker verification with telephone speech

Cited by: 2
Authors
Yoma, Nestor Becerra [1]
Garreton, Claudio [1]
Molina, Carlos [1]
Huenupan, Fernando [1]
Affiliations
[1] Univ Chile, Speech Proc & Transmiss Lab, Dept Elect Engn, Santiago, Chile
Keywords
Text-dependent speaker verification; Feature compensation; Intra-speaker variability; Unsupervised model adaptation; Gestalt; Telephone speech; Limited enrolling data; Noise robustness; Speaker verification database in Spanish;
DOI
10.1016/j.specom.2007.11.005
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, an unsupervised intra-speaker variability compensation (ISVC) method based on Gestalt is proposed to address the problem of limited enrolling data and noise robustness in text-dependent speaker verification (SV). Experiments with two databases show that ISVC can lead to reductions in EER as high as 20% or 40%, and that ISVC provides reductions in the integral below the ROC curve of between 30% and 60%. Also, the observed improvements are independent of the number of enrolling utterances. In contrast to model adaptation methods, ISVC is memoryless with respect to previous verification attempts. As shown here, unsupervised model adaptation can lead to substantial improvements in EER but is highly dependent on the sequence of client/impostor verification events. In adverse scenarios, such as massive impostor attacks and verification from alternated telephone lines, unsupervised model adaptation might even reduce verification accuracy when compared with the baseline system. In those cases, ISVC can even outperform adaptation schemes. It is worth emphasizing that ISVC and unsupervised model adaptation are compatible and that the combination of both methods always improves the performance of model adaptation. The combination of both schemes can lead to improvements in EER as high as 34%. Due to the restrictions of commercially available databases for text-dependent SV research, the results presented here are based on local databases in Spanish. By doing so, the visibility of research in Iberian languages is highlighted. (C) 2007 Elsevier B.V. All rights reserved.
Pages: 953-964
Number of pages: 12