Automatic speaker recognition with crosslanguage speech material

Cited by: 10
Author
Kuenzel, Hermann J. [1]
Affiliation
[1] Univ Marburg, D-35032 Marburg, Germany
Keywords
FORENSIC SPEAKER RECOGNITION; AUTOMATIC SPEAKER RECOGNITION; CROSS-LANGUAGE SPEECH MATERIAL; TRANSMISSION CHANNEL CHARACTERISTICS
DOI
10.1558/ijsll.v20i1.21
Chinese Library Classification
DF [Law]; D9 [Law]
Discipline classification code
0301
Abstract
Automatic systems for forensic speaker recognition (FASR) claim to be largely independent of language based on the fact that feature vectors are composed of acoustic parameters that are derived from the resonance characteristics of vocal tract cavities. Yet a certain 'language gap' may remain which may deteriorate the performance of a system unless properly compensated. This forensic aspect of what may be called cross-language speaker recognition has not yet received due attention. Based on the most common forensic cross-language setting, the aim of this study was to assess the effect of language mismatch on the performance of a standard FASR system and compare its magnitude with the effect of other sources of mismatch on the same voice data. Using the automatic system Batvox 3 in an experiment with 75 bilingual speakers of seven languages and four kinds of transmission channels, it can be shown that, if speaker model and reference population are matched in terms of language, the remaining mismatch between speaker model and test sample can be neglected, since equal error rates (EERs) for same-language or cross-language comparisons are approximately the same, ranging from zero to 5.6%. Transmission of the speech data via landline telephone, GSM and, for part of the corpus, VoIP (using Skype) caused EERs to rise by less than 1% on average.
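The equal error rate quoted in the abstract is the operating point at which the false acceptance rate (different-speaker trials accepted) equals the false rejection rate (same-speaker trials rejected). The following is a minimal sketch of how an EER can be approximated from two lists of comparison scores. It is not the procedure used by Batvox or by the study; the function name and the example scores are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores from same-speaker comparisons (hypothetical data).
    nontarget_scores: scores from different-speaker comparisons (hypothetical data).
    Higher scores are assumed to mean stronger support for "same speaker".
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Invented scores, not taken from the study.
same_speaker = [2.0, 3.0, 4.0, 5.0]
different_speaker = [0.0, 1.0, 1.5, 2.5]
print(equal_error_rate(same_speaker, different_speaker))
```

With few trials the two error rates may never coincide exactly, so the sketch reports their average at the closest threshold; evaluation toolkits typically interpolate instead.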
Pages: 21-44
Page count: 24