Automatic speaker recognition with crosslanguage speech material

Cited by: 10
Author
Kuenzel, Hermann J. [1]
Affiliation
[1] Univ Marburg, D-35032 Marburg, Germany
Keywords
FORENSIC SPEAKER RECOGNITION; AUTOMATIC SPEAKER RECOGNITION; CROSS-LANGUAGE SPEECH MATERIAL; TRANSMISSION CHANNEL CHARACTERISTICS
DOI
10.1558/ijsll.v20i1.21
Chinese Library Classification number
DF [Law]; D9 [Law]
Discipline classification code
0301
Abstract
Automatic systems for forensic speaker recognition (FASR) claim to be largely independent of language based on the fact that feature vectors are composed of acoustic parameters that are derived from the resonance characteristics of vocal tract cavities. Yet a certain 'language gap' may remain which may deteriorate the performance of a system unless properly compensated. This forensic aspect of what may be called cross-language speaker recognition has not yet received due attention. Based on the most common forensic cross-language setting, the aim of this study was to assess the effect of language mismatch on the performance of a standard FASR system and compare its magnitude with the effect of other sources of mismatch on the same voice data. Using the automatic system Batvox 3 in an experiment with 75 bilingual speakers of seven languages and four kinds of transmission channels, it can be shown that, if speaker model and reference population are matched in terms of language, the remaining mismatch between speaker model and test sample can be neglected, since equal error rates (EERs) for same-language or cross-language comparisons are approximately the same, ranging from zero to 5.6%. Transmission of the speech data via landline telephone, GSM and, for part of the corpus, VoIP (using Skype) caused EERs to rise by less than 1% on average.
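The abstract reports its results as equal error rates (EERs). As a minimal sketch of what that metric measures, and not a description of how Batvox 3 or the study computed it, the following Python function sweeps a decision threshold over two sets of comparison scores and returns the operating point where the false acceptance rate and false rejection rate are closest. The function name, the score convention (higher means "more likely the same speaker") and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from same-speaker and different-speaker scores.

    Assumes higher scores indicate stronger support for "same speaker".
    Returns (eer, threshold), where a comparison is accepted if its
    score is >= threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above
    # the maximum so that "reject everything" is also considered.
    candidates = sorted(set(target_scores) | set(nontarget_scores))
    candidates.append(candidates[-1] + 1.0)

    best = None  # (gap between FAR and FRR, eer estimate, threshold)
    for thr in candidates:
        # False acceptance: a different-speaker pair scores at/above thr.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a same-speaker pair scores below thr.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical scores, invented for this example only.
    same_speaker = [0.9, 0.8, 0.7, 0.3]
    different_speaker = [0.1, 0.2, 0.4, 0.75]
    eer, thr = equal_error_rate(same_speaker, different_speaker)
    print(f"EER = {eer:.2%} at threshold {thr}")
```

With finite score sets the two error rates rarely cross exactly, so the sketch reports the mean of the two rates at the closest point; evaluation toolkits often interpolate instead.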
Pages: 21-44
Number of pages: 24