Improved phonetic speaker recognition using lattice decoding

Authors
Hatch, AO [1]
Peskin, B [1]
Stolcke, A [1]
Affiliation
[1] Int Comp Sci Inst, Berkeley, CA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The current "state-of-the-art" in phonetic speaker recognition uses relative frequencies of phone n-grams as features for training speaker models and for scoring test-target pairs. Typically, these relative frequencies are computed from a simple 1-best phone decoding of the input speech. In this paper, we present results on the Switchboard-2 corpus, where we compare 1-best phone decodings versus lattice phone decodings for the purposes of performing phonetic speaker recognition. The phone decodings are used to compute relative frequencies of phone bigrams, which are then used as inputs for two standard phonetic speaker recognition systems: a system based on log-likelihood ratios (LLRs) [1, 2], and a system based on support vector machines (SVMs) [3]. In each experiment, the lattice phone decodings achieve relative reductions in equal-error rate (EER) of between 31% and 66% below the EERs of the 1-best phone decodings. Our best phonetic system achieves an EER of 2.0% on 8-conversation training and 1.4% when combined with a GMM-based system.
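The feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a 1-best decoding is a plain list of phone labels, and it stands in for a lattice with a short list of alternative phone sequences, each carrying a posterior weight, so that bigram counts become expected counts. The function names, the probability floor, and the particular form of the log-likelihood-ratio score are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log


def bigram_rel_freqs(phones):
    """Relative frequencies of phone bigrams in one 1-best phone sequence."""
    counts = Counter(zip(phones, phones[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()} if total else {}


def expected_bigram_rel_freqs(weighted_paths):
    """Relative frequencies from posterior-weighted alternative decodings.

    Assumption: `weighted_paths` is a list of (phone_list, posterior) pairs,
    a simplified stand-in for a real phone lattice. Each bigram contributes
    its path posterior instead of a hard count of 1.
    """
    counts = Counter()
    for phones, posterior in weighted_paths:
        for bg in zip(phones, phones[1:]):
            counts[bg] += posterior
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()} if total else {}


def llr_score(test_freqs, speaker_freqs, background_freqs, floor=1e-6):
    """One possible LLR-style score (assumed form, not taken from the paper).

    Sums the speaker-vs-background log ratio of each bigram, weighted by the
    bigram's relative frequency in the test data. `floor` is a hypothetical
    smoothing value for bigrams unseen in a model.
    """
    score = 0.0
    for bg, p in test_freqs.items():
        s = speaker_freqs.get(bg, floor)
        b = background_freqs.get(bg, floor)
        score += p * log(s / b)
    return score


# Toy data: one 1-best decoding, and two weighted alternatives for a "lattice".
one_best = bigram_rel_freqs(["a", "b", "a", "b"])
lattice = expected_bigram_rel_freqs([
    (["a", "b", "a"], 0.75),
    (["a", "b", "b"], 0.25),
])
```

In this toy example the lattice-style features keep some probability mass on the bigram ("b", "b"), which a single 1-best decoding of the more likely path would discard entirely. That is the intuition behind using lattices, although the sketch says nothing about the size of the gains reported in the abstract.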
Pages: 169-172
Page count: 4
Related papers
50 in total
  • [1] Phonetic speaker recognition
    Kohler, MA
    Andrews, WD
    Campbell, JP
    Hernández-Cordero, J
    CONFERENCE RECORD OF THE THIRTY-FIFTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1 AND 2, 2001, : 1557 - 1561
  • [2] THE PHONETIC BASES OF SPEAKER RECOGNITION - NOLAN,F
    HOLLIEN, H
    CONTEMPORARY PSYCHOLOGY, 1985, 30 (10): : 801 - 802
  • [3] Phonetic speaker recognition with support vector machines
    Campbell, WM
    Campbell, JP
    Reynolds, DA
    Jones, DA
    Leek, TR
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1377 - 1384
  • [4] THE PHONETIC BASES OF SPEAKER RECOGNITION - NOLAN,F
    SCHOENTGEN, J
    SPEECH COMMUNICATION, 1987, 6 (02) : 171 - 175
  • [5] Speaker independent bimodal phonetic recognition experiments
    Cosi, P
    Caldognetto, EM
    Ferrero, F
    Dugatto, M
    Vagges, K
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 54 - 57
  • [6] Speaker recognition via nonlinear phonetic and speaker-discriminative features
    Stoll, Lara
    Frankel, Joe
    Mirghafori, Nikki
    ADVANCES IN NONLINEAR SPEECH PROCESSING, 2007, 4885 : 114 - 123
  • [7] Gender-dependent phonetic refraction for speaker recognition
    Andrews, WD
    Kohler, MA
    Campbell, JP
    Godfrey, JJ
    Hernández-Cordero, J
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 149 - 152
  • [8] Speaker adaptation techniques for speech recognition with a speaker-independent phonetic recognizer
    Kim, WG
    Jang, M
    COMPUTATIONAL INTELLIGENCE AND SECURITY, PT 1, PROCEEDINGS, 2005, 3801 : 95 - 100
  • [9] Employing Phonetic Information in DNN Speaker Embeddings to Improve Speaker Recognition Performance
    Rahman, Md Hafizur
    Himawan, Ivan
    Mclaren, Mitchell
    Fookes, Clinton
    Sridharan, Sridha
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3593 - 3597
  • [10] IMPROVED SPEAKER RECOGNITION USING DCT COEFFICIENTS AS FEATURES
    McLaren, Mitchell
    Lei, Yun
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4430 - 4434