Improved phonetic speaker recognition using lattice decoding

Cited by: 0

Authors:

Hatch, AO [1]

Peskin, B [1]

Stolcke, A [1]

Affiliations:

[1] Int Comp Sci Inst, Berkeley, CA USA

Source:

2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING | 2005

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The current "state-of-the-art" in phonetic speaker recognition uses relative frequencies of phone n-grams as features for training speaker models and for scoring test-target pairs. Typically, these relative frequencies are computed from a simple 1-best phone decoding of the input speech. In this paper, we present results on the Switchboard-2 corpus, where we compare 1-best phone decodings versus lattice phone decodings for the purposes of performing phonetic speaker recognition. The phone decodings are used to compute relative frequencies of phone bigrams, which are then used as inputs for two standard phonetic speaker recognition systems: a system based on log-likelihood ratios (LLRs) [1, 2], and a system based on support vector machines (SVMs) [3]. In each experiment, the lattice phone decodings achieve relative reductions in equal-error rate (EER) of between 31% and 66% below the EERs of the 1-best phone decodings. Our best phonetic system achieves an EER of 2.0% on 8-conversation training and 1.4% when combined with a GMM-based system.
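The abstract's central quantity, the relative frequency of each phone bigram, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the "lattice" is approximated by a short list of alternative phone sequences with posterior weights, and the lattice counts are taken to be posterior-weighted (expected) counts, which the abstract does not spell out. The last helper shows the arithmetic behind a relative reduction in EER.

```python
from collections import Counter


def one_best_bigram_counts(phones):
    """Count phone bigrams in a single (1-best) phone sequence."""
    return Counter(zip(phones, phones[1:]))


def expected_bigram_counts(weighted_paths):
    """Posterior-weighted bigram counts over alternative phone sequences.

    weighted_paths is a list of (phones, posterior) pairs. This is a toy
    stand-in for a phone lattice (an assumption, not the paper's method).
    """
    counts = Counter()
    for phones, posterior in weighted_paths:
        for bigram in zip(phones, phones[1:]):
            counts[bigram] += posterior
    return counts


def relative_frequencies(counts):
    """Normalize bigram counts so that they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: c / total for bigram, c in counts.items()}


def relative_eer_reduction(baseline_eer, new_eer):
    """Relative reduction of new_eer below baseline_eer, as a fraction."""
    return (baseline_eer - new_eer) / baseline_eer


# Made-up phone sequences, for illustration only.
one_best = ["ah", "b", "ah", "b"]
toy_lattice = [(["ah", "b", "ah"], 0.75), (["ah", "p", "ah"], 0.25)]

one_best_freqs = relative_frequencies(one_best_bigram_counts(one_best))
lattice_freqs = relative_frequencies(expected_bigram_counts(toy_lattice))
```

In the toy lattice, the lower-probability path still contributes the bigrams ("ah", "p") and ("p", "ah") with small weights, whereas a 1-best decoding would discard them entirely. The resulting frequency vectors are the kind of feature that an LLR-based or SVM-based scorer would take as input.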

Pages: 169-172

Page count: 4
