Skew Gaussian mixture models for speaker recognition

Cited by: 13

Authors:

Matza, Avi [1]

Bistritz, Yuval [1]

Affiliation:

[1] Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel

Source:

IET SIGNAL PROCESSING | 2014 / Vol. 8 / Issue 08

Keywords:

Gaussian processes; mixture models; speaker recognition; vectors; expectation-maximisation algorithm; GMM; speech recognition; skew empirical distribution; expectation maximisation algorithm; EM algorithm; two-piece skew Gaussian mixture model; Mel frequency cepstral coefficient; MFCC; line spectral frequency; LSF; immittance spectral frequency; ISF; speech transmission standard; feature vectors; DISTRIBUTIONS

DOI:

10.1049/iet-spr.2013.0270

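Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract: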

Gaussian mixture models (GMMs) are widely used in speech and speaker recognition. This study explores the idea that a mixture of skew Gaussians might better capture feature vectors that tend to have skew empirical distributions. It begins by deriving an expectation maximisation (EM) algorithm to train a mixture of two-piece skew Gaussians, which turns out to be not much more complicated than the usual EM algorithm used to train symmetric GMMs. Next, the algorithm is used to compare skew and symmetric GMMs in some simple speaker recognition experiments that use Mel frequency cepstral coefficients (MFCC) and line spectral frequencies (LSF) as the feature vectors. MFCC are among the most popular feature vectors in speech and speaker recognition applications. LSF were chosen because they exhibit significantly more skewed distributions than MFCC and because they are widely used [together with the related immittance spectral frequencies (ISF)] in speech transmission standards. In the reported experiments, models with skew Gaussians performed better than models with symmetric Gaussians, and skew GMMs with LSF compared favourably with both skew and symmetric GMMs that used MFCC.
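
As a rough illustration of the kind of component the abstract refers to, the sketch below evaluates a one-dimensional two-piece (split) Gaussian density and a mixture of such components. This record does not give the paper's parameterisation, so the standard split-normal form is assumed here; the function names and the parameters `sigma_left` and `sigma_right` are hypothetical and chosen only for illustration. The EM training procedure from the paper is not reproduced.

```python
import math


def two_piece_gaussian_pdf(x, mu, sigma_left, sigma_right):
    """Density of a two-piece (split) Gaussian with mode mu.

    Assumed form (not taken from the paper): a Gaussian shape with
    standard deviation sigma_left for x < mu and sigma_right for
    x >= mu, sharing one normalising constant so the density is
    continuous at mu and integrates to 1.
    """
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x < mu else sigma_right
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def skew_mixture_pdf(x, weights, mus, sigmas_left, sigmas_right):
    """Weighted sum of two-piece Gaussian components (weights sum to 1)."""
    return sum(
        w * two_piece_gaussian_pdf(x, m, sl, sr)
        for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
    )
```

When `sigma_left` equals `sigma_right`, the component reduces to an ordinary symmetric Gaussian, which is consistent with the abstract's remark that the skew model is a modest extension of the symmetric GMM.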

Pages: 860-867

Page count: 8
