Skew Gaussian mixture models for speaker recognition

Cited by: 13

Authors:

Matza, Avi [1]

Bistritz, Yuval [1]

Affiliations:

[1] Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel

Source:

IET SIGNAL PROCESSING | 2014 / Vol. 8 / Issue 8

Keywords:

Gaussian processes; mixture models; speaker recognition; vectors; expectation-maximisation algorithm; GMM; speech recognition; skew empirical distribution; expectation maximisation algorithm; EM algorithm; two-piece skew Gaussian mixture model; Mel frequency cepstral coefficient; MFCC; line spectral frequency; LSF; immittance spectral frequency; ISF; speech transmission standard; feature vectors; DISTRIBUTIONS;

DOI:

10.1049/iet-spr.2013.0270

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Gaussian mixture models (GMMs) are widely used in speech and speaker recognition. This study explores the idea that a mixture of skew Gaussians might better capture feature vectors that tend to have skew empirical distributions. It begins by deriving an expectation maximisation (EM) algorithm to train a mixture of two-piece skew Gaussians, which turns out to be not much more complicated than the usual EM algorithm used to train symmetric GMMs. Next, the algorithm is used to compare skew and symmetric GMMs in some simple speaker recognition experiments that use Mel frequency cepstral coefficients (MFCC) and line spectral frequencies (LSF) as the feature vectors. MFCC are among the most popular feature vectors in speech and speaker recognition applications. LSF were chosen because they exhibit significantly more skewed distributions than MFCC and because they are widely used [together with the related immittance spectral frequencies (ISF)] in speech transmission standards. In the reported experiments, models with skew Gaussians performed better than models with symmetric Gaussians, and skew GMMs with LSF compared favourably with both skew and symmetric GMMs that used MFCC.
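For orientation only, the kind of density the abstract refers to can be sketched with the textbook univariate two-piece (split) Gaussian, which uses one scale to the left of the mode and another to the right. The record does not give the paper's own parametrisation or its EM update equations, so the function names, argument names and the univariate form below are assumptions made for illustration, not the authors' formulation.

```python
import math


def two_piece_gaussian_pdf(x, mu, sigma_left, sigma_right):
    """Density of a univariate two-piece (split) Gaussian at x.

    Assumed textbook form: a Gaussian shape with scale sigma_left for
    x < mu and sigma_right for x >= mu, joined continuously at the mode mu.
    """
    if sigma_left <= 0 or sigma_right <= 0:
        raise ValueError("scale parameters must be positive")
    # One shared constant makes the two halves integrate to 1 together.
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x < mu else sigma_right
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def skew_mixture_pdf(x, weights, mus, sigmas_left, sigmas_right):
    """Weighted sum of two-piece Gaussian components (hypothetical helper)."""
    return sum(
        w * two_piece_gaussian_pdf(x, m, sl, sr)
        for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
    )
```

With sigma_left equal to sigma_right, each component reduces to the ordinary symmetric Gaussian, so a symmetric GMM is a special case of this sketch. For feature vectors, one plausible extension (again an assumption) is to apply the univariate density independently to each dimension.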

Pages: 860-867

Page count: 8

50 items in total

[11] Automatic speaker recognition using a unique personal feature vector and Gaussian Mixture Models
Kaminski, Kamil
Majda, Ewelina
Dobrowolski, Andrzej P.
2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013, : 220 - 225
[12] Telephone based speaker recognition using multiple binary classifier and Gaussian Mixture Models
Castellano, PJ
Slomka, S
Sridharan, S
1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1075 - 1078
[13] Improved Approach for Calculating Model Parameters in Speaker Recognition using Gaussian Mixture Models
Metkar, Prashant
Cohen, Aaron
Parhi, Keshab
2010 CONFERENCE RECORD OF THE FORTY FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2010, : 567 - 570
[14] Improved Gaussian Mixture Model and Application in Speaker Recognition
Bao Lingling
Shen Xizhong
PROCEEDINGS OF 2016 THE 2ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS, 2016, : 387 - 390
[15] Speaker verification using adapted Gaussian mixture models
Reynolds, DA
Quatieri, TF
Dunn, RB
DIGITAL SIGNAL PROCESSING, 2000, 10 (1-3) : 19 - 41
[16] Large Margin Gaussian mixture models for speaker identification
Jourani, Reda
Daoudi, Khalid
Andre-Obrecht, Regine
Aboutajdine, Driss
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1441 - +
[17] SOFT FRAME MARGIN ESTIMATION OF GAUSSIAN MIXTURE MODELS FOR SPEAKER RECOGNITION WITH SPARSE TRAINING DATA
Yin, Yan
Li, Qi
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5268 - 5271
[18] Automatic Speaker Recognition Based on Mel-Frequency Cepstral Coefficients and Gaussian Mixture Models
Memon, Sheeraz
Bhatti, Sania
Abro, Farzana Rauf
MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2013, 32 (04) : 543 - 550
[19] Robust Text-independent Speaker recognition with Short Utterances using Gaussian Mixture Models
Chakroun, Rania
Frikha, Mondher
2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 2204 - 2209
[20] ACCURATE SPEAKER RECOGNITION BASED ON ADAPTIVE GAUSSIAN MIXTURE MODEL
Wang Yunqi
Yu Yibiao
2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 527 - 531
