Skew Gaussian mixture models for speaker recognition

Cited by: 13
Authors
Matza, Avi [1]
Bistritz, Yuval [1]
Affiliations
[1] Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel
Keywords
Gaussian processes; mixture models; speaker recognition; vectors; expectation-maximisation algorithm; GMM; speech recognition; skew empirical distribution; EM algorithm; two-piece skew Gaussian mixture model; Mel frequency cepstral coefficient; MFCC; line spectral frequency; LSF; immittance spectral frequency; ISF; speech transmission standard; feature vectors; distributions
DOI
10.1049/iet-spr.2013.0270
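Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract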
Gaussian mixture models (GMMs) are widely used in speech and speaker recognition. This study explores the idea that a mixture of skew Gaussians might better capture feature vectors that tend to have skew empirical distributions. It begins by deriving an expectation maximisation (EM) algorithm to train a mixture of two-piece skew Gaussians, which turns out to be not much more complicated than the usual EM algorithm used to train symmetric GMMs. Next, the algorithm is used to compare skew and symmetric GMMs in some simple speaker recognition experiments that use Mel frequency cepstral coefficients (MFCC) and line spectral frequencies (LSF) as the feature vectors. MFCC are among the most popular feature vectors in speech and speaker recognition applications. LSF were chosen because they exhibit significantly more skewed distributions than MFCC and because they are widely used [together with the related immittance spectral frequencies (ISF)] in speech transmission standards. In the reported experiments, models with skew Gaussians performed better than models with symmetric Gaussians, and skew GMMs with LSF compared favourably with both skew and symmetric GMMs that used MFCC.
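As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a univariate two-piece (split) Gaussian density and a mixture of such components. The abstract does not give the paper's parameterisation, so the form used here (a mode `mu` with separate left and right standard deviations) is an assumption based on the standard split-normal density, and the function names are hypothetical. The EM training procedure derived in the paper is not reproduced.

```python
import math


def two_piece_gaussian_pdf(x, mu, sigma_left, sigma_right):
    """Density of a split normal: assumed form, not taken from the paper.

    Uses sigma_left for x < mu and sigma_right for x >= mu. The shared
    normalising constant makes the two halves join continuously at mu
    and the whole density integrate to one.
    """
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x < mu else sigma_right
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def skew_mixture_pdf(x, weights, mus, sigmas_left, sigmas_right):
    """Weighted sum of two-piece Gaussian components (weights sum to 1)."""
    return sum(
        w * two_piece_gaussian_pdf(x, m, sl, sr)
        for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
    )
```

When `sigma_left` equals `sigma_right`, each component reduces to an ordinary symmetric Gaussian, so a symmetric GMM is a special case of this sketch.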
Pages: 860-867
Number of pages: 8