Skew Gaussian mixture models for speaker recognition

Cited by: 13
Authors
Matza, Avi [1]
Bistritz, Yuval [1]
Affiliation
[1] Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel
Keywords
Gaussian processes; mixture models; speaker recognition; vectors; expectation-maximisation algorithm; GMM; speech recognition; skew empirical distribution; expectation maximisation algorithm; EM algorithm; two-piece skew Gaussian mixture model; Mel frequency cepstral coefficient; MFCC; line spectral frequency; LSF; immittance spectral frequency; ISF; speech transmission standard; feature vectors; DISTRIBUTIONS;
DOI
10.1049/iet-spr.2013.0270
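Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;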
Abstract
Gaussian mixture models (GMMs) are widely used in speech and speaker recognition. This study explores the idea that a mixture of skew Gaussians might better capture feature vectors that tend to have skew empirical distributions. It begins by deriving an expectation maximisation (EM) algorithm to train a mixture of two-piece skew Gaussians, which turns out to be not much more complicated than the usual EM algorithm used to train symmetric GMMs. Next, the algorithm is used to compare skew and symmetric GMMs in some simple speaker recognition experiments that use Mel frequency cepstral coefficients (MFCC) and line spectral frequencies (LSF) as the feature vectors. MFCC are among the most popular feature vectors in speech and speaker recognition applications. LSF were chosen because they exhibit a significantly more skewed distribution than MFCC and because they are widely used [together with the related immittance spectral frequencies (ISF)] in speech transmission standards. In the reported experiments, models with skew Gaussians performed better than models with symmetric Gaussians, and skew GMMs with LSF compared favourably with both skew and symmetric GMMs that used MFCC.
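As a rough illustration of the model family named in the abstract, the sketch below evaluates a univariate two-piece Gaussian density and the per-component posterior weights that an EM E-step would need. It is not the authors' algorithm: it assumes the common split-normal parameterisation (one mode, separate left and right standard deviations), handles scalars only, whereas the paper works with feature vectors, and all function and parameter names are hypothetical.

```python
import math


def two_piece_gaussian_pdf(x, mu, sigma_left, sigma_right):
    """Density of a univariate two-piece (split) Gaussian.

    Assumed parameterisation: mode `mu`, standard deviation `sigma_left`
    below the mode and `sigma_right` at or above it. With equal sigmas
    this reduces to the ordinary Gaussian density.
    """
    # Shared normalising constant so both halves join continuously at mu
    # and the whole density integrates to one.
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x < mu else sigma_right
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def mixture_pdf(x, weights, mus, sigmas_left, sigmas_right):
    """Weighted sum of two-piece Gaussian components evaluated at x."""
    return sum(
        w * two_piece_gaussian_pdf(x, m, sl, sr)
        for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
    )


def responsibilities(x, weights, mus, sigmas_left, sigmas_right):
    """E-step quantity: posterior probability of each component given x.

    Illustrative only; a practical version would work in the log domain
    to avoid underflow for points far from every component.
    """
    joint = [
        w * two_piece_gaussian_pdf(x, m, sl, sr)
        for w, m, sl, sr in zip(weights, mus, sigmas_left, sigmas_right)
    ]
    total = sum(joint)
    return [j / total for j in joint]
```

Under this parameterisation the probability mass to the left of the mode is `sigma_left / (sigma_left + sigma_right)`, so unequal sigmas give a skewed component while equal sigmas recover a symmetric GMM component.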
Pages: 860-867
Page count: 8
Related papers
50 in total
  • [1] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 12 - 15
  • [2] Speaker recognition using Gaussian mixture models
    Kamarauskas, J.
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2008, (05) : 29 - 32
  • [3] Automatic Speaker Recognition Using Gaussian Mixture Speaker Models
    Reynolds, D. A.
    Lincoln Laboratory Journal, 8 (02):
  • [4] EMPLOYMENT OF SUBSPACE GAUSSIAN MIXTURE MODELS IN SPEAKER RECOGNITION
    Motlicek, Petr
    Dey, Subhadeep
    Madikeri, Srikanth
    Burget, Lukas
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4445 - 4449
  • [5] Combining Gaussian mixture models and segmental feature models for speaker recognition
    Milosevic, Milana
    Glavitsch, Ulrike
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2042 - 2043
  • [6] Speaker recognition for VoIP transmission using Gaussian mixture models
    Staroniewicz, P
    COMPUTER RECOGNITION SYSTEMS, PROCEEDINGS, 2005, : 739 - 745
  • [7] α-Gaussian mixture modelling for speaker recognition
    Wu, Dalei
    Li, Ji
    Wu, Haiqing
    PATTERN RECOGNITION LETTERS, 2009, 30 (06) : 589 - 594
  • [8] Application of Differential Evolution Optimization based Gaussian Mixture Models to Speaker Recognition
    Zhou Hong
    Zhang JianHua
    26TH CHINESE CONTROL AND DECISION CONFERENCE (2014 CCDC), 2014, : 4297 - 4302
  • [9] SPEAKER IDENTIFICATION AND VERIFICATION USING GAUSSIAN MIXTURE SPEAKER MODELS
    REYNOLDS, DA
    SPEECH COMMUNICATION, 1995, 17 (1-2) : 91 - 108
  • [10] Score calibrating for speaker recognition based on support vector machines and Gaussian Mixture Models
    Katz, Marcel
    Schaffoener, Martin
    Krueger, Sven E.
    Wendemuth, Andreas
    PROCEEDINGS OF THE NINTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING, 2007, : 146 - 151