We propose a genetic-based expectation-maximization (GA-EM) algorithm for learning Gaussian mixture models from multivariate data. The algorithm selects the number of mixture components using the minimum description length (MDL) criterion, combining EM and a genetic algorithm (GA) into a single procedure. The population-based stochastic search of the GA explores the search space more thoroughly than EM alone; as a result, the algorithm can escape locally optimal solutions and is less sensitive to its initialization. GA-EM is elitist, which preserves the monotonic convergence property of the EM algorithm. Experiments show that GA-EM outperforms EM in two respects: (i) it achieves a better MDL score under exactly the same initialization and termination conditions, and (ii) it identifies the number of components used to generate the underlying data more often than the EM algorithm.
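To make the model-selection step concrete, here is a minimal sketch of plain EM for a Gaussian mixture together with an MDL score, simplified to one-dimensional data. The function names (`em_gmm_1d`, `mdl`), the quantile-based initialization, and the parameter count P = 3k − 1 for a 1-D mixture are illustrative assumptions, not the paper's implementation; the full GA-EM would evolve a population of such candidate solutions rather than run a single EM chain.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=200):
    """Minimal EM for a 1-D Gaussian mixture (illustrative sketch).

    Returns (weights, means, variances, log-likelihood). Means are
    initialized at spread-out data quantiles; GA-EM would instead
    maintain and evolve a population of such candidate mixtures.
    """
    n = len(x)
    w = np.full(k, 1.0 / k)                        # mixing weights
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread initial means
    var = np.full(k, np.var(x))                    # shared initial variance
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = (w / np.sqrt(2 * np.pi * var)) * np.exp(
            -0.5 * (x[:, None] - mu) ** 2 / var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances from responsibilities
        nk = resp.sum(axis=0)
        w = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    # Log-likelihood under the final parameters
    dens = (w / np.sqrt(2 * np.pi * var)) * np.exp(
        -0.5 * (x[:, None] - mu) ** 2 / var)
    loglik = np.log(dens.sum(axis=1)).sum()
    return w, mu, var, loglik

def mdl(loglik, k, n):
    """MDL = -log L + (P/2) log n. For a 1-D GMM the free-parameter
    count is P = (k - 1) weights + k means + k variances = 3k - 1."""
    p = 3 * k - 1
    return -loglik + 0.5 * p * np.log(n)
```

Selecting k by minimizing the MDL score over candidate values, e.g. `min((mdl(em_gmm_1d(x, k)[3], k, len(x)), k) for k in (1, 2, 3))`, is the model-selection step the abstract describes; on well-separated two-cluster data this favors k = 2, since the per-parameter penalty outweighs the marginal likelihood gain from extra components.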
Tao LI
Jinwen MA
Affiliation: Department of Information and Computational Sciences, School of Mathematical Sciences and LMAM, Peking University
Ma, JW
Xu, L
Affiliations: Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China; Shantou Univ, Inst Math, Shantou 515063, Guangdong, Peoples R China

Xu, L
Jordan, MI
Affiliation: Shantou Univ, Inst Math, Shantou 515063, Guangdong, Peoples R China