Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts

Cited by: 0
Authors
Nguyen, Huy [1]
Nguyen, TrungTin [2, 3]
Nguyen, Khai [1]
Ho, Nhat [1]
Affiliations
[1] Univ Texas Austin, Dept Stat & Data Sci, Austin, TX 78712 USA
[2] Univ Queensland, Sch Math & Phys, Brisbane, Qld, Australia
[3] Univ Grenoble Alpes, Inria, LJK, Grenoble INP, CNRS, F-38000 Grenoble, France
Keywords
MAXIMUM-LIKELIHOOD; OF-EXPERTS; HIERARCHICAL MIXTURES; FEATURE-SELECTION; IDENTIFIABILITY; REGRESSION; MODELS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applications of machine learning and statistics. Despite its popularity in practice, the theoretical understanding of the MoE model remains far from complete. To shed new light on this problem, we provide a convergence analysis for maximum likelihood estimation (MLE) in the Gaussian-gated MoE model. The main challenge of that analysis comes from the inclusion of covariates in the Gaussian gating functions and expert networks, which leads to their intrinsic interaction via some partial differential equations with respect to their parameters. We tackle these issues by designing novel Voronoi loss functions among parameters to accurately capture the heterogeneity of parameter estimation rates. Our findings reveal that the MLE behaves differently under two complementary settings of the location parameters of the Gaussian gating functions, namely when all these parameters are non-zero versus when at least one of them vanishes. Notably, these behaviors can be characterized by the solvability of two different systems of polynomial equations. Finally, we conduct a simulation study to empirically verify our theoretical results.
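For readers unfamiliar with the model class named in the abstract, the following is a minimal sketch of a Gaussian-gated MoE with a scalar covariate and linear-Gaussian experts. It is an illustration under stated assumptions, not the authors' code: the parameterization (mixing weight `pi`, gate location `c`, gate variance `gamma`, expert slope `a`, intercept `b`, noise variance `sigma2`) follows the commonly used form of this model and may differ in detail from the paper, and all numeric values are made up.

```python
import math

def normal_pdf(v, mean, var):
    """Univariate Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gating_weights(x, experts):
    """Gate probabilities proportional to pi_i * N(x | c_i, gamma_i)."""
    scores = [e["pi"] * normal_pdf(x, e["c"], e["gamma"]) for e in experts]
    total = sum(scores)
    return [s / total for s in scores]

def joint_density(x, y, experts):
    """Joint density: sum_i pi_i * N(x | c_i, gamma_i) * N(y | a_i*x + b_i, sigma2_i)."""
    return sum(
        e["pi"]
        * normal_pdf(x, e["c"], e["gamma"])
        * normal_pdf(y, e["a"] * x + e["b"], e["sigma2"])
        for e in experts
    )

def conditional_mean(x, experts):
    """E[y | x]: gate-weighted average of the linear expert means."""
    weights = gating_weights(x, experts)
    return sum(w * (e["a"] * x + e["b"]) for w, e in zip(weights, experts))

# Hypothetical two-expert model. Both gate locations c are non-zero here,
# which corresponds to the first of the two settings the abstract contrasts.
experts = [
    {"pi": 0.5, "c": -1.0, "gamma": 1.0, "a": 2.0, "b": 0.0, "sigma2": 0.25},
    {"pi": 0.5, "c": 1.0, "gamma": 1.0, "a": -1.0, "b": 1.0, "sigma2": 0.25},
]

weights_at_zero = gating_weights(0.0, experts)
mean_at_zero = conditional_mean(0.0, experts)
density_at_origin = joint_density(0.0, 0.0, experts)
```

Because the covariate `x` appears in both the gate density and the expert mean, the gate and expert parameters are coupled in the likelihood, which is the interaction the abstract identifies as the main difficulty of the analysis.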
Pages: 33