Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts

Cited by: 0
Authors
Nguyen, Huy [1]
Nguyen, TrungTin [2, 3]
Nguyen, Khai [1]
Ho, Nhat [1]
Affiliations
[1] Univ Texas Austin, Dept Stat & Data Sci, Austin, TX 78712 USA
[2] Univ Queensland, Sch Math & Phys, Brisbane, Qld, Australia
[3] Univ Grenoble Alpes, Inria, LJK, Grenoble INP, CNRS, F-38000 Grenoble, France
Keywords
MAXIMUM-LIKELIHOOD; OF-EXPERTS; HIERARCHICAL MIXTURES; FEATURE-SELECTION; IDENTIFIABILITY; REGRESSION; MODELS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applications of machine learning and statistics. Despite its popularity in practice, the theoretical understanding of the MoE model remains far from complete. To shed new light on this problem, we provide a convergence analysis for maximum likelihood estimation (MLE) in the Gaussian-gated MoE model. The main challenge of this analysis comes from the inclusion of covariates in both the Gaussian gating functions and the expert networks, which leads to an intrinsic interaction between them via partial differential equations with respect to their parameters. We tackle this issue by designing novel Voronoi loss functions among parameters that accurately capture the heterogeneity of parameter estimation rates. Our findings reveal that the MLE behaves differently under two complementary settings of the location parameters of the Gaussian gating functions, namely when all of these parameters are non-zero versus when at least one of them vanishes. Notably, these behaviors can be characterized by the solvability of two different systems of polynomial equations. Finally, we conduct a simulation study to empirically verify our theoretical results.
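For orientation, the objective that the MLE maximizes can be sketched as below. This is a minimal illustration and not the authors' code: it assumes a scalar covariate, linear Gaussian experts, and a joint density of the form sum_i w_i N(x | c_i, g_i) N(y | a_i x + b_i, s_i). The function names (`gaussian_pdf`, `ggmoe_log_likelihood`) and all parameter names are hypothetical and chosen only for this example.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Univariate normal density N(x | mean, var), evaluated elementwise."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def ggmoe_log_likelihood(x, y, weights, gate_means, gate_vars,
                         slopes, intercepts, noise_vars):
    """Joint log-likelihood of (x, y) under an assumed Gaussian-gated MoE.

    Assumed model (illustrative only):
        p(x, y) = sum_i weights[i]
                  * N(x | gate_means[i], gate_vars[i])
                  * N(y | slopes[i] * x + intercepts[i], noise_vars[i])
    """
    x = np.asarray(x, dtype=float)[:, None]  # shape (n, 1)
    y = np.asarray(y, dtype=float)[:, None]  # shape (n, 1)
    weights = np.asarray(weights, dtype=float)
    gate_means = np.asarray(gate_means, dtype=float)
    gate_vars = np.asarray(gate_vars, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    intercepts = np.asarray(intercepts, dtype=float)
    noise_vars = np.asarray(noise_vars, dtype=float)

    # Gaussian gate evaluated at the covariate, one column per expert.
    gate = gaussian_pdf(x, gate_means, gate_vars)
    # Linear Gaussian expert evaluated at the response, one column per expert.
    expert = gaussian_pdf(y, slopes * x + intercepts, noise_vars)
    # Mix over experts, then sum the log-density over observations.
    joint = (weights * gate * expert).sum(axis=1)
    return float(np.log(joint).sum())
```

In this sketch the gate locations `gate_means` play the role of the location parameters mentioned in the abstract, so the two regimes it distinguishes correspond to all entries being non-zero versus at least one entry being zero.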
Pages: 33