Estimation and group-feature selection in sparse mixture-of-experts with diverging number of parameters

Cited by: 0
Authors
Khalili, Abbas [1 ]
Yang, Archer Yi [1 ,2 ]
Da, Xiaonan [3 ]
Affiliations
[1] McGill Univ, Dept Math & Stat, Montreal, PQ, Canada
[2] Mila Quebec AI Inst, Montreal, PQ, Canada
[3] Stat Canada, Ottawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Regularization; Variable selection; Mixture-of-experts; Nonconcave penalized likelihood; Maximum likelihood; Finite mixture; Regression models; EM algorithm; Identifiability
DOI
10.1016/j.jspi.2024.106250
CLC classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline codes
020208; 070103; 0714
Abstract
Mixture-of-experts models provide a flexible statistical framework for a wide range of regression (supervised learning) problems. In many modern applications a large number of covariates (features) is available, yet only a small subset of them is useful in explaining a response variable of interest. This calls for a feature selection device. In this paper, we present new group-feature selection and estimation methods for sparse mixture-of-experts models when the number of features can be nearly comparable to the sample size. We prove the consistency of the methods in both parameter estimation and feature selection. We implement the methods using a modified EM algorithm combined with a proximal gradient method, which yields a convenient closed-form parameter update in the M-step of the algorithm. We examine the finite-sample performance of the methods through simulations, and demonstrate their application in a real data example exploring relationships among body measurements.
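For penalties of the group-lasso type, the closed-form M-step update mentioned in the abstract typically amounts to a block (group-wise) soft-thresholding step after a gradient step on the smooth EM surrogate. Below is a minimal Python sketch of such an update; the function names, the penalty weight lam, and the step size are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def prox_group(v, threshold):
        # Proximal operator of threshold * ||v||_2 (block soft-thresholding):
        # shrinks the whole coefficient group toward zero, and returns an
        # exact zero vector when the group's norm falls below the threshold.
        norm = np.linalg.norm(v)
        if norm <= threshold:
            return np.zeros_like(v)
        return (1.0 - threshold / norm) * v

    def m_step_update(beta_g, grad_g, lam, step):
        # One proximal-gradient step for a single expert's coefficient group:
        # gradient descent on the smooth EM surrogate, then group shrinkage.
        return prox_group(beta_g - step * grad_g, step * lam)

    # Example: a group carrying little signal is zeroed out entirely.
    beta = np.array([0.05, -0.02, 0.01])
    print(m_step_update(beta, grad_g=np.zeros(3), lam=1.0, step=0.1))
    # -> [0. 0. 0.]

Zeroing an entire coefficient group at once, rather than individual coefficients, is what makes this a group-feature (as opposed to single-feature) selection device.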
Pages: 17
Related papers
50 records in total
  • [1] New estimation and feature selection methods in mixture-of-experts models
    Khalili, Abbas
CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (04): 519-539
  • [2] Efficient Routing in Sparse Mixture-of-Experts
Shamsolmoali, Pourya
    Institute of Electrical and Electronics Engineers Inc.
  • [3] Sparse bridge estimation with a diverging number of parameters
    Kwon, Sunghoon
    Kim, Yongdai
    Choi, Hosik
STATISTICS AND ITS INTERFACE, 2013, 6 (02): 231-242
  • [4] Self-Supervised Mixture-of-Experts by Uncertainty Estimation
    Zheng, Zhuobin
    Yuan, Chun
    Zhu, Xinrui
    Lin, Zhihui
    Cheng, Yangyang
    Shi, Cheng
    Ye, Jiahui
THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 5933-5940
  • [5] Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
    Park, Byeongjun
    Go, Hyojun
    Kim, Jin-Young
    Woo, Sangmin
    Ham, Seokil
    Kim, Changick
COMPUTER VISION - ECCV 2024, PT LIII, 2025, 15111: 461-477
  • [6] Janus: A Unified Distributed Training Framework for Sparse Mixture-of-Experts Models
    Liu, Juncai
    Wang, Jessie Hui
    Jiang, Yimin
PROCEEDINGS OF THE ACM SIGCOMM 2023 CONFERENCE, 2023: 486-498
  • [7] Video Representation and Coding Using a Sparse Steered Mixture-of-Experts Network
    Lange, Lieven
    Verhack, Ruben
    Sikora, Thomas
2016 PICTURE CODING SYMPOSIUM (PCS), 2016
  • [8] HIERARCHICAL LEARNING OF SPARSE IMAGE REPRESENTATIONS USING STEERED MIXTURE-OF-EXPERTS
    Jongebloed, Rolf
    Verhack, Ruben
    Lange, Lieven
    Sikora, Thomas
2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018
  • [9] A Sparse Mixture-of-Experts Model With Screening of Genetic Associations to Guide Disease Subtyping
    Courbariaux, Marie
    De Santiago, Kylliann
    Dalmasso, Cyril
    Danjou, Fabrice
    Bekadar, Samir
    Corvol, Jean-Christophe
    Martinez, Maria
    Szafranski, Marie
    Ambroise, Christophe
    FRONTIERS IN GENETICS, 2022, 13
  • [10] Robustifying Routers Against Input Perturbations for Sparse Mixture-of-Experts Vision Transformers
    Kada, Masahiro
    Yoshihashi, Ryota
    Ikehata, Satoshi
    Kawakami, Rei
    Sato, Ikuro
IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2025, 6: 276-283