Hierarchical Routing Mixture of Experts

Cited by: 2
Authors
Zhao, Wenbo [1 ]
Gao, Yang [1 ]
Memon, Shahan Ali [1 ]
Raj, Bhiksha [1 ]
Singh, Rita [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
SUPPORT VECTOR MACHINES; APPROXIMATION; PREDICTION;
DOI
10.1109/ICPR48806.2021.9412813
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In regression tasks, the data distribution is often too complex to be fitted by a single model. Instead, partition-based models divide the data and fit a local model to each part. However, such models typically partition only the input space and do not leverage the input-output dependency of multimodally distributed data, so strong local models are needed to make good predictions. To address these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts. The classifier nodes jointly soft-partition the input-output space based on the natural separateness of multimodal data, which enables simple leaf experts to be effective for prediction. Further, we develop a probabilistic framework for the HRME model and propose a recursive Expectation-Maximization (EM) based algorithm to learn both the tree structure and the expert models. Experiments on a collection of regression tasks validate our method's effectiveness compared to various other regression models.
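The abstract describes a binary tree whose non-leaf nodes are classifier gates that softly route inputs and whose leaves are simple regression experts; the prediction is the gate-weighted mixture of the leaf outputs. The following is a minimal illustrative sketch of that routing idea only: a depth-1 tree with a sigmoid gate and two linear leaves. All parameter names are hypothetical, and neither the paper's actual parameterization nor its recursive EM training is reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BinaryHRME:
    """Depth-1 sketch of a hierarchical routing mixture of experts.

    The root is a soft classifier gate; the two leaves are simple
    linear regression experts. Parameter names are illustrative,
    not taken from the paper.
    """

    def __init__(self, gate_w, gate_b, left_w, left_b, right_w, right_b):
        self.gate_w, self.gate_b = gate_w, gate_b
        self.left_w, self.left_b = left_w, left_b
        self.right_w, self.right_b = right_w, right_b

    def gate(self, x):
        # Soft routing probability of sending x to the left leaf.
        return sigmoid(np.dot(x, self.gate_w) + self.gate_b)

    def predict(self, x):
        # Mixture prediction: gate-weighted combination of the two
        # linear leaf experts.
        g = self.gate(x)
        y_left = np.dot(x, self.left_w) + self.left_b
        y_right = np.dot(x, self.right_w) + self.right_b
        return g * y_left + (1.0 - g) * y_right

# Toy bimodal target y = |x|: no single linear model fits it, but a
# gate that routes on the sign of x lets two trivial linear leaves do so.
model = BinaryHRME(gate_w=np.array([10.0]), gate_b=0.0,
                   left_w=np.array([1.0]), left_b=0.0,
                   right_w=np.array([-1.0]), right_b=0.0)
print(model.predict(np.array([2.0])))   # close to 2.0 (routed left)
print(model.predict(np.array([-2.0])))  # close to 2.0 (routed right)
```

The toy setup shows why soft input-output partitioning can make very simple leaf experts effective: each leaf only has to model one mode of the data.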
Pages: 7900-7906
Page count: 7
Related papers
50 records in total
  • [1] Dropout regularization in hierarchical mixture of experts
    Irsoy, Ozan
    Alpaydin, Ethem
    NEUROCOMPUTING, 2021, 419 : 148 - 156
  • [2] SPARSE BAYESIAN HIERARCHICAL MIXTURE OF EXPERTS
    Mossavat, Iman
    Amft, Oliver
    2011 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2011, : 653 - 656
  • [3] Mixture of experts classification using a hierarchical mixture model
    Titsias, MK
    Likas, A
    NEURAL COMPUTATION, 2002, 14 (09) : 2221 - 2244
  • [4] STABLEMOE: Stable Routing Strategy for Mixture of Experts
    Dai, Damai
    Dong, Li
    Ma, Shuming
    Zheng, Bo
    Sui, Zhifang
    Chang, Baobao
    Wei, Furu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7085 - 7095
  • [5] Efficient Routing in Sparse Mixture-of-Experts
    Shamsolmoali, Pourya
    Institute of Electrical and Electronics Engineers Inc.
  • [6] Mixture-of-Experts with Expert Choice Routing
    Zhou, Yanqi
    Lei, Tao
    Liu, Hanxiao
    Du, Nan
    Huang, Yanping
    Zhao, Vincent Y.
    Dai, Andrew
    Chen, Zhifeng
    Le, Quoc
    Laudon, James
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [7] Advances in using hierarchical mixture of experts for signal classification
    Ramamurti, V
    Ghosh, J
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 3569 - 3572
  • [8] Intelligent sensor validation by a hierarchical mixture of experts network
    Yen, GG
    Feng, W
    IECON 2000: 26TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, VOLS 1-4: 21ST CENTURY TECHNOLOGIES AND INDUSTRIAL OPPORTUNITIES, 2000, : 155 - 160
  • [9] Sparse Bayesian Hierarchical Mixture of Experts and Variational Inference
    Iikubo, Yuji
    Horii, Shunsuke
    Matsushima, Toshiyasu
    PROCEEDINGS OF 2018 INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY AND ITS APPLICATIONS (ISITA2018), 2018, : 60 - 64
  • [10] Behavioral partitioning in a Hierarchical mixture of experts using K-Best-Experts algorithm
    Fard, Mahdi Milani
    Bakhtiary, Amir-Hossein
    2007 IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTATIONAL INTELLIGENCE, VOLS 1 AND 2, 2007, : 106 - +