Clusterwise Regression Using Dirichlet Mixtures

Cited by: 0
Authors
Kang, Changku [1 ]
Ghosal, Subhashis [2 ]
Affiliations
[1] Bank of Korea, Economic Statistics Department, 110, 3 Ga, Seoul, South Korea
[2] North Carolina State University, Department of Statistics, Raleigh, NC 27695, USA
Keywords: (none listed)
DOI: not available
Chinese Library Classification: O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline codes: 020208; 070103; 0714
Abstract
The article describes a method for estimating a nonparametric regression function through Bayesian clustering. The basic working assumption is that the population is a union of several hidden subpopulations, in each of which a different linear regression holds, and the overall nonlinear regression function arises as a superposition of these linear regression functions. A Bayesian clustering technique based on a Dirichlet process mixture is used to identify clusters corresponding to samples from these hidden subpopulations. The clusters are formed automatically within a Markov chain Monte Carlo scheme arising from a Dirichlet process mixture prior on the density of the regressor variable. The number of components in the mixing distribution is thus treated as unknown, allowing considerable flexibility in modeling. Within each cluster, we estimate model parameters by the standard least-squares method or one of its variations. Automatic model averaging accounts for the uncertainty in classifying a new observation into the obtained clusters. As opposed to most commonly used nonparametric regression estimates, which break up the sample locally, our method splits the sample into a number of subgroups that does not depend on the dimension of the regressor variable, and thus avoids the curse of dimensionality. Through extensive simulations, we compare the performance of the proposed method with that of commonly used nonparametric regression techniques. We conclude that when the model assumption holds and the subpopulations are not highly overlapping, our method has smaller estimation error, particularly when the dimension is relatively large.
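The clusterwise-regression idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: a two-component Gaussian mixture fitted to the regressor density by EM stands in for the Dirichlet process mixture prior with MCMC, and the number of clusters K is fixed at 2 rather than treated as unknown. Within each cluster a (responsibility-weighted) least-squares line is fitted, and a new point's prediction averages the cluster fits by its posterior cluster-membership probabilities, mirroring the automatic model averaging described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from two hidden subpopulations, each governed by its
# own linear regression -- the paper's working assumption.
n = 400
z = rng.random(n) < 0.5
x = np.where(z, rng.normal(-2.0, 0.7, n), rng.normal(2.0, 0.7, n))
y = np.where(z, 1.0 + 0.5 * x, -1.0 + 2.0 * x) + rng.normal(0.0, 0.2, n)

# Step 1: cluster on the density of the regressor.  The paper uses a
# Dirichlet process mixture prior and MCMC; here a fixed two-component
# Gaussian mixture fitted by EM is a simple stand-in.
K = 2

def fit_gmm_1d(x, K, iters=200):
    mu = np.quantile(x, np.linspace(0.2, 0.8, K))
    sigma = np.full(K, x.std())
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibilities r[i, k] proportional to
        # pi_k * N(x_i | mu_k, sigma_k^2)
        logp = (np.log(pi) - np.log(sigma)
                - 0.5 * ((x[:, None] - mu) / sigma) ** 2)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and standard deviations
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma, r

pi, mu, sigma, r = fit_gmm_1d(x, K)

# Step 2: within each cluster, estimate a linear regression by
# responsibility-weighted least squares.
X = np.column_stack([np.ones_like(x), x])
betas = [np.linalg.solve(X.T @ (r[:, k:k + 1] * X), X.T @ (r[:, k] * y))
         for k in range(K)]

# Step 3: predict a new point by model averaging -- each cluster's
# linear fit is weighted by the posterior probability that the new
# point belongs to that cluster.
def predict(x_new):
    logp = (np.log(pi) - np.log(sigma)
            - 0.5 * ((x_new[:, None] - mu) / sigma) ** 2)
    w = np.exp(logp - logp.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    Xn = np.column_stack([np.ones_like(x_new), x_new])
    return sum(w[:, k] * (Xn @ betas[k]) for k in range(K))
```

Because the split is driven by the regressor density rather than by local neighborhoods, the same construction extends to higher-dimensional regressors without multiplying the number of subgroups, which is the dimension-robustness argument made in the abstract.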
Pages: 305+
Page count: 3
Related papers (50 total)
  • [31] PLS approach for clusterwise linear regression on functional data
    Preda, C
    Saporta, G
    CLASSIFICATION, CLUSTERING, AND DATA MINING APPLICATIONS, 2004, : 167 - 176
  • [32] A weighted least-squares approach to clusterwise regression
    Schlittgen, Rainer
    AStA Advances in Statistical Analysis, 2011, 95 : 205 - 217
  • [33] Clusterwise linear regression modeling with soft scale constraints
    Di Mari, Roberto
    Rocci, Roberto
    Gattone, Stefano Antonio
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2017, 91 : 160 - 178
  • [34] Seemingly unrelated clusterwise linear regression for contaminated data
    Perrone, Gabriele
    Soffritti, Gabriele
    STATISTICAL PAPERS, 2023, 64 (03) : 883 - 921
  • [35] An algorithm for clusterwise linear regression based on smoothing techniques
    Bagirov, Adil M.
    Ugon, Julien
    Mirzayeva, Hijran G.
    OPTIMIZATION LETTERS, 2015, 9 (02) : 375 - 390
  • [36] Missing Value Imputation via Clusterwise Linear Regression
    Karmitsa, Napsu
    Taheri, Sona
    Bagirov, Adil
    Makinen, Pauliina
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (04) : 1889 - 1901
  • [38] A fast algorithm for clusterwise linear absolute deviations regression
    Meier, J.
    OR SPEKTRUM, 1987, 9 (03) : 187 - 189
  • [39] Nonlinear Models Using Dirichlet Process Mixtures
    Shahbaba, Babak
    Neal, Radford
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1829 - 1850
  • [40] Scalable and Near-Optimal ε-Tube Clusterwise Regression
    Chembu, Aravinth
    Sanner, Scott
    Khalil, Elias B.
    INTEGRATION OF CONSTRAINT PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND OPERATIONS RESEARCH, CPAIOR 2023, 2023, 13884 : 254 - 263