A simple model-based approach to variable selection in classification and clustering

Cited by: 3
Authors
Partovi Nia, Vahid [1, 2]
Davison, Anthony C. [3]
Affiliations
[1] Polytech Montreal, GERAD Res Ctr, Montreal, PQ J3T 1J4, Canada
[2] Polytech Montreal, Dept Math & Ind Engn, Montreal, PQ J3T 1J4, Canada
[3] Ecole Polytech Fed Lausanne, EPFL FSB MATHAA STAT, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation; Natural Sciences and Engineering Research Council of Canada
Keywords
Classification; Clustering; high-dimensional data; hierarchical partitioning; Laplace distribution; mixture model; variable selection; MIXTURE MODEL; EXPRESSION; BAYES;
DOI
10.1002/cjs.11241
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Clustering and classification of replicated data is often performed using classical techniques that inappropriately treat the data as unreplicated, or by complex modern ones that are computationally demanding. In this paper, we introduce a simple approach based on a spike-and-slab mixture model that is fast, automatic, allows classification, clustering and variable selection in a single framework, and can handle replicated or unreplicated data. Simulation shows that our approach compares well with other recently proposed methods. The ideas are illustrated by application to microarray and metabolomic data.
The Canadian Journal of Statistics 43: 157-175; 2015 © 2015 Statistical Society of Canada
Pages: 157-175
Number of pages: 19
Related papers
50 in total
  • [1] Variable selection for model-based clustering
    Raftery, AE
    Dean, N
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) : 168 - 178
  • [3] Variable selection in model-based clustering and discriminant analysis with a regularization approach
    Celeux, Gilles
    Maugis-Rabusseau, Cathy
    Sedki, Mohammed
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (01) : 259 - 278
  • [4] Variable selection methods for model-based clustering
    Fop, Michael
    Murphy, Thomas Brendan
    STATISTICS SURVEYS, 2018, 12 : 18 - 65
  • [5] SelvarClustMV: Variable selection approach in model-based clustering allowing for missing values
    Maugis-Rabusseau, Cathy
    Martin-Magniette, Marie-Laure
    Pelletier, Sandra
    JOURNAL OF THE SFDS, 2012, 153 (02) : 21 - 36
  • [6] Penalized model-based clustering with application to variable selection
    Pan, Wei
    JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 1145 - 1164
  • [7] Comparing Model Selection and Regularization Approaches to Variable Selection in Model-Based Clustering
    Celeux, Gilles
    Martin-Magniette, Marie-Laure
    Maugis-Rabusseau, Cathy
    Raftery, Adrian E.
    JOURNAL OF THE SFDS, 2014, 155 (02) : 57 - 71
  • [8] Variable selection in model-based clustering: A general variable role modeling
    Maugis, C.
    Celeux, G.
    Martin-Magniette, M. -L.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (11) : 3872 - 3882
  • [9] Variable selection for model-based high-dimensional clustering
    Wang, Sijian
    Zhu, Ji
    PREDICTION AND DISCOVERY, 2007, 443 : 177 - +
  • [10] Estimation and model selection for model-based clustering with the conditional classification likelihood
    Baudry, Jean-Patrick
    ELECTRONIC JOURNAL OF STATISTICS, 2015, 9 (01) : 1041 - 1077