Variable selection in model-based discriminant analysis

Cited by: 26
Authors
Maugis, C. [1]
Celeux, G. [2]
Martin-Magniette, M-L [3, 4]
Affiliations
[1] Univ Toulouse, INSA Toulouse, Inst Math Toulouse, F-31077 Toulouse 4, France
[2] Inria Saclay Ile de France, Sophia Antipolis, France
[3] UMR AgroParisTech INRA MIA 518, Paris, France
[4] ERL CNRS 8196, UEVE, URGV UMR INRA 1165, Evry, France
Keywords
Discriminant; redundant or independent variables; Variable selection; Gaussian classification models; Linear regression; BIC; CLASSIFICATION
DOI
10.1016/j.jmva.2011.05.004
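Chinese Library Classification (CLC) codes
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract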
A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is treated as a model selection problem. Each candidate predictor can play one of three roles: it can be relevant for classification; it can be irrelevant for classification but linearly dependent on a subset of the relevant predictors; or it can be irrelevant and independent of the relevant predictors. This variable selection model was inspired by previous work on variable selection in model-based clustering. A BIC-like model selection criterion is proposed. It is optimized through two embedded forward stepwise variable selection algorithms, one for classification and one for linear regression. The identifiability of the model and the consistency of the variable selection criterion are proved. Numerical experiments on simulated and real data sets illustrate the value of this variable selection methodology. In particular, it is shown that this well-grounded variable selection model can substantially improve the classification performance of quadratic discriminant analysis in a high-dimensional setting. © 2011 Elsevier Inc. All rights reserved.
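The idea of comparing the roles of a candidate variable with a BIC-type criterion can be sketched as follows. This is a simplified illustration and not the authors' algorithm: the function names (`bic_classwise`, `bic_regression`, `forward_select`) are hypothetical, the BIC is taken in the form 2 log L - (number of parameters) log n, the class-specific Gaussians use full covariance matrices as in quadratic discriminant analysis, and a rejected candidate is regressed on all currently selected variables rather than on a subset chosen by a second, embedded stepwise search.

```python
import numpy as np

def gaussian_loglik(X, mean, cov):
    """Sum of multivariate normal log-densities over the rows of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return float(-0.5 * np.sum(d * np.log(2 * np.pi) + logdet + maha))

def bic_classwise(X, y):
    """BIC of class-specific full-covariance Gaussians (QDA-like model)."""
    n, d = X.shape
    loglik, n_params = 0.0, 0
    for k in np.unique(y):
        Xk = X[y == k]
        # ML covariance, with a small ridge for numerical stability (assumption)
        cov = np.cov(Xk, rowvar=False, bias=True).reshape(d, d) + 1e-8 * np.eye(d)
        loglik += gaussian_loglik(Xk, Xk.mean(axis=0), cov)
        n_params += d + d * (d + 1) // 2
    return 2 * loglik - n_params * np.log(n)

def bic_regression(x, Z):
    """BIC of a Gaussian linear regression of x on the columns of Z plus intercept."""
    n = x.shape[0]
    A = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    resid = x - A @ coef
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    n_params = A.shape[1] + 1  # regression coefficients plus residual variance
    return 2 * loglik - n_params * np.log(n)

def forward_select(X, y):
    """Greedy forward selection: add the variable with the largest positive BIC gain."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while remaining:
        best_j, best_gain = None, 0.0
        for j in remaining:
            # Role 1: j is a relevant classification predictor
            bic_rel = bic_classwise(X[:, selected + [j]], y)
            # Role 2: j is irrelevant, explained by regression on selected variables
            # (an intercept-only regression when nothing is selected yet)
            bic_irr = bic_regression(X[:, j], X[:, selected])
            if selected:
                bic_irr += bic_classwise(X[:, selected], y)
            gain = bic_rel - bic_irr
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break  # no candidate improves the criterion
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Calling `forward_select(X, y)` with an `(n, p)` array `X` and a length-`n` label vector `y` returns the column indices retained as classification predictors. Class proportions are omitted from the criterion because the labels are observed and the term is identical in both competing models.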
Pages: 1374-1387
Page count: 14