Variable selection in discriminant analysis based on the location model for mixed variables

Cited by: 10
Authors
Mahat, Nor Idayu [1]
Krzanowski, Wojtek Janusz [2]
Hernandez, Adolfo [3]
Affiliations
[1] Univ Utara Malaysia, Fac Quantitat Sci, Sintok 06010, Kedah, Malaysia
[2] Univ Exeter, Sch Engn Comp Sci & Math, Exeter EX4 4QE, Devon, England
[3] Univ Complutense, Escuela Univ Estudios Empresariales, Madrid 28003, Spain
Keywords
Brier score; Cross-validation; Discriminant analysis; Error rate; Kullback-Leibler divergence; Location model; Non-parametric smoothing procedures; Variable selection
DOI
10.1007/s11634-007-0009-9
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Non-parametric smoothing of the location model is a potential basis for discriminating between groups of objects using mixtures of continuous and categorical variables simultaneously. However, it may lead to unreliable estimates of parameters when too many variables are involved. This paper proposes a method for performing variable selection on the basis of distance between groups as measured by smoothed Kullback-Leibler divergence. Searching strategies using forward, backward and step-wise selections are outlined, and corresponding stopping rules derived from asymptotic distributional results are proposed. Results from a Monte Carlo study demonstrate the feasibility of the method. Examples on real data show that the method is generally competitive with, and sometimes is better than, other existing classification methods.
Pages: 105-122
Number of pages: 18