Variable selection in discriminant analysis based on the location model for mixed variables

Cited by: 10
Authors
Mahat, Nor Idayu [1]
Krzanowski, Wojtek Janusz [2]
Hernandez, Adolfo [3]
Affiliations
[1] Univ Utara Malaysia, Fac Quantitat Sci, Sintok 06010, Kedah, Malaysia
[2] Univ Exeter, Sch Engn Comp Sci & Math, Exeter EX4 4QE, Devon, England
[3] Univ Complutense, Escuela Univ Estudios Empresariales, Madrid 28003, Spain
Keywords
Brier score; Cross-validation; Discriminant analysis; Error rate; Kullback-Leibler divergence; Location model; Non-parametric smoothing procedures; Variable selection;
DOI
10.1007/s11634-007-0009-9
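Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification code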
020208 ; 070103 ; 0714 ;
Abstract
Non-parametric smoothing of the location model is a potential basis for discriminating between groups of objects using mixtures of continuous and categorical variables simultaneously. However, it may lead to unreliable estimates of parameters when too many variables are involved. This paper proposes a method for performing variable selection on the basis of distance between groups as measured by smoothed Kullback-Leibler divergence. Searching strategies using forward, backward and step-wise selections are outlined, and corresponding stopping rules derived from asymptotic distributional results are proposed. Results from a Monte Carlo study demonstrate the feasibility of the method. Examples on real data show that the method is generally competitive with, and sometimes is better than, other existing classification methods.
Pages: 105 - 122
Number of pages: 18
Related papers
50 in total
  • [31] VARIABLE SELECTION AND UPDATING IN MODEL-BASED DISCRIMINANT ANALYSIS FOR HIGH DIMENSIONAL DATA WITH FOOD AUTHENTICITY APPLICATIONS
    Murphy, Thomas Brendan
    Dean, Nema
    Raftery, Adrian E.
    ANNALS OF APPLIED STATISTICS, 2010, 4 (01) : 396 - 421
  • [32] Variable Selection in Canonical Discriminant Analysis for Family Studies
    Jin, Man
    Fang, Yixin
    BIOMETRICS, 2011, 67 (01) : 124 - 132
  • [33] Variable selection and error rate estimation in discriminant analysis
    Le Roux, NJ
    Steel, SJ
    Louw, N
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 1997, 59 (03) : 195 - 219
  • [34] Input variable selection in kernel Fisher discriminant analysis
    Louw, N
    Steel, SJ
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006, : 126 - +
  • [35] Partial Least Squares Discriminant Analysis Model Based on Variable Selection Applied to Identify the Adulterated Olive Oil
    Li, Xinhui
    Wang, Sulan
    Shi, Weimin
    Shen, Qi
    FOOD ANALYTICAL METHODS, 2016, 9 (06) : 1713 - 1718
  • [36] Principal component analysis based on a subset of variables: Variable selection and sensitivity analysis
    Tanaka, Yutaka
    Mori, Yuichi
    American Journal of Mathematical and Management Sciences, 1997, 17 (1-2) : 61 - 89
  • [37] Principal component analysis based on a subset of variables: Variable selection and sensitivity analysis
    Tanaka, Y
    Mori, Y
    AMERICAN JOURNAL OF MATHEMATICAL AND MANAGEMENT SCIENCES, VOL 17, NOS 1 AND 2, 1997: MULTIVARIATE STATISTICAL INFERENCE - MSI-2000: MULTIVARIATE STATISTICAL ANALYSIS IN HONOR OF PROFESSOR MINORU SIOTANI ON HIS 70TH BIRTHDAY, 1997, 17 (1&2) : 61 - 89
  • [38] High-dimensional AICs for selection of variables in discriminant analysis
    Sakurai, Tetsuro
    Nakada, Takeshi
    Fujikoshi, Yasunori
    SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY, 2013, 75 (01) : 1 - 25
  • [39] High-dimensional AICs for selection of variables in discriminant analysis
    Sakurai T.
    Nakada T.
    Fujikoshi Y.
    Sankhya A, 2013, 75 (1) : 1 - 25
  • [40] ESTIMATION OF ERROR RATES IN DISCRIMINANT-ANALYSIS WITH SELECTION OF VARIABLES
    SNAPINN, SM
    KNOKE, JD
    BIOMETRICS, 1989, 45 (01) : 289 - 299