A Robust-Equitable Measure for Feature Ranking and Selection

Cited by: 0
Authors
Ding, A. Adam [1]
Dy, Jennifer G. [2]
Li, Yi [1]
Chang, Yale [2]
Affiliations
[1] Northeastern Univ, Dept Math, Boston, MA 02115 USA
[2] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
Funding
National Science Foundation (USA);
Keywords
dependence measure; feature selection; copula; equitability; mutual information; density function; convergence; dependence; classification; relevance; entropy; rates;
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In many applications, not all the features used to represent data samples are important. Often only a few features are relevant for the prediction task. The choice of dependence measure often affects the final result of a feature selection method. To select features that have complex nonlinear relationships with the response variable, the dependence measure should be equitable, a concept proposed by Reshef et al. (2011); that is, the dependence measure should treat linear and nonlinear relationships equally. Recently, Kinney and Atwal (2014) gave a mathematical definition of self-equitability. In this paper, we introduce a new concept of robust-equitability and identify a robust-equitable copula dependence measure, the robust copula dependence (RCD) measure. RCD is based on the L1 distance of the copula density from the uniform density, and we show that it is equitable under both equitability definitions. We also prove theoretically that RCD is much easier to estimate than mutual information. Because of these theoretical properties, the RCD measure has the following advantages over existing dependence measures: it is robust to different relationship forms and robust to unequal sample sizes across features. Experiments on both synthetic and real-world data sets confirm the theoretical analysis and illustrate the advantage of using RCD as the dependence measure for feature selection.
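As a rough illustration of the idea in the abstract, the sketch below estimates an L1 distance between an empirical copula density and the uniform density. It is not the authors' estimator: it assumes the normalization 0.5 * ∫∫ |c(u, v) - 1| du dv (the abstract does not state the constant), and it uses a naive histogram plug-in for the copula density. The function name `rcd_histogram` and the `bins` parameter are hypothetical.

```python
import numpy as np

def rcd_histogram(x, y, bins=10):
    """Naive histogram plug-in for 0.5 * integral of |c(u, v) - 1| over the unit square.

    Assumption: this normalization and the histogram density estimate are
    illustrative choices, not taken from the abstract.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Rank-transform each margin to (0, 1); ties are broken arbitrarily.
    u = (np.argsort(np.argsort(x)) + 1) / (n + 1)
    v = (np.argsort(np.argsort(y)) + 1) / (n + 1)
    # Piecewise-constant estimate of the copula density on a bins x bins grid.
    counts, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    cell_area = 1.0 / bins**2
    density = counts / (n * cell_area)
    # L1 distance from the uniform density (which equals 1 everywhere), halved.
    return 0.5 * np.sum(np.abs(density - 1.0)) * cell_area

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 2000)
rcd_indep = rcd_histogram(x, rng.uniform(-1, 1, 2000))          # independent pair
rcd_linear = rcd_histogram(x, 2 * x + 0.05 * rng.normal(size=2000))
rcd_quad = rcd_histogram(x, x**2 + 0.05 * rng.normal(size=2000))
print(rcd_indep, rcd_linear, rcd_quad)
```

Under these assumptions, the independent pair should score near zero, while the linear and quadratic relationships should both score well above it. The histogram estimate is biased upward at small sample sizes, so the bin count would need tuning in practice.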
Pages: 1-46
Number of pages: 46
Related papers
50 in total
  • [21] Neighborhood Ranking-Based Feature Selection
    Ipkovich, Adam
    Abonyi, Janos
    IEEE ACCESS, 2024, 12 : 20152 - 20168
  • [22] Granularity self-information based uncertainty measure for feature selection and robust classification
    An, Shuang
    Xiao, Qijin
    Wang, Changzhong
    Zhao, Suyun
    FUZZY SETS AND SYSTEMS, 2023, 470
  • [23] Heuristic search over a ranking for feature selection
    Ruiz, R
    Riquelme, JC
    Aguilar-Ruiz, JS
    COMPUTATIONAL INTELLIGENCE AND BIOINSPIRED SYSTEMS, PROCEEDINGS, 2005, 3512 : 742 - 749
  • [24] A Robust Feature Selection Algorithm
    Chandra, B.
    2016 9TH INTERNATIONAL CONFERENCE ON DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2016), 2016 : 308 - 313
  • [25] SEQUENTIAL SAMPLING FOR BAYESIAN ROBUST RANKING AND SELECTION
    Zhang, Xiaowei
    Ding, Liang
    2016 WINTER SIMULATION CONFERENCE (WSC), 2016 : 758 - 769
  • [26] Robust subtractive stability measures for fast and exhaustive feature importance ranking and selection in generalised linear models
    Smith, Connor
    Guennewig, Boris
    Muller, Samuel
    AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, 2022, 64 (03) : 339 - 355
  • [27] Feature Ranking of Large, Robust, and Weighted Clustering Result
    Saarela, Mirka
    Hamalainen, Joonas
    Karkkainen, Tommi
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT I, 2017, 10234 : 96 - 109
  • [28] An unsupervised feature selection algorithm with feature ranking for maximizing performance of the classifiers
    Singh D.A.A.G.
    Balamurugan S.A.A.
    Leavline E.J.
    International Journal of Automation and Computing, 2015, 12 (05) : 511 - 517
  • [29] An Unsupervised Feature Selection Algorithm with Feature Ranking for Maximizing Performance of the Classifiers
    Danasingh Asir Antony Gnana Singh
    Subramanian Appavu Alias Balamurugan
    Epiphany Jebamalar Leavline
    International Journal of Automation and Computing, 2015, 12 (05) : 511 - 517
  • [30] Feature selection using consistency measure
    Dash, M
    Liu, H
    Motoda, H
    DISCOVERY SCIENCE, PROCEEDINGS, 1999, 1721 : 319 - 320