A local information-based feature-selection algorithm for data regression

Cited by: 18
Authors
Peng, Xinjun [1, 2]
Xu, Dong [1]
Affiliations
[1] Shanghai Normal Univ, Dept Math, Shanghai 200234, Peoples R China
[2] Sci Comp Key Lab Shanghai Univ, Shanghai 200234, Peoples R China
Keywords
Feature selection; Local information; Irrelevant feature; Least squares loss; Gradient descent; Data regression; FEATURE SUBSET-SELECTION; GENE SELECTION; CLASSIFICATION; RELEVANCE;
DOI
10.1016/j.patcog.2013.02.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel feature-selection algorithm for data regression with many irrelevant features. The proposed method is based on well-established machine-learning techniques and makes no assumptions about the underlying data distribution. The key idea is to decompose an arbitrarily complex nonlinear problem into a set of locally linear ones through local information, and to learn feature relevance globally within the least squares loss framework. In contrast to other feature-selection algorithms for data regression, learning in this method is efficient, since the solution can be readily found through gradient descent with a simple update rule. Experiments on synthetic and real-world data sets demonstrate the viability of our formulation of the feature-selection problem and the effectiveness of our algorithm. Crown Copyright (C) 2013 Published by Elsevier Ltd. All rights reserved.
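The abstract gives only the outline of the method, so the following is a rough illustration of the general idea rather than the authors' formulation. It assumes that "local information" means differences between each sample and its k nearest neighbours, that a locally linear relation |Δy| ≈ wᵀ|Δx| holds within each neighbourhood, and that one global weight vector `w` is fitted by plain gradient descent on the mean squared error. The function name `local_feature_weights` and the parameters `k`, `lr`, and `n_iter` are hypothetical choices made for this sketch.

```python
import numpy as np

def local_feature_weights(X, y, k=5, lr=0.5, n_iter=2000):
    """Illustrative sketch (not the paper's algorithm): fit global feature
    weights from local neighbour differences with a least squares loss."""
    n, d = X.shape
    # Pairwise squared Euclidean distances; exclude each point from its own neighbours.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]            # shape (n, k)

    # Local information: absolute differences to each neighbour.
    dX = np.abs(X[:, None, :] - X[nbrs]).reshape(-1, d)   # shape (n*k, d)
    dy = np.abs(y[:, None] - y[nbrs]).reshape(-1)         # shape (n*k,)

    # Centre both sides so no separate bias term is needed.
    dX = dX - dX.mean(axis=0)
    dy = dy - dy.mean()

    # Gradient descent on the mean squared error ||dX w - dy||^2 / m.
    w = np.zeros(d)
    for _ in range(n_iter):
        resid = dX @ w - dy
        grad = 2.0 * dX.T @ resid / len(dy)
        w -= lr * grad                               # simple update rule
    return w

# Synthetic check: only feature 0 drives the (nonlinear) target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 5))
y = np.sin(2.0 * X[:, 0])
w = local_feature_weights(X, y)
print(np.round(w, 2))
```

In this toy setting the weight with the largest magnitude would be expected to belong to the one relevant feature, and features could then be ranked by |w|. The fixed step size is an assumption that suits data scaled to a small range; other data may need a smaller `lr`.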
Pages: 2519-2530
Number of pages: 12