A framework for feature selection through boosting

Cited by: 78
Authors
Alsahaf, Ahmad [1]
Petkov, Nicolai [1]
Shenoy, Vikram [2]
Azzopardi, George [1]
Affiliations
[1] Univ Groningen, Bernoulli Inst Math Comp Sci & Artificial Intelli, POB 407, NL-9700 AK Groningen, Netherlands
[2] Northeastern Univ, Khoury Coll Comp Sci, West Village Residence Complex H, Boston, MA 02115 USA
Keywords
Feature selection; Boosting; Ensemble learning; XGBoost; Mutual information; Optimization
DOI
10.1016/j.eswa.2021.115895
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As dimensions of datasets in predictive modelling continue to grow, feature selection becomes increasingly practical. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel framework for feature selection that relies on boosting, or sample re-weighting, to select sets of informative features in classification problems. The method uses as its basis the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets. We show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
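The abstract describes selection by boosting (sample re-weighting) only at a high level, so the following is a minimal sketch of the general idea and not the authors' algorithm. As a stand-in for the XGBoost feature rankings mentioned above, it assumes AdaBoost-style re-weighting with single-feature decision stumps. The names `best_stump`, `boosted_feature_selection`, and `naive_top_k`, and the toy data, are hypothetical and invented for illustration.

```python
import math

def stump_predict(row, j, t, polarity):
    """Predict +1/-1 from one feature: `polarity` if row[j] > t, else -polarity."""
    return polarity if row[j] > t else -polarity

def best_stump(X, y, w, features):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = (float("inf"), None, None, None)
    for j in features:
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if err < best[0]:
                    best = (err, j, t, polarity)
    return best

def boosted_feature_selection(X, y, n_select):
    """Each round picks the best feature on re-weighted samples (assumed scheme)."""
    n = len(X)
    w = [1.0 / n] * n                       # start from uniform sample weights
    remaining = list(range(len(X[0])))
    selected = []
    while remaining and len(selected) < n_select:
        err, j, t, pol = best_stump(X, y, w, remaining)
        if j is None:                       # no usable split left
            break
        selected.append(j)
        remaining.remove(j)
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # AdaBoost update: misclassified samples gain weight, so the next
        # feature is judged on what the selected ones still get wrong.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return selected

def naive_top_k(X, y, k):
    """Baseline: rank features once by stump error under uniform weights."""
    w = [1.0 / len(X)] * len(X)
    ranked = sorted(range(len(X[0])), key=lambda j: best_stump(X, y, w, [j])[0])
    return ranked[:k]

# Toy data (invented): feature 1 duplicates feature 0, feature 2 is weaker
# overall but correct on the sample feature 0 misses, feature 3 is noise.
X = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1],
     [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

print(naive_top_k(X, y, 2))                # [0, 1]: keeps the redundant copy
print(boosted_feature_selection(X, y, 2))  # [0, 2]: skips the redundant copy
```

On this toy data, the one-shot ranking keeps both copies of the same feature, while the re-weighted selection moves to a complementary feature after the first pick. This only illustrates why re-weighting can lower redundancy; it says nothing about the results reported in the paper.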
Pages: 10
Related papers
50 in total
  • [41] Investigating boosting techniques' efficacy in feature selection: A comparative analysis
    Ahmed, Ubaid
    Mahmood, Anzar
    Tunio, Majid Ali
    Hafeez, Ghulam
    Khan, Ahsan Raza
    Razzaq, Sohail
    ENERGY REPORTS, 2024, 11 : 3521 - 3532
  • [42] Boosting with Feature Selection Technique for Screening and Predicting Adolescents Depression
    Thanathamathee, Putthiporn
    2014 FOURTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION AND COMMUNICATION TECHNOLOGY AND IT'S APPLICATIONS (DICTAP), 2014 : 23 - 27
  • [43] An improved boosting based on feature selection for corporate bankruptcy prediction
    Wang, Gang
    Ma, Jian
    Yang, Shanlin
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (05) : 2353 - 2361
  • [44] Boosting capuchin search with stochastic learning strategy for feature selection
    Mohamed Abd Elaziz
    Salima Ouadfel
    Rehab Ali Ibrahim
    Neural Computing and Applications, 2023, 35 : 14061 - 14080
  • [45] Boosting decision stumps for dynamic feature selection on data streams
    Barddal, Jean Paul
    Enembreck, Fabricio
    Gomes, Heitor Murilo
    Bifet, Albert
    Pfahringer, Bernhard
    INFORMATION SYSTEMS, 2019, 83 : 13 - 29
  • [46] Improving Software Quality Estimation by Combining Boosting and Feature Selection
    Gao, Kehan
    Khoshgoftaar, Taghi
    Napolitano, Amri
    2013 12TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2013), VOL 1, 2013 : 27 - 33
  • [47] BoostFS: A Boosting-Based Irrelevant Feature Selection Algorithm
    Miao, Qi-Guang
    Cao, Ying
    Song, Jian-Feng
    Liu, Jiachen
    Quan, Yining
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2015, 29 (07)
  • [48] Fusion Based Blind Image Steganalysis by Boosting Feature Selection
    Dong, Jing
    Chen, Xiaochuan
    Guo, Lei
    Tan, Tieniu
    DIGITAL WATERMARKING, PROCEEDINGS, 2008, 5041 : 87 - 98
  • [49] Boosting capuchin search with stochastic learning strategy for feature selection
    Elaziz, Mohamed Abd
    Ouadfel, Salima
    Ibrahim, Rehab Ali
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (19) : 14061 - 14080
  • [50] Multiclass Intrusion Detection in IoT Using Boosting and Feature Selection
    Hamdouchi, Abderrahmane
    Idri, Ali
    GOOD PRACTICES AND NEW PERSPECTIVES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 3, WORLDCIST 2024, 2024, 987 : 128 - 137