An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem

Cited by: 18
Authors
Bian, Jing [1 ,2 ]
Peng, Xin-guang [1 ]
Wang, Ying [1 ]
Zhang, Hai [3 ]
Affiliations
[1] Taiyuan Univ Technol, Coll Comp Sci & Technol, Yingze St 79, Taiyuan 030024, Peoples R China
[2] Shanxi Med Coll Continuing Educ, Ctr Informat & Network, Shuangtasi St 22, Taiyuan 030012, Peoples R China
[3] Shanxi Branch Agr Bank China, Technol & Prod Management, Nanneihuan St 33, Taiyuan 030024, Peoples R China
Funding
National Science Foundation (USA);
Keywords
OPTIMIZATION; ACQUISITION; DEFECT; SMOTE;
DOI
10.1155/2016/8752181
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
In the era of big data, feature selection is an essential process in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has been undertaken to develop feature selection techniques. In addition, most applications involving feature selection focus on classification accuracy but not cost, although costs are important. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm that adds the cost-based evaluation function of a filter feature selection using a chaos genetic algorithm, referred to as CSFSG. The evaluation function considers both feature-acquiring costs (test costs) and misclassification costs in the field of network security, thereby weakening the influence of many instances from the majority of classes in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off both factors. The behavior of the CSFSG algorithm is tested on a large-scale dataset of network security, using two kinds of classifiers: C4.5 and k-nearest neighbor (KNN). The results of the experimental research show that the approach is efficient and able to effectively improve classification accuracy and to decrease classification time. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
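The idea summarized in the abstract, a genetic search over feature subsets whose fitness adds feature-acquiring (test) costs to misclassification costs, can be illustrated with a minimal Python sketch. This is not the authors' CSFSG implementation. The logistic map as the chaos source, the function names, the `weight` parameter, and the toy cost values are all assumptions made for illustration, and the misclassification cost is supplied as a hypothetical callable rather than computed from a classifier.

```python
from typing import Callable, List, Sequence, Tuple

Mask = List[int]  # 1 = feature selected, 0 = feature dropped


def logistic_map(x0: float, n: int, mu: float = 4.0) -> List[float]:
    """Return n successive values of the logistic map x <- mu * x * (1 - x)."""
    values, x = [], x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(pop_size: int, n_features: int, x0: float = 0.37) -> List[Mask]:
    """Build an initial population by thresholding a chaotic sequence at 0.5."""
    seq = logistic_map(x0, pop_size * n_features)
    return [[1 if seq[i * n_features + j] > 0.5 else 0 for j in range(n_features)]
            for i in range(pop_size)]


def total_cost(mask: Sequence[int], test_costs: Sequence[float],
               misclassification_cost: Callable[[Sequence[int]], float],
               weight: float = 1.0) -> float:
    """Test cost of the selected features plus weighted misclassification cost."""
    acquisition = sum(c for bit, c in zip(mask, test_costs) if bit)
    return acquisition + weight * misclassification_cost(mask)


def select_features(test_costs: Sequence[float],
                    misclassification_cost: Callable[[Sequence[int]], float],
                    pop_size: int = 12, generations: int = 25,
                    x0: float = 0.37) -> Tuple[Mask, float]:
    """Elitist GA whose crossover and mutation points come from a chaotic stream.

    Assumes pop_size >= 2 so that at least one parent survives each generation.
    """
    n = len(test_costs)
    population = chaotic_population(pop_size, n, x0)
    # A second chaotic sequence (arbitrary seed 0.61) drives the genetic operators.
    stream = iter(logistic_map(0.61, generations * pop_size * 2))

    def cost(mask: Mask) -> float:
        return total_cost(mask, test_costs, misclassification_cost)

    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]  # the cheaper half survives unchanged
        children = []
        for i in range(pop_size - len(parents)):
            a = parents[i % len(parents)]
            b = parents[(i + 1) % len(parents)]
            cut = min(int(next(stream) * n), n - 1)   # one-point crossover
            child = a[:cut] + b[cut:]
            flip = min(int(next(stream) * n), n - 1)  # single-bit mutation
            child[flip] ^= 1
            children.append(child)
        population = parents + children

    best = min(population, key=cost)
    return best, cost(best)


# Toy usage: features 0 and 2 are cheap and informative (hypothetical numbers).
toy_test_costs = [1.0, 5.0, 2.0, 8.0]


def toy_misclassification_cost(mask: Sequence[int]) -> float:
    return 10.0 * (1 - mask[0]) + 10.0 * (1 - mask[2])


best_mask, best_cost = select_features(toy_test_costs, toy_misclassification_cost)
print(best_mask, best_cost)
```

Because the cheaper half of each generation is carried over unchanged, the best total cost never gets worse from one generation to the next. In a realistic setting the misclassification term would come from evaluating a classifier such as C4.5 or KNN on the selected features, which this sketch deliberately leaves out.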
Pages: 9
Related papers
50 in total
  • [1] Cost-Sensitive Feature Selection for Class Imbalance Problem
    Bach, Malgorzata
    Werner, Aleksandra
    INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, PT I, 2018, 655 : 182 - 194
  • [2] Cost-sensitive feature reduction applied to a hybrid genetic algorithm
    Lavrač, N.
    Gamberger, D.
    Turney, P.
    Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), 1160
  • [3] Cost-sensitive max-margin feature selection for SVM using alternated sorting method genetic algorithm
    Aram, Khalid Y.
    Lam, Sarah S.
    Khasawneh, Mohammad T.
    KNOWLEDGE-BASED SYSTEMS, 2023, 267
  • [4] Training cost-sensitive neural networks with methods addressing the class imbalance problem
    Zhou, ZH
    Liu, XY
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (01) : 63 - 77
  • [5] A Cost-Sensitive Sparse Representation Based Classification for Class-Imbalance Problem
    Liu, Zhenbing
    Gao, Chunyang
    Yang, Huihua
    He, Qijia
    SCIENTIFIC PROGRAMMING, 2016, 2016
  • [6] Cost-Sensitive Feature Selection on Heterogeneous Data
    Qian, Wenbin
    Shu, Wenhao
    Yang, Jun
    Wang, Yinglong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 397 - 408
  • [7] Cost-Sensitive Spam Detection Using Parameters Optimization and Feature Selection
    Lee, Sang Min
    Kim, Dong Seong
    Park, Jong Sou
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2011, 17 (06) : 944 - 960
  • [8] MULTI-LABEL COST-SENSITIVE FEATURE SELECTION ALGORITHM IN INCOMPLETE DATA
    Huang, Qin
    Qian, Wenbin
    Shu, Wenhao
    Wu, Binglong
    Feng, Shuangshuang
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2018, : 56 - 62
  • [9] The influence of class imbalance on cost-sensitive learning: An empirical study
    Liu, Xu-Ying
    Zhou, Zhi-Hua
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 970 - +
  • [10] Ensemble of Cost-Sensitive Hypernetworks for Class-Imbalance Learning
    Wang, Jin
    Huang, Ping-li
    Sun, Kai-wei
    Cao, Bao-lin
    Zhao, Rui
    2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013, : 1883 - 1888