A feature selection algorithm based on Hoeffding inequality and mutual information

Cited by: 0
Authors
Yin, Chunyong [1]
Feng, Lu [1]
Ma, Luyu [1]
Yin, Zhichao [2]
Wang, Jin [1]
Affiliations
[1] School of Computer and Software, Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing, China
[2] Nanjing No.1 Middle School, Nanjing, China
Keywords
Classification (of information); Data mining
DOI
10.14257/ijsip.2015.8.11.39
Abstract
With the rapid development of the Internet, data mining is being applied to Internet data ever more widely. However, the complex and redundant features of the data sources make the data mining process inefficient and complicated, so feature selection research is essential for making data mining simpler and more efficient. In this paper, we propose a new way to measure the degree of correlation among the internal features of a dataset, which is a variant of mutual information. We also introduce the Hoeffding inequality as a constraint in constructing the algorithm. In the experiments, we use the C4.5 classification algorithm as the test algorithm and compare HSF with BIF (a feature selection algorithm based on mutual information). The experimental results show that HSF performs better than BIF [1] in TP and FP rate; moreover, the feature subset obtained by HSF significantly improves the TP, FP, and memory usage of the C4.5 classification algorithm. © 2015 SERSC.
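The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HSF algorithm, which the abstract does not specify. It only shows the standard empirical mutual information between two discrete features and the standard one-sided Hoeffding bound. The function names `mutual_information` and `hoeffding_bound` and the parameters `value_range` and `delta` are assumptions introduced here for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding epsilon.

    With probability at least 1 - delta, the mean of n independent
    observations whose values span `value_range` does not exceed its
    expectation by more than the returned epsilon.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    feature = [0, 0, 1, 1]
    label = [0, 0, 1, 1]
    print(mutual_information(feature, label))
    print(hoeffding_bound(1.0, 0.05, 1000))
```

Because the bound shrinks in proportion to 1/sqrt(n), it is often used to decide when enough samples have been seen to trust a comparison between two estimated scores. Whether HSF applies it in exactly this way is not stated in the abstract.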
Pages: 433-444
Related papers
50 in total
  • [41] Improved Feature Selection Based On Normalized Mutual Information
    Li Yin
    Ma Xingfei
    Yang Mengxi
    Zhao Wei
    Gu Wenqiang
    14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015), 2015, : 518 - 522
  • [42] Feature selection based on mutual information with correlation coefficient
    Zhou, Hongfang
    Wang, Xiqian
    Zhu, Rourou
    APPLIED INTELLIGENCE, 2022, 52 (05) : 5457 - 5474
  • [43] Simultaneous feature selection and discretization based on mutual information
    Sharmin, Sadia
    Shoyaib, Mohammad
    Ali, Amin Ahsan
    Khan, Muhammad Asif Hossain
    Chae, Oksam
    PATTERN RECOGNITION, 2019, 91 : 162 - 174
  • [44] A SURVEY FOR STUDY OF FEATURE SELECTION BASED ON MUTUAL INFORMATION
    Su, Xiangchenyang
    Liu, Fang
    2018 9TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2018,
  • [45] A filter approach to feature selection based on mutual information
    Huang, Jinjie
    Cai, Yunze
    Xu, Xiaoming
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 84 - 89
  • [46] An Improved Feature Selection Algorithm with Conditional Mutual Information for Classification Problems
    Palanichamy, Jaganathan
    Ramasamy, Kuppuchamy
    2013 INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTIONS (ICHCI), 2013,
  • [47] Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selection based on Mutual Information
    Alzubaidi, Abeer
    Cosma, Georgina
    Brown, David
    Pockley, A. Graham
    2016 9TH INTERNATIONAL CONFERENCE ON INTERACTIVE TECHNOLOGIES AND GAMES (ITAG), 2016, : 70 - 76
  • [49] Conditional mutual information-based feature selection algorithm for maximal relevance minimal redundancy
    Gu, Xiangyuan
    Guo, Jichang
    Xiao, Lijun
    Li, Chongyi
    APPLIED INTELLIGENCE, 2022, 52 (02) : 1436 - 1447
  • [50] Genetic Algorithm for the Mutual Information-Based Feature Selection in Univariate Time Series Data
    Siddiqi, Umair F.
    Sait, Sadiq M.
    Kaynak, Okyay
    IEEE ACCESS, 2020, 8 (08): : 9597 - 9609