Mutual Information-based Feature Selection Approach to Reduce High Dimension of Big Data

Cited by: 2
Authors
Win, Thee Zin [1]
Kham, Nang Saing Moon [2]
Affiliations
[1] Univ Comp Studies, Informat Sci Dept, Yangon, Myanmar
[2] Univ Comp Studies, Fac Informat, Sci Dept, Yangon, Myanmar
Keywords
Feature Selection; High Dimensional Data; Redundant Features; Mutual Information;
DOI
10.1145/3278312.3278316
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the rapidly growing volume of data demands effective and efficient mining strategies, practitioners and researchers are working to develop scalable data mining and machine learning algorithms and strategies that can turn mountains of data into nuggets. High-dimensional data significantly increases the memory, storage, and computational costs of data analytics. Reducing dimensionality can therefore improve three main aspects of data mining performance: learning speed, predictive accuracy, and the simplicity and comprehensibility of the mined results. Feature selection, a data preprocessing technique, is effective and efficient in data mining, data analytics, and machine learning problems, particularly for reducing high dimensionality. Most feature selection algorithms eliminate only irrelevant features, not redundant ones, yet both irrelevant and redundant features can degrade learning performance. This work proposes a feature selection method based on mutual information that removes both irrelevant and redundant features.
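The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes discrete features, treats a feature as irrelevant when its mutual information with the class label falls below a threshold, and treats it as redundant when its mutual information with an already selected feature, normalized by the smaller of the two entropies, is close to 1. The function names, the thresholds `min_relevance` and `max_redundancy`, and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from two equal-length sequences of discrete values."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over the observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def select_features(features, target, min_relevance=0.01, max_redundancy=0.9):
    """Greedy filter (illustrative assumption, not the paper's exact method).

    features: dict mapping feature name -> list of discrete values.
    target:   list of class labels, same length as each feature column.
    """
    # Step 1: relevance. Drop features that share almost no information with the label.
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    ranked = sorted(
        (name for name in features if relevance[name] >= min_relevance),
        key=lambda name: relevance[name],
        reverse=True,  # most relevant first; ties keep their original order
    )

    # Step 2: redundancy. Skip a feature that nearly duplicates one already kept.
    selected = []
    for name in ranked:
        col = features[name]
        h_col = mutual_information(col, col)  # entropy, since H(X) = I(X; X)
        redundant = False
        for kept in selected:
            kept_col = features[kept]
            h_min = min(h_col, mutual_information(kept_col, kept_col))
            if h_min > 0 and mutual_information(col, kept_col) / h_min >= max_redundancy:
                redundant = True
                break
        if not redundant:
            selected.append(name)
    return selected


# Toy data (hypothetical): "a" matches the label, "b" is "a" with values flipped
# (redundant), "c" is independent of the label (irrelevant), "d" is weakly relevant.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [1, 1, 1, 1, 0, 0, 0, 0],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
    "d": [0, 0, 0, 1, 0, 1, 1, 1],
}
print(select_features(features, target))  # prints ['a', 'd']
```

In this toy run, "c" is dropped in the relevance step and "b" in the redundancy step, which mirrors the two kinds of features the abstract says should be removed. Continuous features would need discretization or a different estimator before this sketch applies.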
Pages: 3-7
Page count: 5