Mutual Information-based Feature Selection Approach to Reduce High Dimension of Big Data

Cited by: 2
Authors
Win, Thee Zin [1 ]
Kham, Nang Saing Moon [2 ]
Affiliations
[1] Univ Comp Studies, Informat Sci Dept, Yangon, Myanmar
[2] Univ Comp Studies, Fac Informat, Sci Dept, Yangon, Myanmar
Keywords
Feature Selection; High Dimensional Data; Redundant Features; Mutual Information
DOI
10.1145/3278312.3278316
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
As the massive and growing volume of data demands effective and efficient mining strategies, practitioners and researchers are working to develop scalable data mining and machine learning algorithms and strategies that can turn mountains of data into nuggets. High-dimensional data significantly increases the memory and computational costs of data analytics. Reducing dimensionality can therefore improve three main aspects of data mining performance: learning speed, predictive accuracy, and the simplicity and comprehensibility of the mined results. Feature selection, a data preprocessing technique, is effective and efficient for data mining, data analytics, and machine learning problems, particularly for reducing high dimensionality. Most feature selection algorithms eliminate only irrelevant features and leave redundant features in place, yet both irrelevant and redundant features can degrade learning performance. This work proposes a mutual information-based feature selection approach that removes both irrelevant and redundant features.
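The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes discrete features, estimates mutual information from empirical counts, treats a feature as irrelevant when its mutual information with the target falls below a hypothetical `min_relevance` threshold, and treats it as redundant when it shares at least as much information with an already selected feature as it does with the target. The function names, the threshold, and the toy data are all illustrative assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, target, min_relevance=0.01):
    """Greedy filter: keep relevant features, skip those redundant with kept ones.

    columns: dict mapping a feature name to a list of discrete values.
    min_relevance: hypothetical cutoff, chosen here only for illustration.
    """
    relevance = {name: mutual_information(col, target)
                 for name, col in columns.items()}
    # Visit features from most to least relevant
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    selected = []
    for name in ranked:
        if relevance[name] < min_relevance:
            continue  # irrelevant: carries almost no information about the target
        redundant = any(
            mutual_information(columns[name], columns[kept]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data (made up for this sketch)
target = [0, 0, 1, 1, 0, 0, 1, 1]
columns = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],       # identical to the target
    "a_copy": [1, 1, 0, 0, 1, 1, 0, 0],  # negation of "a", so redundant with it
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],   # independent of the target
}
selected = select_features(columns, target)
print(selected)
```

In this toy case only `a` survives: `noise` is dropped as irrelevant and `a_copy` as redundant. Continuous features would need discretization or a different estimator before a count-based approach like this applies.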
Pages: 3-7
Number of pages: 5