A Clustering-Based Approach to Reduce Feature Redundancy

Cited by: 1

Authors:

de Amorim, Renato Cordeiro [1]

Mirkin, Boris [2]

Affiliations:

[1] Univ Hertfordshire, Sch Comp Sci, Coll Lane Campus, Hatfield AL10 9AB, Herts, England

[2] Birkbeck Univ London, Dept Comp Sci & Informat Syst, Malet St, London WC1E 7HX, England

Source:

KNOWLEDGE, INFORMATION AND CREATIVITY SUPPORT SYSTEMS: RECENT TRENDS, ADVANCES AND SOLUTIONS, KICSS 2013 | 2016 / Vol. 364

Keywords:

Unsupervised feature selection; Feature weighting; Redundant features; Clustering; Mental task separation; FEATURE-SELECTION; VARIABLES;

DOI:

10.1007/978-3-319-19090-7_35

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate the weight of each feature in a data set, representing its degree of relevance. However, since most of them evaluate one feature at a time, they may have difficulty clustering data sets that contain features with similar information. If a group of features contains the same relevant information, these clustering algorithms assign a high weight to each feature in the group instead of removing some of them as redundant. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. The method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation of our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.
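The pipeline described in the abstract (cluster the features, then keep representatives chosen by the maximum information compression index against the cluster centroid) can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions that are not in the abstract: the index is computed as the smallest eigenvalue of the 2x2 covariance matrix of two variables, plain k-means with a user-supplied `k` stands in for the clustering step (the paper states its method avoids an extra user-defined parameter, so this is a simplification), exactly one feature is kept per cluster, and the feature with the lowest index is treated as the representative. The names `mici`, `kmeans_features`, and `select_features` are hypothetical.

```python
import numpy as np


def mici(x, y):
    """Assumed definition: smallest eigenvalue of the 2x2 covariance
    matrix of x and y. It is zero when the two are linearly dependent."""
    cov = np.cov(x, y)
    return float(np.linalg.eigvalsh(cov)[0])  # eigenvalues come back ascending


def kmeans_features(F, k, n_iter=100, seed=0):
    """Plain k-means over the rows of F, where each row is one feature.
    A stand-in for whatever clustering step the paper actually uses."""
    rng = np.random.default_rng(seed)
    centroids = F[rng.choice(len(F), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the final centroids.
    dists = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids


def select_features(X, k, seed=0):
    """Return the column indices of one representative feature per cluster.
    X has shape (n_samples, n_features); columns are assumed non-constant."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature
    F = Z.T                                   # one row per feature
    labels, centroids = kmeans_features(F, k, seed=seed)
    selected = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue
        # Assumption: the most representative feature has the lowest index.
        scores = [mici(F[j], centroids[c]) for j in members]
        selected.append(int(members[int(np.argmin(scores))]))
    return sorted(selected)
```

Calling `select_features(X, k=2)` on a matrix whose columns fall into two groups of near-duplicate features would be expected to return one column index from each group, so the redundant copies are dropped before any downstream clustering.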

Pages: 465-475

Page count: 11
