A new feature selection method to improve the document clustering using particle swarm optimization algorithm

Cited by: 363
Authors
Abualigah, Laith Mohammad [1]
Khader, Ahamad Tajudin [1]
Hanandeh, Essam Said [2]
Affiliations
[1] Univ Sains Malaysia, Sch Comp Sci, George Town 11800, Malaysia
[2] Zarqa Univ, Dept Comp Informat Syst, POB 13132, Zarqa, Jordan
Keywords
Unsupervised feature selection; Informative features; Particle swarm optimization algorithm; K-mean text clustering algorithm; Dimension reduction
DOI
10.1016/j.jocs.2017.07.018
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The large volume of text on the Internet and in modern applications makes this information difficult to manage. Text clustering is an appropriate tool for handling an enormous number of text documents by grouping them into coherent clusters. The size of the documents decreases the effectiveness of text clustering. Moreover, text documents contain sparse and uninformative features (i.e., noisy, irrelevant, and unnecessary features), which further reduce its effectiveness. Feature selection is a primary unsupervised learning method used to select the informative text features and create a new subset of a document's features, which increases the effectiveness of the underlying clustering algorithm. Recently, several complex optimization problems have been solved successfully using metaheuristic algorithms. This paper proposes a novel feature selection method using the particle swarm optimization (PSO) algorithm (FSPSOTC), which solves the feature selection problem by creating a new subset of informative text features. This new subset can improve the performance of text clustering and reduce the computational time. Experiments were conducted on six standard text datasets with varied characteristics that are commonly used in the text clustering domain. The results showed that the proposed method (FSPSOTC) enhanced the effectiveness of text clustering by working with a new subset of informative features. The proposed method is compared with other well-known text feature selection algorithms, namely feature selection using a genetic algorithm to improve text clustering (FSGATC) and feature selection using the harmony search algorithm to improve text clustering (FSHSTC). © 2017 Elsevier B.V. All rights reserved.
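The idea in the abstract, a swarm of candidate feature subsets searching for an informative one, can be illustrated with a minimal binary PSO sketch. This is not the authors' FSPSOTC implementation. The abstract does not state the objective function or parameter settings, so the fitness used here (mean absolute deviation of the selected term weights), the sigmoid transfer rule, the parameter values, and all names (`fitness`, `binary_pso`, `select_features`, `DOCS`) are assumptions chosen for illustration.

```python
import math
import random

# Toy document-term weight matrix (rows: documents, columns: terms).
# Hypothetical data, for illustration only.
DOCS = [
    [0.9, 0.0, 0.1, 0.0, 0.5, 0.0],
    [0.8, 0.1, 0.0, 0.0, 0.4, 0.0],
    [0.0, 0.7, 0.0, 0.1, 0.0, 0.6],
    [0.1, 0.9, 0.0, 0.0, 0.0, 0.5],
]


def fitness(mask, docs):
    """Assumed objective: average over documents of the mean absolute
    deviation of the selected term weights. Higher is treated as better."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset carries no information
    total = 0.0
    for doc in docs:
        weights = [doc[j] for j in selected]
        mean = sum(weights) / len(weights)
        total += sum(abs(x - mean) for x in weights) / len(weights)
    return total / len(docs)


def binary_pso(docs, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the features."""
    rng = random.Random(seed)
    n_features = len(docs[0])
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p, docs) for p in positions]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * velocities[i][j]
                     + c1 * r1 * (pbest[i][j] - positions[i][j])
                     + c2 * r2 * (gbest[j] - positions[i][j]))
                v = max(-4.0, min(4.0, v))  # clamp the velocity
                velocities[i][j] = v
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(positions[i], docs)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit


def select_features(docs, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[x for x, bit in zip(doc, mask) if bit] for doc in docs]
```

Calling `binary_pso(DOCS)` returns a mask and its score, and `select_features(DOCS, mask)` yields the reduced matrix that would then be passed to a clustering algorithm such as k-means.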
Pages: 456 - 466
Number of pages: 11
Related papers
50 in total
  • [21] A new particle swarm feature selection method for classification
    Chen, Kun-Huang
    Chen, Li-Fei
    Su, Chao-Ton
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2014, 42 (03) : 507 - 530
  • [22] A Novel Feature Selection Algorithm using Particle Swarm Optimization for Cancer Microarray Data
    Sahu, Barnali
    Mishra, Debahuti
    INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 27 - 31
  • [23] The Optimization of Feature Selection Based on Chaos Clustering Strategy and Niche Particle Swarm Optimization
    Duan, Longzhen
    Yang, Shuqing
    Zhang, Dongbo
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020
  • [24] New clustering method based on particle swarm algorithm
    School of Electronics and Information, Jiangsu University of Science and Technology, Zhenjiang 212003, China
    Unknown
    Unknown
    Nanjing Hangkong Hangtian Daxue Xuebao, 2006, SUPPL. : 62 - 65
  • [25] A New Binary Particle Swarm Optimisation Algorithm for Feature Selection
    Xue, Bing
    Nguyen, Su
    Zhang, Mengjie
    APPLICATIONS OF EVOLUTIONARY COMPUTATION, 2014, 8602 : 501 - 513
  • [26] Feature Weighting for Clustering by Particle Swarm Optimization
    Swetha, K. P.
    Devi, V. Susheela
    2012 SIXTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING (ICGEC), 2012 : 441 - 444
  • [27] An hybrid particle swarm optimization with crow search algorithm for feature selection
    Adamu, Abdulhameed
    Abdullahi, Mohammed
    Junaidu, Sahalu Balarabe
    Hassan, Ibrahim Hayatu
    MACHINE LEARNING WITH APPLICATIONS, 2021, 6
  • [28] Hybrid particle swarm optimization algorithm for text feature selection problems
    Nachaoui, Mourad
    Lakouam, Issam
    Hafidi, Imad
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (13) : 7471 - 7489
  • [29] Binary Particle Swarm Optimization based Algorithm for Feature Subset Selection
    Chakraborty, Basabi
    ICAPR 2009: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION, PROCEEDINGS, 2009 : 145 - 148
  • [30] Feature selection algorithm based on bare bones particle swarm optimization
    Zhang, Yong
    Gong, Dunwei
    Hu, Ying
    Zhang, Wanqiu
    NEUROCOMPUTING, 2015, 148 : 150 - 157