An Evolutionary Algorithm for Feature Selective Double Clustering of Text Documents

Cited by: 0
Authors
Nourashrafeddin, S. N. [1 ]
Milios, Evangelos [1 ]
Arnold, Dirk V. [1 ]
Affiliation
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 4R2, Canada
Keywords
Genetic algorithm; co-clustering; multiobjective optimization; text clustering; INFORMATION;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202
Abstract
We propose FSDC, an evolutionary algorithm for Feature Selective Double Clustering of text documents. We first cluster the terms that occur in the document corpus. The term clusters are then fed into multiobjective genetic algorithms, which prune non-informative terms and form sets of keyterms representing topics. Based on the topic keyterms found, representative documents for each topic are extracted. These documents are then used as seeds to cluster all documents in the dataset. FSDC is compared to several well-known co-clustering algorithms on real text datasets. The experimental results show that our algorithm can outperform the competitors.
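The last two steps of the abstract (extracting representative documents from topic keyterms, then using them as seeds to cluster every document) can be illustrated with a minimal sketch. This is not the authors' FSDC implementation: the term clustering and the multiobjective genetic algorithms are not reproduced, and the keyterm sets below are hard-coded stand-ins for their output. The function names, the keyterm-count scoring, the cosine-similarity assignment, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pick_seeds(doc_vectors, keyterms, per_topic=1):
    """Indices of the documents that mention a topic's keyterms most often (assumed scoring)."""
    scores = [sum(vec[t] for t in keyterms) for vec in doc_vectors]
    ranked = sorted(range(len(doc_vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:per_topic]


def cluster_from_keyterms(docs, topic_keyterms, per_topic=1):
    """Build one centroid per topic from its seed documents, then assign every document."""
    vectors = [vectorize(d) for d in docs]
    centroids = []
    for keyterms in topic_keyterms:
        centroid = Counter()
        for i in pick_seeds(vectors, keyterms, per_topic):
            centroid.update(vectors[i])
        centroids.append(centroid)
    # Each document goes to the topic whose seed centroid it is most similar to.
    return [max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))
            for vec in vectors]


# Hypothetical toy corpus and keyterm sets (stand-ins for the genetic algorithms' output).
docs = [
    "genetic algorithm mutation crossover population",
    "crossover and mutation operators in a genetic search",
    "stock market prices and bond yields",
    "investors watch market yields and prices",
]
topic_keyterms = [{"genetic", "mutation", "crossover"}, {"market", "prices", "yields"}]
labels = cluster_from_keyterms(docs, topic_keyterms)
print(labels)  # one topic index per document
```

In this sketch the seed documents only fix the initial centroids; an iterative refinement step (for example, recomputing centroids after assignment) could follow, but whether FSDC does so is not stated in the abstract.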
Pages: 446-453
Number of pages: 8