Graph clustering-based discretization of splitting and merging methods (GraphS and GraphM)

Cited by: 20
Authors
Sriwanna, Kittakorn [1]
Boongoen, Tossapon [1]
Iam-On, Natthakan [1]
Affiliations
[1] Mae Fah Luang Univ, Sch Informat Technol, Phahon Yothin Rd, Muang 57100, Chiang Rai, Thailand
Keywords
Multivariate discretization; Graph clustering; Normalized cuts; Normalized association; Data mining; ALGORITHM; TESTS;
DOI
10.1186/s13673-017-0103-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Discretization is a key data preprocessing technique in machine learning and data mining. Recent studies have focused on multivariate discretization, which takes relations among attributes into account. The general goal is to obtain discrete data that preserves most of the semantics of the original continuous data. However, many techniques produce discrete data that is less useful because the natural groups in the data are not maintained. This paper presents a novel graph clustering-based discretization algorithm that encodes different similarity measures into a graph representation of the examined data. This allows more refined data-wise relations to be obtained and used with an effective graph clustering technique, based on normalized association, to discover natural groups accurately. The approach is evaluated empirically on 30 standard datasets and 20 imbalanced datasets, against 11 well-known discretization algorithms using 4 classifiers. The results suggest that the new approach preserves the natural groups and usually compares favorably with the other methods in terms of classifier performance and the number of intervals produced.
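As a rough illustration of the normalized cut and normalized association criteria named in the abstract and keywords, the sketch below scores every candidate cut point of a single numeric attribute and keeps the one with the lowest normalized cut. It is not the GraphS or GraphM algorithm from the paper: the Gaussian similarity, the `sigma` parameter, the univariate setting, and all function names are assumptions made for this example only.

```python
import math


def gaussian_similarity(values, sigma=1.0):
    """Dense similarity matrix with W[i][j] = exp(-(x_i - x_j)^2 / (2 * sigma^2)).

    The Gaussian kernel is an assumption of this sketch, not the paper's measure.
    """
    n = len(values)
    return [
        [math.exp(-((values[i] - values[j]) ** 2) / (2 * sigma ** 2)) for j in range(n)]
        for i in range(n)
    ]


def normalized_cut(W, left, right):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    nodes = left + right
    cut = sum(W[i][j] for i in left for j in right)
    assoc_left = sum(W[i][j] for i in left for j in nodes)
    assoc_right = sum(W[i][j] for i in right for j in nodes)
    return cut / assoc_left + cut / assoc_right


def normalized_association(W, left, right):
    """Nassoc(A, B) = assoc(A, A) / assoc(A, V) + assoc(B, B) / assoc(B, V)."""
    nodes = left + right
    within_left = sum(W[i][j] for i in left for j in left)
    within_right = sum(W[i][j] for i in right for j in right)
    assoc_left = sum(W[i][j] for i in left for j in nodes)
    assoc_right = sum(W[i][j] for i in right for j in nodes)
    return within_left / assoc_left + within_right / assoc_right


def best_cut_point(values, sigma=1.0):
    """Return (cut_point, ncut) for the two-interval split with the lowest Ncut."""
    xs = sorted(values)
    W = gaussian_similarity(xs, sigma)
    best = None
    for k in range(1, len(xs)):
        if xs[k] == xs[k - 1]:
            continue  # equal values cannot be separated by a cut point
        left, right = list(range(k)), list(range(k, len(xs)))
        score = normalized_cut(W, left, right)
        midpoint = (xs[k - 1] + xs[k]) / 2
        if best is None or score < best[1]:
            best = (midpoint, score)
    return best


if __name__ == "__main__":
    data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]
    print(best_cut_point(data))
```

For a symmetric similarity matrix the two criteria are tied by Nassoc = 2 - Ncut, so minimizing the normalized cut and maximizing the normalized association select the same split. A splitting-style discretizer could apply such a search recursively to each interval, but the stopping rule and the multivariate similarity used in the paper are not given in this record.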
Pages: 39