Graph clustering-based discretization of splitting and merging methods (GraphS and GraphM)

Cited by: 20
Authors
Sriwanna, Kittakorn [1]
Boongoen, Tossapon [1]
Iam-On, Natthakan [1]
Affiliations
[1] Mae Fah Luang Univ, Sch Informat Technol, Phahon Yothin Rd, Muang 57100, Chiang Rai, Thailand
Keywords
Multivariate discretization; Graph clustering; Normalized cuts; Normalized association; Data mining; ALGORITHM; TESTS;
DOI
10.1186/s13673-017-0103-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Discretization plays a major role as a data preprocessing technique in machine learning and data mining. Recent studies have focused on multivariate discretization, which considers the relations among attributes. The general goal of such methods is to obtain discrete data that preserve most of the semantics exhibited by the original continuous data. However, many techniques generate discrete data that may be less useful because the natural groups in the data are not maintained. This paper presents a novel graph clustering-based discretization algorithm that encodes different similarity measures into a graph representation of the examined data. This allows more refined data-wise relations to be obtained and used with an effective graph clustering technique, based on normalized association, to discover the natural groups accurately. The quality of the approach is demonstrated empirically on 30 standard datasets and 20 imbalanced datasets, in comparison with 11 well-known discretization algorithms and using 4 classifiers. The results suggest that the new approach preserves the natural groups and generally achieves better classifier performance and a more desirable number of intervals than the compared methods.
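As a rough illustration of the normalized association criterion named in the abstract, the sketch below picks a single cut point for one continuous attribute. It is not the authors' GraphS or GraphM procedure: the Gaussian similarity, the `sigma` parameter, and the function names (`similarity_matrix`, `best_split`, and so on) are assumptions made only for this example. It relies on the standard definitions for a bipartition (A, B) of a weighted graph with node set V, under which Nassoc(A, B) + Ncut(A, B) = 2.

```python
import math


def similarity_matrix(values, sigma=1.0):
    """Gaussian similarity between every pair of values (an assumed choice)."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in values]
            for a in values]


def assoc(w, rows, cols):
    """Total edge weight from the nodes in `rows` to the nodes in `cols`."""
    return sum(w[i][j] for i in rows for j in cols)


def normalized_association(w, part_a, part_b):
    """Nassoc(A, B) = assoc(A, A)/assoc(A, V) + assoc(B, B)/assoc(B, V)."""
    everything = part_a + part_b
    return (assoc(w, part_a, part_a) / assoc(w, part_a, everything)
            + assoc(w, part_b, part_b) / assoc(w, part_b, everything))


def normalized_cut(w, part_a, part_b):
    """Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)."""
    everything = part_a + part_b
    cut = assoc(w, part_a, part_b)
    return (cut / assoc(w, part_a, everything)
            + cut / assoc(w, part_b, everything))


def best_split(values, sigma=1.0):
    """Return (cut_point, score) for the split of the sorted values
    that maximises normalized association, or None if no split exists."""
    ordered = sorted(values)
    w = similarity_matrix(ordered, sigma)
    best = None
    for k in range(1, len(ordered)):
        if ordered[k] == ordered[k - 1]:
            continue  # no boundary can be placed between equal values
        part_a = list(range(k))
        part_b = list(range(k, len(ordered)))
        score = normalized_association(w, part_a, part_b)
        if best is None or score > best[1]:
            midpoint = (ordered[k - 1] + ordered[k]) / 2
            best = (midpoint, score)
    return best


if __name__ == "__main__":
    print(best_split([1.0, 1.2, 0.9, 5.0, 5.3, 4.8]))
```

For the two well-separated groups in the usage line, the chosen cut point falls midway between 1.2 and 4.8. A splitting-style discretizer could apply such a step recursively to each resulting interval, although the stopping rule and the multivariate similarity measures used in the paper are not reproduced here.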
Pages: 39