Empowering graph segmentation methods with SOMs and CONN similarity for clustering large and complex data

Authors
Erzsébet Merényi
Joshua Taylor
Affiliations
[1] Rice University, Department of Statistics
[2] Rice University, Department of Electrical and Computer Engineering
Source
Neural Computing and Applications | 2020 / Vol. 32
Keywords
SOM clustering; Graph segmentation; CONN similarity; Big Data; Automation
DOI
Not available
Abstract
High-dimensional, large, and noisy data with complex structure challenge the limits of many clustering algorithms, including modern graph segmentation methods. SOM-based clustering has been shown to be capable of capturing many clusters of widely varying statistical properties in such data. However, to date, the best discovery results are produced by interactive extraction of clusters from informative SOM visualizations. This does not scale for Big Data, large archives, or near-real-time analyses. We approach this challenge by infusing SOM knowledge into leading automatic graph segmentation algorithms, which produce extremely poor results when segmenting the SOM prototypes without this information, and which would take a prohibitively long time to segment the input data sets. The knowledge translation occurs by casting the SOM prototypes as vertices and the CONN similarity measure as edge weightings of a graph, which is then presented to graph segmentation algorithms. The resulting performance closely approximates the precision of the interactive SOM segmentation for complicated data and, at the same time, is extremely fast and memory-efficient. We demonstrate the effectiveness on a simple synthetic data set and on a very realistic, fully labeled synthetic hyperspectral image. We also examine how performance depends on the available parametrizations of the graph segmentation algorithms, in combination with parametrizations of the CONN similarity measure.
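The graph construction described in the abstract (prototypes as vertices, CONN similarity as edge weights) can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from this record: CONN is computed as it is commonly defined in the SOM literature (for each pair of prototypes, the number of data points for which the pair are the closest and second-closest prototypes, in either order); the prototypes are fixed by hand instead of being learned by a SOM; and the segmentation step is a plain connected-components pass standing in for the graph segmentation algorithms the paper evaluates. The names `conn_matrix`, `segment_by_components`, and `min_weight` are hypothetical.

```python
import numpy as np


def conn_matrix(data, prototypes):
    """Assumed CONN definition: conn[i, j] counts the data points whose two
    closest prototypes are i and j, in either order."""
    # Squared Euclidean distance from every data point to every prototype.
    d2 = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    bmu1, bmu2 = order[:, 0], order[:, 1]  # closest and second-closest prototype
    n = len(prototypes)
    cadj = np.zeros((n, n), dtype=int)
    np.add.at(cadj, (bmu1, bmu2), 1)  # unbuffered add, so repeated pairs accumulate
    return cadj + cadj.T  # symmetrize


def segment_by_components(conn, min_weight=1):
    """Stand-in segmentation (not the paper's algorithms): label the connected
    components of the graph that keeps edges with weight >= min_weight."""
    n = conn.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            v = stack.pop()
            for u in np.flatnonzero(conn[v] >= min_weight):
                if labels[u] == -1:
                    labels[u] = current
                    stack.append(int(u))
        current += 1
    return labels


# Toy 1-D data: two well-separated groups, two hand-placed prototypes per group.
data = np.array([[0.1], [0.4], [0.6], [0.9], [10.1], [10.4], [10.6], [10.9]])
prototypes = np.array([[0.0], [1.0], [10.0], [11.0]])

conn = conn_matrix(data, prototypes)
labels = segment_by_components(conn)
print(conn)
print(labels)
```

In this toy case the only nonzero CONN weights join the two prototypes within each group, so the stand-in segmentation places prototypes 0 and 1 in one cluster and prototypes 2 and 3 in another. Because the graph has one vertex per prototype instead of one per data point, the segmentation step operates on a much smaller problem than the raw data.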
Pages: 18161-18178
Page count: 17