Quasi-cluster centers clustering algorithm based on potential entropy and t-distributed stochastic neighbor embedding

Cited by: 5
Authors
Fang, Xian [1]
Tie, Zhixin [1]
Guan, Yinan [1]
Rao, Shanshan [1]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data clustering; Quasi-cluster centers clustering; Potential entropy; Optimal parameter; t-distributed stochastic neighbor embedding; DENSITY PEAKS; FAST SEARCH; FIND; REDUCTION; ROCK;
DOI
10.1007/s00500-018-3221-y
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A novel density-based clustering algorithm named QCC was presented recently. Although the algorithm has proved strongly robust, its two input parameters, the number of neighbors (k) and the similarity threshold value, must still be set manually, which severely limits adoption of the algorithm. In addition, QCC does not perform well on datasets of relatively high dimension. To overcome these defects, we first define a new method for computing local density and introduce the strategy of potential entropy into the original algorithm. Based on this idea, we propose a new QCC clustering algorithm (QCC-PE). QCC-PE automatically extracts the optimal value of the parameter k by optimizing the potential entropy of the data field. In this way, the optimized parameter is calculated objectively from the dataset rather than estimated empirically from a large number of experiments. Then, t-distributed stochastic neighbor embedding (tSNE) is applied to the QCC-PE model, yielding a tSNE-based method (QCC-PE-tSNE) that preprocesses high-dimensional datasets by dimensionality reduction. We compare the proposed algorithms with QCC, DBSCAN, and DP on synthetic datasets, the Olivetti Face Database, and real-world datasets. Experimental results show that our algorithms are feasible and effective and often outperform the compared methods.
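The idea of choosing k by optimizing an entropy over local densities can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the density formula or the optimization direction, so the k-nearest-neighbor density (`knn_density`), the choice to minimize the entropy, and all function names here are assumptions made for illustration only.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def knn_density(dist, k):
    """Hypothetical local density (an assumption, not the paper's formula):
    sum of exp(-d) over each point's k nearest neighbors."""
    densities = []
    for i, row in enumerate(dist):
        nearest = sorted(d for j, d in enumerate(row) if j != i)[:k]
        densities.append(sum(math.exp(-d) for d in nearest))
    return densities


def potential_entropy(values):
    """Shannon entropy of the values after normalizing them to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values]
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_k(points, candidates):
    """Return the candidate k with the lowest entropy, plus all scores.
    Minimizing (rather than maximizing) is an assumption of this sketch."""
    dist = pairwise_distances(points)
    scores = {k: potential_entropy(knn_density(dist, k)) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Two small, well-separated groups of points as toy data.
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    best_k, entropy_by_k = select_k(pts, [1, 2, 3])
    print(best_k, entropy_by_k)
```

The sketch scans a small candidate set exhaustively, which keeps the selection free of hand-tuned values in the spirit of the abstract. A dimensionality reduction step such as tSNE would be applied to the points before calling `select_k`, and is omitted here.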
Pages: 5645-5657
Number of pages: 13
Related papers
50 in total
  • [21] Fault diagnosis of industrial process based on the optimal parametric t-distributed stochastic neighbor embedding
    Ruixue Jia
    Jing Wang
    Jinglin Zhou
    Science China Information Sciences, 2021, 64
  • [22] Fault diagnosis of industrial process based on the optimal parametric t-distributed stochastic neighbor embedding
    Jia, Ruixue
    Wang, Jing
    Zhou, Jinglin
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (05)
  • [23] Process Data Visualization Using Bikernel t-Distributed Stochastic Neighbor Embedding
    Zhang, Haili
    Wang, Pu
    Gao, Xuejin
    Qi, Yongsheng
    Gao, Huihui
    INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2020, 59 (44) : 19623 - 19632
  • [24] Fault diagnosis of industrial process based on the optimal parametric t-distributed stochastic neighbor embedding
    Ruixue JIA
    Jing WANG
    Jinglin ZHOU
    Science China (Information Sciences), 2021, 64 (05) : 233 - 235
  • [25] t-Distributed Stochastic Neighbor Embedding Method with the Least Information Loss for Macromolecular Simulations
    Zhou, Hongyu
    Wang, Feng
    Tao, Peng
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2018, 14 (11) : 5499 - 5510
  • [26] Chatter Detection Approach Based on Wavelet Synchrosqueezing and t-Distributed Stochastic Neighbor Embedding for a Turning Process
    Kuo, Ping-Huan
    Lin, Po-Lun
    Yau, Her-Terng
    IEEE SENSORS JOURNAL, 2024, 24 (07) : 9660 - 9670
  • [27] Monitoring of papermaking wastewater treatment processes using t-distributed stochastic neighbor embedding
    Ma, Xiaobo
    Zhang, Yuchen
    Zhang, Fengshan
    Liu, Hongbin
    JOURNAL OF ENVIRONMENTAL CHEMICAL ENGINEERING, 2021, 9 (06):
  • [28] Persistent-Homology-Based Microstructural Optimization of Materials Using t-Distributed Stochastic Neighbor Embedding
    Wang, Zhi-Lei
    Ogawa, Toshio
    Adachi, Yoshitaka
    ADVANCED THEORY AND SIMULATIONS, 2020, 3 (07)
  • [29] Establishment and Application of Steel Composition Prediction Model Based on t-Distributed Stochastic Neighbor Embedding (t-SNE) Dimensionality Reduction Algorithm
    Liu, Xin
    Bao, Yanping
    Zhao, Lihua
    Gu, Chao
    JOURNAL OF SUSTAINABLE METALLURGY, 2024, 10 (02) : 509 - 524
  • [30] Industrial process data visualization based on a deep enhanced t-distributed stochastic neighbor embedding neural network
    Lu, Weipeng
    Yan, Xuefeng
    ASSEMBLY AUTOMATION, 2022, 42 (02) : 268 - 277