Scalable spectral clustering with cosine similarity

Cited by: 0
Authors
Chen, Guangliang [1]
Affiliations
[1] San Jose State Univ, Dept Math & Stat, San Jose, CA 95192 USA
Keywords
DATA SETS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose a unified scalable computing framework for three versions of spectral clustering - Normalized Cut (Shi and Malik, 2000), the Ng-Jordan-Weiss (NJW) algorithm (2001), and Diffusion Maps (Coifman and Lafon, 2006), in the setting of cosine similarity. We assume that the input data is either sparse (e.g., as a document-term frequency matrix) or of only a few hundred dimensions (e.g., for small images or data obtained through PCA). We show that in such cases, spectral clustering can be implemented solely based on efficient operations on the data matrix such as elementwise manipulation, matrix-vector multiplication and low-rank SVD, thus entirely avoiding the weight matrix. Our algorithm is simple to implement, fast to run, accurate and robust to outliers. We demonstrate its superior performance through extensive experiments which compare our scalable algorithm with the plain implementation on several benchmark data sets.
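The idea of avoiding the weight matrix can be illustrated with a minimal sketch. It is not the paper's algorithm, only one reading of the abstract under stated assumptions: the rows of the data matrix are nonnegative and nonzero, the self-similarities are kept on the diagonal of the cosine weight matrix W, and an NJW-style row normalization is applied to the embedding. Under these assumptions the normalized weight matrix factors exactly as the product of a rescaled data matrix with its own transpose, so its top eigenvectors are the top left singular vectors of that rescaled matrix. The function name `cosine_spectral_embedding` and the toy data are hypothetical.

```python
import numpy as np

def cosine_spectral_embedding(X, k):
    """NJW-style embedding for W = Xn @ Xn.T without ever forming W.

    Assumptions of this sketch (not taken from the paper): rows of X are
    nonnegative with nonzero norm, and W keeps its diagonal entries.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Unit-length rows, so that Xn @ Xn.T would hold the cosine similarities.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Degrees W @ 1 obtained from two matrix-vector products on the data matrix.
    deg = Xn @ (Xn.T @ np.ones(n))
    # D^{-1/2} W D^{-1/2} equals Xbar @ Xbar.T for this rescaled data matrix.
    Xbar = Xn / np.sqrt(deg)[:, None]
    # Thin SVD; singular values are returned in descending order.
    U, s, _ = np.linalg.svd(Xbar, full_matrices=False)
    U, s = U[:, :k], s[:k]
    # NJW step: rescale each embedded point to unit length.
    Y = U / np.linalg.norm(U, axis=1, keepdims=True)
    return Y, s

# Toy data: two groups whose nonzero features do not overlap.
rng = np.random.default_rng(0)
A = np.hstack([rng.uniform(0.5, 1.0, (3, 2)), np.zeros((3, 2))])
B = np.hstack([np.zeros((4, 2)), rng.uniform(0.5, 1.0, (4, 2))])
X = np.vstack([A, B])
Y, s = cosine_spectral_embedding(X, k=2)
```

In this toy case the rows of `Y` coincide within each group and are orthogonal across groups, so any k-means routine applied to `Y` would recover the two groups. For large sparse input, a truncated solver such as `scipy.sparse.linalg.svds` could replace the thin SVD; that substitution is a suggestion of this sketch rather than something the abstract specifies.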
Pages: 314-319
Number of pages: 6