Scalable Sequential Spectral Clustering

Cited by: 0
Authors
Li, Yeqing [1]
Huang, Junzhou [1]
Liu, Wei [2]
Affiliations
[1] Univ Texas Arlington, Arlington, TX 76019 USA
[2] Didi Res, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the past decades, Spectral Clustering (SC) has become one of the most effective clustering approaches. Although it is widely used, one significant drawback of SC is its high computational cost. Many efforts have been devoted to accelerating SC algorithms, and promising results have been achieved. However, most existing algorithms rely on the assumption that the data can be stored in computer memory. When the data cannot fit in memory, these algorithms suffer severe performance degradation. To overcome this issue, we propose a novel sequential SC algorithm for tackling large-scale clustering with limited computational resources, e.g., memory. We begin by investigating an effective way of approximating the graph affinity matrix by leveraging a bipartite graph. Then we choose a smart graph construction and optimization strategy to avoid random access to data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments carried out on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than state-of-the-art methods.
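The abstract gives no implementation details, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes a common anchor-based construction: each point is linked to a small set of anchor points by row-normalized Gaussian weights (a bipartite graph), and the data is streamed in chunks so that only matrices sized by the number of anchors are held in memory. The function names, the `sigma` parameter, and the choice of anchors are all hypothetical.

```python
import numpy as np

def bipartite_rows(chunk, anchors, sigma):
    """Row-normalized Gaussian weights between a chunk of points and the anchors."""
    d2 = ((chunk[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Subtracting the row minimum leaves the normalized weights unchanged
    # but avoids underflow to an all-zero row.
    z = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    return z / z.sum(axis=1, keepdims=True)

def sequential_spectral_embedding(chunks, anchors, k, sigma=1.0):
    """Yield a k-dimensional spectral embedding, one chunk at a time.

    chunks: zero-argument callable returning a fresh iterator over (m, d) arrays
            (assumed interface; in practice it might read blocks from disk).
    """
    p = anchors.shape[0]
    gram = np.zeros((p, p))   # accumulates Z^T Z, size independent of n
    col_sum = np.zeros(p)     # accumulates anchor degrees

    # Pass 1: stream the data once to build the small p x p problem.
    for chunk in chunks():
        z = bipartite_rows(chunk, anchors, sigma)
        gram += z.T @ z
        col_sum += z.sum(axis=0)

    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(col_sum, 1e-12))
    small = d_inv_sqrt[:, None] * gram * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(small)
    top = np.argsort(evals)[::-1][:k]
    # Maps a row of Z to the corresponding row of the left singular vectors
    # of Z D^{-1/2}, i.e. eigenvectors of the approximate affinity matrix.
    proj = d_inv_sqrt[:, None] * evecs[:, top] / np.sqrt(np.maximum(evals[top], 1e-12))

    # Pass 2: stream the data again and emit embeddings chunk by chunk.
    for chunk in chunks():
        yield bipartite_rows(chunk, anchors, sigma) @ proj

# Toy usage: two well-separated blobs, three anchors taken from each.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 0.5, size=(100, 2))
b = rng.normal(10.0, 0.5, size=(100, 2))
data = np.vstack([a, b])
anchors = np.vstack([a[:3], b[:3]])

def make_chunks():
    return iter(np.array_split(data, 10))

emb = np.vstack(list(sequential_spectral_embedding(make_chunks, anchors, k=2)))
```

In this toy setting the rows of `emb` collapse to one location per blob, so a final k-means step on the embedding (also runnable chunk by chunk) would recover the two clusters. Only sequential reads of the data are needed, and the working memory is governed by the chunk size and the number of anchors rather than by the total number of points.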
Pages: 1809-1815
Page count: 7
Related papers
50 in total
  • [41] Integrating clustering and sequential analysis for improving the spectral density estimation and dependency structure of time series
    Laala, Barkahoum
    Elsawah, A. M.
    Vishwakarma, Gajendra K.
    Fang, Kai-Tai
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2025, 54 (03) : 889 - 924
  • [42] Scalable feature mining for sequential data
    Lesh, Neal
    Zaki, Mohammed J.
    Ogihara, Mitsunori
    IEEE Intelligent Systems and Their Applications, 2000, 15 (02): : 48 - 56
  • [43] Scalable feature mining for sequential data
    Lesh, N
    Zaki, MJ
    Ogihara, M
    IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS, 2000, 15 (02): : 48 - 56
  • [44] A scalable sequential pattern mining algorithm
    Wang, Jiahong
    Asanuma, Yoshiaki
    Kodama, Eiichiro
    Takata, Toyoo
    2006 IEEE INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, VOLS 1-3, 2006, : 437 - +
  • [45] GoSCAN: Decentralized scalable data clustering
    Mashayekhi, Hoda
    Habibi, Jafar
    Voulgaris, Spyros
    van Steen, Maarten
    COMPUTING, 2013, 95 (09) : 759 - 784
  • [46] LIMBO: Scalable clustering of categorical data
    Andritsos, P
    Tsaparas, P
    Miller, RJ
    Sevcik, KC
    ADVANCES IN DATABASE TECHNOLOGY - EDBT 2004, PROCEEDINGS, 2004, 2992 : 123 - 146
  • [47] Scalable clustering using graphics processors
    Cao, Feng
    Tung, Anthony K. H.
    Zhou, Aoying
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2006, 4016 : 372 - 384
  • [48] Scalable Bootstrap Clustering for Massive Data
    Wang, Haocheng
    Zhuang, Fuzhen
    Ao, Xiang
    He, Qing
    Shi, Zhongzhi
    2014 15TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2014, : 123 - 128
  • [49] Scalable Clustering with Adaptive Instance Sampling
    Yang, JaeKyung
    Yu, ByoungJin
    Choi, MyoungJin
    2013 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEM 2013), 2013, : 1309 - 1313
  • [50] Structured Graph Reconstruction for Scalable Clustering
    Han, Junwei
    Xiong, Kai
    Nie, Feiping
    Li, Xuelong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (05) : 2252 - 2265