Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data

Cited by: 0
Authors
Dinari, Or [1]
Freifeld, Oren [1]
Affiliation
[1] Ben Gurion Univ Negev, Beer Sheva, Israel
Funding
Israel Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Practical tools for clustering streaming data must be fast enough to handle the arrival rate of the observations. Typically, they also must adapt on the fly to possible lack of stationarity; i.e., the data statistics may be time-dependent due to various forms of drifts, changes in the number of clusters, etc. The Dirichlet Process Mixture Model (DPMM), whose Bayesian nonparametric nature allows it to adapt its complexity to the data, seems a natural choice for the streaming-data case. In its classical formulation, however, the DPMM cannot capture common types of drifts in the data statistics. Moreover, and regardless of that limitation, existing methods for online DPMM inference are too slow to handle rapid data streams. In this work we propose adapting both the DPMM and a known DPMM sampling-based non-streaming inference method for streaming-data clustering. We demonstrate the utility of the proposed method in several challenging settings, where it obtains state-of-the-art results while being on par with other methods in terms of speed.
Pages: 818-835
Number of pages: 18