An NMF-framework for Unifying Posterior Probabilistic Clustering and Probabilistic Latent Semantic Indexing

Cited by: 1
Authors
Zhang, Zhong-Yuan [1]
Li, Tao [2]
Ding, Chris [3]
Tang, Jie [4]
Affiliations
[1] Cent Univ Finance & Econ, Sch Math & Stat, Beijing, Peoples R China
[2] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[3] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[4] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; National Science Foundation (USA)
Keywords
Posterior probabilistic clustering; Probabilistic latent semantic indexing; NMF-framework; Matrix factorization
DOI
10.1080/03610926.2012.714034
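Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract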
In document clustering, a document may be assigned to multiple clusters, and the probabilities of a document belonging to different clusters are directly normalized. We propose a new Posterior Probabilistic Clustering (PPC) model that has this normalization property. The clustering model is based on Nonnegative Matrix Factorization (NMF) and is flexible: if class-conditional probability normalization is used instead, the model reduces to Probabilistic Latent Semantic Indexing (PLSI). Systematic comparison and evaluation indicate that PPC is competitive with other state-of-the-art clustering methods. Furthermore, the results of PPC are sparser and closer to orthogonal, both of which are highly desirable properties.
Pages: 4011-4024
Page count: 14