CFOND: Consensus Factorization for Co-Clustering Networked Data

Cited by: 32
Authors
Guo, Ting [1 ]
Pan, Shirui [2 ]
Zhu, Xingquan [3 ]
Zhang, Chengqi [2 ]
Affiliations
[1] CSIRO, Data61, Sydney, NSW 2015, Australia
[2] Univ Technol Sydney, Fac Engn & Informat Technol, Ctr Artificial Intelligence, Sydney, NSW 2007, Australia
[3] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Keywords
Networked data; networks; co-clustering; topology; nonnegative matrix factorization; algorithms
DOI
10.1109/TKDE.2018.2846555
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Networked data are common in domains where instances are characterized by both feature values and inter-dependency relationships. Finding cluster structures for networked instances and discovering representative features for each cluster is a special co-clustering task that is useful for many real-world applications, such as automatically categorizing scientific publications and finding representative keywords for each cluster. Although co-clustering has been commonly used to find clusters for both instances and features, all existing methods focus on instance-feature values and do not leverage the topology relationships between instances to improve co-clustering performance. In this paper, we propose CFOND, a consensus-factorization-based framework for co-clustering networked data. We argue that feature values and linkages provide useful information from different perspectives, but they are not always consistent and therefore need to be carefully aligned for the best clustering results. We advocate a consensus factorization principle, which simultaneously factorizes information from three aspects: network topology structures, instance-feature content relationships, and feature-feature correlations. The consensus factorization ensures that the final cluster structures are consistent across the three aspects with minimum error. Experiments on real-life networks validate the performance of our algorithm.
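To make the idea of factorizing three information sources with shared factors more concrete, here is a minimal sketch in Python with NumPy. It is not the CFOND objective or update rules, which this record does not state. It assumes a simplified joint nonnegative matrix factorization in which an instance factor G is shared between the content matrix X and the topology matrix A, and a feature factor F is shared between X and a feature-feature matrix C. The function name `joint_nmf`, the auxiliary factors H and Q, the weights `alpha` and `beta`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def joint_nmf(X, A, C, k, alpha=1.0, beta=1.0, n_iter=200, seed=0, eps=1e-9):
    """Simplified shared-factor NMF (an assumption, not the CFOND method).

    Minimizes ||X - G F^T||^2 + alpha ||A - G H^T||^2 + beta ||C - F Q^T||^2
    with all factors nonnegative, using Lee-Seung style multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G = rng.random((n, k))  # instance factor, shared by X and A
    F = rng.random((m, k))  # feature factor, shared by X and C
    H = rng.random((n, k))  # auxiliary factor for the topology term
    Q = rng.random((m, k))  # auxiliary factor for the feature-feature term

    def loss():
        return (np.linalg.norm(X - G @ F.T) ** 2
                + alpha * np.linalg.norm(A - G @ H.T) ** 2
                + beta * np.linalg.norm(C - F @ Q.T) ** 2)

    history = [loss()]
    for _ in range(n_iter):
        # Each update is a standard NMF step with the other factors held fixed.
        G *= (X @ F + alpha * A @ H) / (G @ (F.T @ F + alpha * H.T @ H) + eps)
        F *= (X.T @ G + beta * C @ Q) / (F @ (G.T @ G + beta * Q.T @ Q) + eps)
        H *= (A.T @ G) / (H @ (G.T @ G) + eps)
        Q *= (C.T @ F) / (Q @ (F.T @ F) + eps)
        history.append(loss())
    return G, F, history


# Toy data with two planted instance clusters and two feature clusters.
rng = np.random.default_rng(1)
n, m, k = 20, 12, 2
row_lab = np.repeat([0, 1], n // 2)
col_lab = np.repeat([0, 1], m // 2)
X = (row_lab[:, None] == col_lab[None, :]).astype(float) + 0.1 * rng.random((n, m))
A = (row_lab[:, None] == row_lab[None, :]).astype(float)  # links within clusters
C = X.T @ X / n                                           # feature-feature similarity

G, F, history = joint_nmf(X, A, C, k)
instance_clusters = G.argmax(axis=1)  # cluster index per instance
feature_clusters = F.argmax(axis=1)   # cluster index per feature
```

Because G must explain both X and A, and F must explain both X and C, the cluster assignments read off from G and F are pulled toward agreement across the three sources. That is the intuition behind a consensus across aspects, under the simplifying assumptions stated above.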
Pages: 706-719
Number of pages: 14
Related papers
50 in total
  • [31] Co-Clustering on Manifolds
    Gu, Quanquan
    Zhou, Jie
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009, : 359 - 367
  • [32] Directional co-clustering
    Salah, Aghiles
    Nadif, Mohamed
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (03) : 591 - 620
  • [33] Graph dual regularization non-negative matrix factorization for co-clustering
    Shang, Fanhua
    Jiao, L. C.
    Wang, Fei
    PATTERN RECOGNITION, 2012, 45 (06) : 2237 - 2250
  • [34] Word Co-Occurrence Regularized Non-Negative Matrix Tri-Factorization for Text Data Co-Clustering
    Salah, Aghiles
    Ailem, Melissa
    Nadif, Mohamed
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3992 - 3999
  • [35] Bayesian co-clustering
    Domeniconi, Carlotta
    Laskey, Kathryn
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2015, 7 (05) : 347 - 356
  • [36] Model-based co-clustering for ordinal data
    Jacques, Julien
    Biernacki, Christophe
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 123 : 101 - 115
  • [37] Model-based co-clustering for functional data
    Ben Slimen, Yosra
    Allio, Sylvain
    Jacques, Julien
    NEUROCOMPUTING, 2018, 291 : 97 - 108
  • [38] Bipartite isoperimetric graph partitioning for data co-clustering
    Rege, Manjeet
    Dong, Ming
    Fotouhi, Farshad
    DATA MINING AND KNOWLEDGE DISCOVERY, 2008, 16 (03) : 276 - 312
  • [39] A New Framework for Co-clustering of Gene Expression Data
    Zhang, Shuzhong
    Wang, Kun
    Chen, Bilian
    Huang, Xiuzhen
    PATTERN RECOGNITION IN BIOINFORMATICS, 2011, 7036 : 1 - +