CFOND: Consensus Factorization for Co-Clustering Networked Data

Cited by: 32
Authors
Guo, Ting [1]
Pan, Shirui [2]
Zhu, Xingquan [3]
Zhang, Chengqi [2]
Affiliations
[1] CSIRO, Data61, Sydney, NSW 2015, Australia
[2] Univ Technol Sydney, Fac Engn & Informat Technol, Ctr Artificial Intelligence, Sydney, NSW 2007, Australia
[3] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA
Keywords
Networked data; networks; co-clustering; topology; nonnegative matrix factorization; ALGORITHMS;
DOI
10.1109/TKDE.2018.2846555
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Networked data are common in domains where instances are characterized by both feature values and inter-dependency relationships. Finding cluster structures for networked instances and discovering representative features for each cluster represent a special co-clustering task useful for many real-world applications, such as automatic categorization of scientific publications and finding representative keywords for each cluster. To date, although co-clustering has been commonly used for finding clusters for both instances and features, all existing methods are focused on instance-feature values, without leveraging valuable topology relationships between instances to help boost co-clustering performance. In this paper, we propose CFOND, a consensus factorization based framework for co-clustering networked data. We argue that feature values and linkages provide useful information from different perspectives, but they are not always consistent and therefore need to be carefully aligned for best clustering results. In the paper, we advocate a consensus factorization principle, which simultaneously factorizes information from three aspects: network topology structures, instance-feature content relationships, and feature-feature correlations. The consensus factorization ensures that the final cluster structures are consistent across information from the three aspects with minimum errors. Experiments on real-life networks validate the performance of our algorithm.
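The abstract does not state the objective function, so the following is only a rough illustration of what "simultaneously factorizing three aspects" could look like, not the CFOND algorithm itself. It assumes a generic joint nonnegative factorization in which an instance factor `G` is shared between the content matrix `X` and the topology matrix `A`, and a feature factor `F` is shared between `X` and a feature-feature matrix `C`. The function names, the weights `alpha` and `beta`, and the projected-gradient solver are all hypothetical choices made for this sketch.

```python
import numpy as np

def consensus_loss(X, A, C, G, S, F, alpha, beta):
    """Weighted sum of squared reconstruction errors over the three views."""
    content = np.linalg.norm(X - G @ S @ F.T) ** 2   # instance-feature content
    topology = np.linalg.norm(A - G @ G.T) ** 2      # instance-instance links
    feature = np.linalg.norm(C - F @ F.T) ** 2       # feature-feature correlations
    return content + alpha * topology + beta * feature

def consensus_gradients(X, A, C, G, S, F, alpha, beta):
    """Gradients of consensus_loss with respect to G, S and F."""
    R = G @ S @ F.T - X    # content residual (n x m)
    RA = G @ G.T - A       # topology residual (n x n)
    RC = F @ F.T - C       # feature residual (m x m)
    grad_G = 2 * R @ F @ S.T + 2 * alpha * (RA + RA.T) @ G
    grad_S = 2 * G.T @ R @ F
    grad_F = 2 * R.T @ G @ S + 2 * beta * (RC + RC.T) @ F
    return grad_G, grad_S, grad_F

def fit_consensus(X, A, C, k, l, alpha=1.0, beta=1.0, iters=200, seed=0):
    """Projected gradient descent with backtracking (an assumed solver)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G, S, F = rng.random((n, k)), rng.random((k, l)), rng.random((m, l))
    step = 1e-2
    history = [consensus_loss(X, A, C, G, S, F, alpha, beta)]
    for _ in range(iters):
        grads = consensus_gradients(X, A, C, G, S, F, alpha, beta)
        while step > 1e-12:
            # Take a gradient step, then clip to keep all factors nonnegative.
            trial = [np.maximum(P - step * dP, 0.0)
                     for P, dP in zip((G, S, F), grads)]
            loss = consensus_loss(X, A, C, *trial, alpha, beta)
            if loss < history[-1]:
                break          # accept only steps that lower the loss
            step *= 0.5        # otherwise shrink the step and retry
        else:
            break              # no improving step found; stop early
        G, S, F = trial
        history.append(loss)
        step = min(step * 2.0, 1.0)  # let the step size recover
    return G, S, F, history

# Toy data (synthetic, for illustration only): two groups of instances,
# each concentrated on its own block of features and linked within the group.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
X = rng.random((20, 12)) * 0.1
X[:10, :6] += 1.0
X[10:, 6:] += 1.0
A = (labels[:, None] == labels[None, :]).astype(float)
C = np.corrcoef(X.T).clip(min=0)   # nonnegative feature correlations

G, S, F, history = fit_consensus(X, A, C, k=2, l=2)
instance_clusters = G.argmax(axis=1)   # cluster index per instance
feature_clusters = F.argmax(axis=1)    # cluster index per feature
```

Because `G` and `F` each appear in two of the three terms, lowering the combined loss pushes the instance clusters to agree with both content and links, and the feature clusters to agree with both content and feature correlations. That shared-factor coupling is the only part of this sketch that is meant to mirror the consensus idea described in the abstract.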
Pages: 706-719
Page count: 14
Related papers
50 in total
  • [1] Collective Matrix Factorization for Co-clustering
    Sachan, Mrinmaya
    Srivastava, Shashank
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION), 2013, : 93 - 94
  • [2] An Overview of Co-Clustering via Matrix Factorization
    Lin, Renjie
    Wang, Shiping
    Guo, Wenzhong
    IEEE ACCESS, 2019, 7 : 33481 - 33493
  • [3] Joint co-clustering: Co-clustering of genomic and clinical bioimaging data
    Ficarra, Elisa
    De Micheli, Giovanni
    Yoon, Sungroh
    Benini, Luca
    Macii, Enrico
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2008, 55 (05) : 938 - 949
  • [4] Co-clustering under Nonnegative Matrix Tri-Factorization
    Labiod, Lazhar
    Nadif, Mohamed
    NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 709 - 717
  • [5] Weighted Nonnegative Matrix Tri-Factorization for Co-Clustering
    Li, Zhao
    Wu, Xindong
    2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 811 - 816
  • [6] Sleeved co-clustering of lagged data
    Shaham, Eran
    Sarne, David
    Ben-Moshe, Boaz
    KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 31 (02) : 251 - 279
  • [7] Co-clustering from Tensor Data
    Boutalbi, Rafika
    Labiod, Lazhar
    Nadif, Mohamed
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT I, 2019, 11439 : 370 - 383
  • [8] Co-clustering for binary and functional data
    Ben Slimen, Yosra
    Jacques, Julien
    Allio, Sylvain
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2022, 51 (09) : 4845 - 4866
  • [9] Sleeved co-clustering of lagged data
    Eran Shaham
    David Sarne
    Boaz Ben-Moshe
    Knowledge and Information Systems, 2012, 31 : 251 - 279
  • [10] Co-clustering of fuzzy lagged data
    Eran Shaham
    David Sarne
    Boaz Ben-Moshe
    Knowledge and Information Systems, 2015, 44 : 217 - 252