Semi-supervised Co-Clustering on Attributed Heterogeneous Information Networks

Cited by: 9
Authors
Ji, Yugang [1]
Shi, Chuan [1]
Fang, Yuan [2]
Kong, Xiangnan [3]
Yin, Mingyang [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Singapore Management Univ, Singapore, Singapore
[3] Worcester Polytech Inst, Worcester, MA 01609 USA
[4] Alibaba Grp, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore
Keywords
Co-clustering; Heterogeneous information network; Meta-paths; Matrix tri-factorization; Semi-supervised learning; Matrix factorization
DOI
10.1016/j.ipm.2020.102338
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Node clustering on heterogeneous information networks (HINs) plays an important role in many real-world applications. Previous research mainly clusters same-type nodes independently by exploiting structural similarity search, and thus ignores the correlations among nodes of different types. In this paper, we focus on the problem of co-clustering heterogeneous nodes, where the goal is to mine the latent relevance of heterogeneous nodes and simultaneously partition them into the corresponding type-aware clusters. This problem is challenging in two respects. First, the similarity or relevance of nodes is associated not only with multiple meta-path-based structures but also with numerical and categorical attributes. Second, clustering and similarity/relevance search usually reinforce each other. To address this problem, we first design a learnable overall relevance measure that integrates structural and attributed relevance by employing meta-paths and attribute projection. We then propose a novel approach, called SCCAIN, to co-cluster heterogeneous nodes based on constrained orthogonal non-negative matrix tri-factorization. Furthermore, an end-to-end framework is developed to jointly optimize the relevance measure and the co-clustering. Extensive experiments on real-world datasets not only demonstrate that SCCAIN consistently outperforms state-of-the-art methods but also validate the effectiveness of integrating attributed and structural information for co-clustering.
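To make the idea of co-clustering by orthogonal non-negative matrix tri-factorization more concrete, the following is a minimal sketch and not the SCCAIN method itself. It assumes a fixed, non-negative relevance matrix R between two node types and factorizes it as R ≈ F S Gᵀ using generic multiplicative updates in the style of orthogonal NMTF. The semi-supervised constraints, the meta-path and attribute-based relevance measure, and the joint end-to-end optimization described in the abstract are all omitted. The function name `onmtf` and the toy matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def onmtf(R, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Sketch of orthogonal non-negative matrix tri-factorization.

    Approximates a non-negative matrix R (n x m) as F @ S @ G.T, where
    F (n x k_rows) and G (m x k_cols) act as soft cluster indicators for
    the row nodes and column nodes. This is an illustrative assumption,
    not the update scheme published for SCCAIN.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps
    for _ in range(n_iter):
        # Update column-cluster indicators G.
        RtFS = R.T @ F @ S
        G *= np.sqrt(RtFS / (G @ (G.T @ RtFS) + eps))
        # Update row-cluster indicators F.
        RGSt = R @ G @ S.T
        F *= np.sqrt(RGSt / (F @ (F.T @ RGSt) + eps))
        # Update the cluster-to-cluster association matrix S.
        S *= np.sqrt((F.T @ R @ G) / (F.T @ F @ S @ G.T @ G + eps))
    return F, S, G


# Toy relevance matrix (hypothetical data): two row groups, two column groups.
R = np.full((6, 8), 0.01)
R[:3, :4] = 1.0
R[3:, 4:] = 1.0

F, S, G = onmtf(R, k_rows=2, k_cols=2)

# Hard cluster assignments: the largest indicator entry per node.
row_labels = F.argmax(axis=1)
col_labels = G.argmax(axis=1)

# Relative reconstruction error of the factorization.
rel_err = np.linalg.norm(R - F @ S @ G.T) / np.linalg.norm(R)
```

Here `row_labels` and `col_labels` give one cluster index per node of each type, and `S` summarizes how strongly each row cluster is associated with each column cluster. In a semi-supervised setting, additional penalty terms for must-link and cannot-link pairs would typically be added to the objective; that part is left out of this sketch.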
Pages: 12